Testing is an essential part of CI (Continuous Integration) for making software reliable and maintainable. The number of tests tends to increase monotonically and, for complicated software, a full test run can take several hours. Such long-running tests can become a bottleneck in the development process. For instance, the author of a GitHub Pull Request might need to pass all tests to make sure that the code changes contain no faults before they are reviewed. Therefore, fast/smart testing is one of the major interests in modern software engineering.
An earlier study [1] has reported that a small subset of tests is very likely to fail, and those tests are generally “closer” to the changed code. On the other hand, “always passing” tests also exist. If we identify these tests ahead of time, we can make testing smarter, e.g., by running the likely-to-fail tests first and deferring or skipping the tests that almost always pass.
Therefore, the purpose of this repository is to build logic that selects the tests likely to fail and to investigate its feasibility/efficiency. Since a GitHub repository has a history of commits and GitHub Actions holds build/test logs, such logic can be built on features extracted from those logs.
Apache Spark is a parallel and distributed analytics framework for large-scale data processing. More than 10 years have passed since the OSS community started developing the Spark codebase on GitHub, so a great many tests exist there (the command results below show that it has around 1800 Scala test suites). A GitHub Actions script runs those tests in parallel, but it still takes 2-3 hours to finish.
$ pwd
/tmp/spark-master
$ git rev-parse --short HEAD
38d39812c1
$ find . -type f | grep -e "\/.*Suite\.class$" | wc -l
1758
The number of test result logs collected via the GitHub APIs is 553 for the past 3-4 months of GitHub Actions workflow runs, and they include 1046 valid test failure cases. With reference to a previous report [2], our current model uses the following information:
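For illustration, once workflow-run metadata has been fetched from the GitHub Actions REST API (`GET /repos/{owner}/{repo}/actions/runs`), the runs that produced test failures can be filtered out of the returned JSON. The function below is a sketch and not the repository's actual collection code; the HTTP fetch itself (e.g., with `requests`) is omitted:

```python
# Sketch: selecting failed workflow runs from the JSON payload returned by
# the GitHub Actions API ('GET /repos/{owner}/{repo}/actions/runs').
# Fetching the payload over HTTP is omitted; the 'status' and 'conclusion'
# fields below are the standard ones in that API response.

def failed_run_ids(workflow_runs):
    """Return the IDs of completed runs whose conclusion is 'failure'."""
    return [run["id"] for run in workflow_runs
            if run.get("status") == "completed"
            and run.get("conclusion") == "failure"]
```

Each selected run ID can then be used to download the corresponding logs, from which the failed test names are parsed (the parsing step depends on the workflow's log format).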
- The number of updates on each file in the last 3, 14, and 15 days (`updated_num_3d`, `updated_num_14d`, and `updated_num_15d`, respectively).
- The number of added, deleted, and changed lines in each file (`num_adds`, `num_dels`, and `num_chgs`, respectively).
- The number of files changed together in a commit (`file_card`).
- The number of failures of each test in the last 7, 14, and 28 days (`failed_num_7d`, `failed_num_14d`, and `failed_num_28d`, respectively) and the total count (`total_failed_num`).
- The path difference between a changed file and a test, computed from their file paths (`path_difference`).
- The distance between a changed file and a test in a graph (`distance`); the graph will be described soon after.
- Interaction features: `total_failed_num * num_commits` and `failed_num_7d * num_commits`.

NOTE: A data lineage for extracting the information from the test result logs is as follows (the diagram below is generated with spark-sql-flow-plugin; the bottom flow in the lineage represents how to transform data for model training and the top flow represents it for model validation):
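As an illustration of the path-based feature, one plausible way to compute a `path_difference`-style value is to count the path components that two files do not share; the exact definition used by the model may differ, so treat this as a sketch:

```python
def path_difference(path_a, path_b):
    """Number of path components outside the longest shared prefix.

    This is one plausible definition of a path-distance feature; the
    actual computation in this repository may differ.
    """
    a, b = path_a.split("/"), path_b.split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    # Count every component (including file names) beyond the shared prefix.
    return (len(a) - common) + (len(b) - common)
```

Under this definition, a test suite placed right next to a changed file gets a small value, while a test in a distant module gets a large one, which matches the intuition that failures cluster near the changed code.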
Moreover, to prune tests unrelated to code changes, we use two relations as follows:
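The relations themselves are repository-specific, but the pruning step can be sketched as follows, assuming each relation is represented as a mapping from a changed file to the set of tests related to it (this representation and all names below are illustrative, not the repository's actual code):

```python
def prune_unrelated_tests(changed_files, all_tests, relations):
    """Keep only the tests related to some changed file via any relation.

    'relations' is a list of dicts mapping a file path to a set of related
    test names; this representation is an assumption for illustration.
    """
    related = set()
    for rel in relations:
        for f in changed_files:
            related |= rel.get(f, set())
    # Preserve the original test ordering while dropping unrelated tests.
    return [t for t in all_tests if t in related]
```

Tests that survive this pruning are then ranked by the model's predicted failure probabilities.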
The figure below shows recall values for our current model. Each value is the empirical probability of catching an individual failure; for instance, there is a more than 90% chance that our model catches a failed test when running 600 tests (around 35% of all the tests). Note that the recall does not reach 1.0 because the pruning strategy incorrectly removes some failed tests before the failure probabilities are computed.
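The recall values plotted in the figure can be read as "recall at k": the fraction of actually-failed tests that appear among the top-k tests when they are ranked by predicted failure probability. A minimal sketch (the function and variable names are illustrative):

```python
def recall_at_k(ranked_tests, failed_tests, k):
    """Fraction of failed tests caught when running the top-k ranked tests."""
    if not failed_tests:
        return 1.0  # Nothing to catch, so trivially perfect recall.
    selected = set(ranked_tests[:k])
    caught = sum(1 for t in failed_tests if t in selected)
    return caught / len(failed_tests)
```

Averaging this value over many historical workflow runs, for each budget k, yields the kind of curve shown in the figure.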
You can get the tests that are likely to fail based on your code changes as follows:
$ echo $SPARK_REPO
/tmp/spark-master
$ git -C $SPARK_REPO diff HEAD~3 --stat
core/src/main/java/org/apache/spark/SparkThrowable.java | 9 ++++++++-
core/src/main/java/org/apache/spark/memory/SparkOutOfMemoryError.java | 4 ----
core/src/main/resources/error/README.md | 19 ++++++++++---------
core/src/main/resources/error/error-classes.json | 19 +++----------------
core/src/main/scala/org/apache/spark/ErrorInfo.scala | 4 ++++
core/src/main/scala/org/apache/spark/SparkException.scala | 20 --------------------
core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala | 17 +++++++++++++++++
python/docs/source/reference/pyspark.sql.rst | 1 +
python/pyspark/sql/functions.py | 35 +++++++++++++++++++++++++++++++++++
sql/catalyst/src/main/scala/org/apache/spark/sql/AnalysisException.scala | 1 -
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala | 25 ++++++++++++++-----------
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala | 2 +-
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonFileFormat.scala | 2 --
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/json/JsonTable.scala | 2 --
sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala | 19 -------------------
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/CommonFileDataSourceSuite.scala | 2 +-
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala | 21 ++++++++++++++++++++-
17 files changed, 114 insertions(+), 88 deletions(-)
# Print the top 12 tests having the highest failure probabilities for the code changes above
# (3 commits originating from 'HEAD').
#
# NOTE: the script 'predict-spark-tests.sh' creates a 'conda' virtual env to install required modules.
# If you install them by yourself (e.g., pip install -r bin/requirements.txt),
# you need to define an env var 'CONDA_DISABLED', like 'CONDA_DISABLED=1 ./bin/predict-spark-tests.sh ...'
$ ./bin/predict-spark-tests.sh --num-commits 3 --num-selected-tests 12
[
"org.apache.spark.SparkThrowableSuite",
"org.apache.spark.sql.SQLQuerySuite",
"org.apache.spark.sql.execution.datasources.json.JsonSuite",
"org.apache.spark.sql.DataFrameSuite",
"org.apache.spark.sql.hive.orc.HiveOrcSourceSuite",
"org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite",
"org.apache.spark.sql.SQLQueryTestSuite",
"org.apache.spark.sql.TPCDSQueryTestSuite",
"org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite",
"org.apache.spark.sql.execution.CoalesceShufflePartitionsSuite",
"org.apache.spark.sql.hive.MultiDatabaseSuite",
"org.apache.spark.sql.execution.adaptive.AdaptiveQueryExecSuite"
]
# The '--format' option makes the output follow the ScalaTest format for running specific tests
$ ./bin/predict-spark-tests.sh --num-commits 3 --num-selected-tests 12 --format > $SPARK_REPO/selected_tests.txt
$ cd $SPARK_REPO && ./build/mvn clean test -DtestsFiles=selected_tests.txt
...
If you hit any bugs or have requests, please leave comments on Issues or Twitter (@maropu).