Flaky, or non-deterministic, tests are a serious and prevalent problem in modern software development. An unreliable test suite with flaky tests wastes developers' time by triggering unnecessary test failure investigations that are not the result of their code changes, and delaying the integration of their code.

Often, tests that report flaky results are not themselves unreliable; the flakiness is caused by flawed production code or test infrastructure. Thus, it is important to periodically identify and fix the most severe flaky tests.

Gradle Enterprise provides Test Failure Analytics, which gives you tools for quicker root cause analysis.

How flaky test detection works

Gradle Enterprise marks a test outcome as FLAKY if the test both fails and succeeds within the execution of a single Gradle task, Maven goal, or Bazel target. When this occurs, flaky test analysis becomes available in Build Scans and in the Gradle Enterprise Tests Dashboard.

Figure: flaky test trend

Observing both a failure and a success within a single execution typically requires retrying failed tests, which is an industry-standard way to identify flaky tests.
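
The detection rule described above can be sketched in a few lines of Java. This is only an illustration of the rule as stated here, not Gradle Enterprise's implementation: the `FlakyClassifier` class, the `Outcome` enum, and the `classify` method are hypothetical names, and each boolean stands for one attempt of the same test within a single task, goal, or target execution (`true` means the attempt passed).

```java
import java.util.List;

public class FlakyClassifier {

    /** Hypothetical outcome labels; only FLAKY is a term taken from the text. */
    public enum Outcome { PASSED, FAILED, FLAKY }

    /**
     * Classifies one test from all of its attempts within a single
     * task, goal, or target execution (true = the attempt passed).
     */
    public static Outcome classify(List<Boolean> attempts) {
        if (attempts.isEmpty()) {
            throw new IllegalArgumentException("a test needs at least one attempt");
        }
        boolean anyPassed = attempts.contains(true);
        boolean anyFailed = attempts.contains(false);
        if (anyPassed && anyFailed) {
            return Outcome.FLAKY; // failed and succeeded in the same execution
        }
        return anyPassed ? Outcome.PASSED : Outcome.FAILED;
    }

    public static void main(String[] args) {
        System.out.println(classify(List.of(true)));                // PASSED
        System.out.println(classify(List.of(false, true)));         // FLAKY
        System.out.println(classify(List.of(false, false, false))); // FAILED
    }
}
```

Without retries, every test has exactly one attempt, so this rule can never produce FLAKY, which is why retry has to be enabled first.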

Flaky test detection setup

Common test execution frameworks such as JUnit provide mechanisms for retrying tests, typically requiring extra code to annotate tests that are known to be flaky.

However, enabling test retry via your build does not require source code changes and applies to your entire test suite. Importantly, this allows you to analyze newly-introduced flaky tests in the Gradle Enterprise Tests Dashboard.

One other important aspect to consider is whether to fail the build when flaky tests are encountered. Historically, retry mechanisms have allowed builds to succeed. When enabling test retry through Gradle, it is possible to enable flaky test detection without silencing flaky failures. However, this comes at the cost of continued developer disruption and should be considered carefully.

Gradle

Using the Test Distribution Gradle Plugin

The Gradle Enterprise Test Distribution Gradle Plugin integrates test retry for most test engines that are compatible with JUnit Platform, such as JUnit Jupiter, JUnit Vintage, Spock, TestNG, Cucumber, jqwik, Spek, and others.

Tests do not need to be distributed in order to enable retry functionality.

See the Test Distribution Gradle Plugin User Manual for more information.

Using other Gradle test tasks

For JUnit 4, JUnit 5, Spock, and TestNG, Gradle recommends using the official Test Retry Gradle plugin to retry tests across your entire test suite.

build.gradle.kts
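plugins {
    // Official Test Retry Gradle plugin
    id("org.gradle.test-retry") version "1.3.2"
}

tasks.withType<Test>().configureEach {
    retry {
        // Only retry on CI, detected here via the CI environment variable
        if (System.getenv().containsKey("CI")) {
            // Retry a failed test at most twice
            maxRetries.set(2)
            // Still fail the task when a test passes only after a retry
            failOnPassedAfterRetry.set(true)
        }
    }
}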

See the Test Retry Gradle plugin documentation and introductory blog post to learn about all of the useful features and configuration options.
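
As a rough mental model of what the two settings in the snippet above do, the following Java sketch simulates a single test under retry. It is not the plugin's code: `RetrySimulation`, `Result`, and `runWithRetry` are hypothetical names, and other plugin options are ignored. The sketch assumes that `maxRetries` bounds the number of additional attempts after the first failure, and that `failOnPassedAfterRetry` decides whether a test that passes on retry still fails the task.

```java
import java.util.function.BooleanSupplier;

public class RetrySimulation {

    /** Hypothetical result holder for one simulated test. */
    public static final class Result {
        public final int attempts;      // how many times the test ran
        public final boolean flaky;     // failed at least once, then passed
        public final boolean taskFails; // whether the test task would fail

        Result(int attempts, boolean flaky, boolean taskFails) {
            this.attempts = attempts;
            this.flaky = flaky;
            this.taskFails = taskFails;
        }
    }

    /** Runs the test once, then retries it up to maxRetries times while it keeps failing. */
    public static Result runWithRetry(BooleanSupplier test, int maxRetries,
                                      boolean failOnPassedAfterRetry) {
        int attempts = 0;
        boolean sawFailure = false;
        while (attempts <= maxRetries) {
            attempts++;
            if (test.getAsBoolean()) {
                // A pass after an earlier failure is the flaky case.
                return new Result(attempts, sawFailure, sawFailure && failOnPassedAfterRetry);
            }
            sawFailure = true;
        }
        // Never passed: a plain failure, not a flaky outcome.
        return new Result(attempts, false, true);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        BooleanSupplier failsOnce = () -> ++calls[0] > 1; // fails once, then passes
        Result r = runWithRetry(failsOnce, 2, true);
        System.out.println(r.attempts + " attempts, flaky=" + r.flaky
                + ", taskFails=" + r.taskFails);
    }
}
```

With `failOnPassedAfterRetry` set to `false`, the same flaky test would still be detected, but the task would succeed; that is the trade-off between detection and developer disruption discussed earlier.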

Maven

The Maven Surefire and Failsafe plugins provide configuration properties which cause the test runner to retry each failing test a configured number of times.

Configuring these properties in your project causes failing tests to be rerun immediately after they fail. If a test fails and then passes on a rerun, Gradle Enterprise will record a FLAKY outcome for the test.

pom.xml
<properties>
    <failsafe.rerunFailingTestsCount>2</failsafe.rerunFailingTestsCount>
    <surefire.rerunFailingTestsCount>2</surefire.rerunFailingTestsCount>
</properties>

Test retry works the same way when using test distribution with the Gradle Enterprise Maven Extension.

See the maven-surefire-plugin documentation and maven-failsafe-plugin documentation for compatibility and configuration details.

Bazel

Bazel provides a common flaky attribute for test rules, which causes Bazel to execute a failing test up to three times in total, equivalent to specifying --flaky_test_attempts=3 for test runs.

If any subsequent test execution passes after a failure, the test is marked as FLAKY, and the test target may succeed.

BUILD
java_test(
    name = "foo",
    flaky = True
)

See the Bazel user manual for more information.

Resources for flaky test analysis

With flaky test detection enabled, you will be able to identify the most severe flaky tests and their trends using Gradle Enterprise Test Failure Analytics.

Here are some resources which show you how to best leverage these tools: