The Gradle Enterprise build cache follows a simple principle: the best way to do work faster is to avoid doing it at all. While Maven does not provide support for incremental builds, the Gradle Enterprise build cache allows you to reuse outputs of goal executions from any previous build. Thus, it avoids executing costly goals and accelerates your Maven builds significantly.
The remote build cache takes this one step further: it allows you to share cached outputs across your whole team, including local and CI builds. In the diagram below, you can see the flow of CI agents pushing to the remote cache, and developers pulling from the remote cache.
A goal is executed on a CI server. The build is configured to push to a configured remote build cache, so that outputs can be reused by other CI pipeline builds and developer builds.
A developer executes the same goal with a local change to a file. The Gradle Enterprise Maven extension tries to load the output from the local build cache, then the remote build cache. Neither contains a matching entry due to the local change, so the goal is executed. The output is stored in the local build cache. Outputs stored in the local build cache can be reused in subsequent builds on that developer’s machine.
A second developer executes that goal without any local changes from the commit that CI built. This time the remote build cache lookup is a hit, the cached output is downloaded and directly copied to the workspace and the local build cache. The goal does not need to be executed.
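The three steps above can be sketched as a small simulation. This is only an illustration of the lookup order described in this guide, not the extension's implementation: the `BuildCache` class, the `run_goal` function, and the string keys are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BuildCache:
    """A toy key/value store standing in for a local or remote build cache."""
    entries: dict = field(default_factory=dict)


def run_goal(key, execute, local, remote, store_remote=False):
    """Return (output, origin): try the local cache, then the remote cache, then execute."""
    if key in local.entries:
        return local.entries[key], "local cache"
    if key in remote.entries:
        output = remote.entries[key]
        local.entries[key] = output  # a remote hit is also copied to the local cache
        return output, "remote cache"
    output = execute()  # no matching entry anywhere, so the goal has to run
    local.entries[key] = output
    if store_remote:  # in this sketch only the CI build pushes to the remote cache
        remote.entries[key] = output
    return output, "executed"


remote = BuildCache()
ci_local, dev1_local, dev2_local = BuildCache(), BuildCache(), BuildCache()

# 1. CI builds the commit and pushes the output to the remote cache.
ci = run_goal("key-of-commit", lambda: "classes", ci_local, remote, store_remote=True)
# 2. A developer with a local change has a different key, so both lookups miss.
dev1 = run_goal("key-of-local-change", lambda: "classes'", dev1_local, remote)
# 3. A second developer on the commit that CI built gets a remote hit.
dev2 = run_goal("key-of-commit", lambda: "classes", dev2_local, remote)

print(ci[1], dev1[1], dev2[1], sep=" | ")  # executed | executed | remote cache
```

Note that the first developer's output stays in that developer's local cache only, which matches the recommendation later in this guide that only CI builds store to the remote cache.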
This guide will show you how to get started with the Maven build cache provided by the Gradle Enterprise Maven Extension. The intended audience is build engineers who are looking to enable it for their existing builds. After you’ve seen the build cache in action, this guide will explain the basic concepts that are important to understand how the build cache works. You’ll learn how to measure the effectiveness of the build cache for your build and how to diagnose and solve common problems. Last but not least, this guide outlines how to roll out the build cache in your organization.
In order to enable build caching for your Maven project, you need to add the Gradle Enterprise Maven Extension to your build. For this purpose, create
.mvn/extensions.xml with the following content in the project root directory:
<extensions>
  <extension>
    <groupId>com.gradle</groupId>
    <artifactId>gradle-enterprise-maven-extension</artifactId>
    <version>1.0.8</version>
  </extension>
</extensions>
In addition, you need to configure the Gradle Enterprise server in
gradle-enterprise.xml. There are multiple locations for this file that allow you to configure settings for your Maven installation, your project, or your local user (cf. user manual). When getting started, it’s usually easiest if you add the configuration to the current project in
<gradleEnterprise>
  <server>
    <url>https://gradle.company.com</url>
  </server>
</gradleEnterprise>
Once you’ve done that, you’re ready to run your first Maven build that uses the build cache.
$ mvn clean verify
...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3.276 s
[INFO] Finished at: 2019-03-15T16:06:09+01:00
[INFO] ------------------------------------------------------------------------
[INFO] 7 goals, 7 executed
[INFO]
[INFO] Publishing build scan...
[INFO] https://gradle.company.com/s/vcmc35bl4dd2w
[INFO]
Be sure to include the
As you can see from the summary line, all goals were executed and none were loaded from cache. That’s not surprising since the build started with an empty cache. The resulting build scan provides a summary of all cache operations in the Build cache section on the Performance page.
In order to get the most out of the build cache, it is important to understand the basic concepts of how it works.
The outputs of a goal are the files it produces when executed. Its inputs are all files and properties that influence its outputs. For example, for the
compile goal of the maven-compiler-plugin all Java source files in
src/main/java are inputs as well as all configuration options (such as compiler flags) that influence the resulting class files, i.e. its outputs.
Artifacts in the build cache are uniquely identified by a build cache key. A build cache key is assigned to each cacheable goal execution when running with the build cache enabled and is used for both loading and storing outputs of goal executions to the build cache. The following inputs contribute to the build cache key for a goal execution: the goal implementation class and its classpath, the names and values of its inputs, and the names of its output properties. Two goal executions can reuse their outputs by using the build cache if their associated build cache keys are the same.
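To make the idea of a build cache key concrete, here is a minimal sketch that combines the contributions listed above into one digest. It is not the extension's algorithm: the `build_cache_key` function, the use of SHA-256, the length-prefix encoding, and all sample values are assumptions made for illustration.

```python
import hashlib


def build_cache_key(goal_class, classpath_hashes, inputs, output_names):
    """Combine goal implementation, classpath, inputs, and output names into one digest."""
    parts = [goal_class, *classpath_hashes]
    # Sort by name so that the order in which inputs are declared does not matter.
    parts += [f"{name}={value}" for name, value in sorted(inputs.items())]
    parts += sorted(output_names)
    digest = hashlib.sha256()
    for part in parts:
        encoded = part.encode("utf-8")
        # Length-prefix every part so that ("ab", "c") and ("a", "bc") hash differently.
        digest.update(len(encoded).to_bytes(4, "big"))
        digest.update(encoded)
    return digest.hexdigest()


# Hypothetical goal execution: class name, hashes, and property names are made up.
base = build_cache_key(
    "org.example.CompileGoal", ["cp-hash-1"],
    {"source": "1.8", "sources": "src-hash-1"}, ["outputDirectory"],
)
same = build_cache_key(
    "org.example.CompileGoal", ["cp-hash-1"],
    {"sources": "src-hash-1", "source": "1.8"}, ["outputDirectory"],
)
changed = build_cache_key(
    "org.example.CompileGoal", ["cp-hash-1"],
    {"source": "11", "sources": "src-hash-1"}, ["outputDirectory"],
)

print(base == same)     # True: identical inputs give the same key
print(base == changed)  # False: a changed compiler option gives a new key
```

Only executions whose keys are equal, such as `base` and `same`, could share an entry in the build cache.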
A goal execution is said to have reproducible outputs if it will always generate the same outputs given the same inputs. Some goals add extra information to their output that doesn’t depend on its inputs, e.g. a code generator might add a timestamp to the generated files. In such a case, re-executing the goal will result in different outputs. Consequently, goals that use these outputs as their inputs will need to be re-executed.
When a goal is cacheable the very nature of goal output caching ensures that its executions will have the same outputs for a given set of inputs. Therefore, cacheable goals should have reproducible outputs. Otherwise, the result of executing the goal and loading its outputs from cache may be different, which can lead to hard-to-diagnose cache misses.
The outputs of a goal can only be loaded from cache if it has stable inputs. Unstable inputs result in frequent, unnecessary cache misses. Goals frequently depend on outputs of other goals as their input. For example, compiling tests depends on the result of compiling production code. Thus, in order for goals to have stable inputs, the goals they depend on should have reproducible outputs.
While we acknowledge that creating outputs that contain volatility (such as build timestamps) is a common practice for Maven builds, we see this as an antipattern because it drastically reduces the probability of cache hits. If, for example, one project in a multi-project build generates a build timestamp, the build cache has to assume that this timestamp is used by downstream projects. Therefore, all of them have to be rebuilt even if the timestamp is not actually used.
Timestamps in particular are of dubious significance. What does a timestamp tell us about the origin of the artifact? Does it help us to track it back to the CI job that created it? The answer is almost always "no", because the time at which the CI job was started is usually not identical to the time at which the artifact was generated. Instead, you should use something that uniquely identifies the code that was used to produce the artifact. A good candidate for this is the SCM revision number or commit ID.
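The difference between a timestamp and a commit ID can be shown with a small sketch. The `build.stamp` property name, the commit ID `4f2a9c1`, and the helper functions are hypothetical; the point is only that two builds of the same commit produce identical content when stamped with the revision, and different content when stamped with the time.

```python
import hashlib


def render_build_properties(version, stamp):
    """Render the contents of a hypothetical build.properties file."""
    return f"version={version}\nbuild.stamp={stamp}\n"


def content_hash(text):
    """Hash the file contents the way a cache would fingerprint an input file."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Two builds of the same commit, started one minute apart.
timestamped = [
    render_build_properties("1.0", moment)
    for moment in ("2019-03-15T16:06:09", "2019-03-15T16:07:09")
]
revisioned = [render_build_properties("1.0", "4f2a9c1") for _ in range(2)]

print(content_hash(timestamped[0]) == content_hash(timestamped[1]))  # False
print(content_hash(revisioned[0]) == content_hash(revisioned[1]))    # True
```

Any goal that consumes the timestamped file sees a new input on every build, while the revision-stamped file only changes when the code does.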
Having stable inputs is crucial for cacheable goals. However, achieving byte for byte identical inputs for each goal can be challenging. Sanitizing the output of a goal to remove unnecessary information is often a good approach, but sometimes it’s impossible to remove all volatility.
This is where input normalization comes into play. Input normalization is used to determine if two goal inputs are essentially the same. The extension uses normalized inputs when determining if a cached result can be re-used instead of executing the goal, e.g. by only considering the paths of input files relative to the project directory.
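As an illustration of path normalization, the sketch below fingerprints input files by their path relative to the project directory plus their content, so the absolute checkout location does not matter. The `fingerprint` and `checkout` helpers and the directory names are assumptions for the example and do not reflect how the extension is implemented.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(project_dir, input_files):
    """Hash input files by project-relative path and content, ignoring the absolute location."""
    root = Path(project_dir)
    relative = sorted((Path(f).relative_to(root).as_posix(), Path(f)) for f in input_files)
    digest = hashlib.sha256()
    for name, path in relative:
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")  # separator between the path and the content hash
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return digest.hexdigest()


def checkout(parent, name):
    """Create a tiny hypothetical project under parent/name and return (root, files)."""
    root = Path(parent) / name
    source = root / "src" / "main" / "java" / "Api.java"
    source.parent.mkdir(parents=True)
    source.write_text("class Api {}\n", encoding="utf-8")
    return root, [source]


with tempfile.TemporaryDirectory() as tmp:
    developer = fingerprint(*checkout(tmp, "alice/project"))
    ci_agent = fingerprint(*checkout(tmp, "ci/workspace/project"))

print(developer == ci_agent)  # True: same relative paths and contents
```

Had the absolute paths been hashed instead, the two checkouts would never share a cache entry.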
The build cache understands the concept of a runtime classpath and uses tailored input normalization to avoid unnecessary work such as rerunning tests. For jar files on runtime classpaths, file timestamps and the order of the entries are ignored. This means that a rebuilt jar file with unchanged contents would be considered the same runtime classpath input.
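The following sketch shows one way such a normalization could work: a jar is fingerprinted by its sorted entry names and entry contents only, so entry order and timestamps drop out. The function names are made up and the extension's actual strategy may differ in its details.

```python
import hashlib
import io
import zipfile


def make_jar(entries, date_time):
    """Build an in-memory jar from (name, bytes) pairs, all stamped with date_time."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        for name, data in entries:
            jar.writestr(zipfile.ZipInfo(name, date_time=date_time), data)
    return buffer.getvalue()


def runtime_classpath_fingerprint(jar_bytes):
    """Fingerprint a jar by sorted entry names and contents, ignoring order and timestamps."""
    digest = hashlib.sha256()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in sorted(jar.namelist()):
            digest.update(name.encode("utf-8"))
            digest.update(b"\0")
            digest.update(hashlib.sha256(jar.read(name)).digest())
    return digest.hexdigest()


first = make_jar([("a/A.class", b"A"), ("b/B.class", b"B")], (2019, 3, 15, 16, 6, 8))
rebuilt = make_jar([("b/B.class", b"B"), ("a/A.class", b"A")], (2019, 3, 16, 9, 0, 0))

print(first == rebuilt)  # False: the raw bytes differ
print(runtime_classpath_fingerprint(first) == runtime_classpath_fingerprint(rebuilt))  # True
```

A plain hash of the jar file would report a change after every rebuild, even though nothing that affects runtime behavior is different.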
Your classpaths may contain files that are not relevant for running or testing your code. Typical examples are property files containing the current time, an SCM revision number, or commit ID. If left unchecked, such property files trigger a rerun of your tests on every build because the extension needs to assume that your code is making decisions based on the contents of these files. By default, the extension ignores the contents of
pom.properties in all subfolders of
META-INF/maven/ on the classpath. You can configure additional files to be ignored in your
pom.xml. Please refer to the Normalization section of the extension user manual for details.
The Java compiler only considers the signatures of the classes on the classpath. The extension uses this knowledge to avoid recompiling your sources when only an implementation detail on the classpath has changed.
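The principle behind compile avoidance can be illustrated by analogy. The sketch below hashes only the public function signatures of a Python module, standing in for the signatures of Java classes; it says nothing about how the extension actually inspects class files, and `signature_hash` is a hypothetical helper.

```python
import ast
import hashlib


def signature_hash(source):
    """Hash only the public function signatures of a Python module, not the bodies."""
    signatures = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef) and not node.name.startswith("_"):
            params = ",".join(arg.arg for arg in node.args.args)
            signatures.append(f"{node.name}({params})")
    return hashlib.sha256("\n".join(sorted(signatures)).encode("utf-8")).hexdigest()


original = "def get_the_answer():\n    return 42\n"
new_body = "def get_the_answer():\n    return 43\n"
new_method = original + "def get_the_question():\n    return '?'\n"

print(signature_hash(original) == signature_hash(new_body))    # True
print(signature_hash(original) == signature_hash(new_method))  # False
```

A changed method body leaves the hash untouched, so dependents would not need recompiling, whereas a new public method changes the hash and forces recompilation.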
However, if there are annotation processors on the classpath, the extension needs to consider all implementation details, because annotation processors are executed during compilation. This disables compile avoidance and lowers your cache hit ratio. The extension will detect this and issue a build warning:
[WARNING] The following annotation processors were found on the classpath: [com.acme.SomeAnnotationProcessor]. This significantly reduces the cache hit ratio. Please use the <annotationProcessorPaths> configuration element of the compiler plugin to declare the processors instead. If you did not intend to use the processors above (e.g. they were leaked by a dependency), you can use the <proc>none</proc> option to disable annotation processing.
To fix this, please declare your annotation processors explicitly using the compiler plugin’s
<annotationProcessorPaths> configuration. If you don’t want to use annotation processors at all (and they are only on your classpath by accident), you can use the
<proc>none</proc> option to tell the compiler and the extension that these processors should be ignored.
The extension tracks all known inputs and outputs of the supported goals. Sometimes your goals may read additional inputs or produce additional outputs. For example, your integration tests might read files from the non-standard
src/test/samples folder. Or an annotation processor might generate an SQL schema to the non-standard location
target/schema. In order for caching to work correctly, you need to specify these additional inputs and outputs.
We’ve talked quite a bit about cacheable goals, which implies there are non-cacheable ones, too. Maven goals do not declare their inputs and outputs so there is no generic way of making them cacheable. The extension supports a set of well-known goals, e.g. the
testCompile goals of the maven-compiler-plugin (see the extension user manual for the full list of supported plugins and goals).
Sometimes you may have goals that do things that can’t be cached. For example, you may have a system test that depends on the state of an external system which can’t be tracked as an input. In that case, you need to disable build caching for that particular plugin or goal execution.
Now that we understand the most important concepts of build caching, let us walk through an example of how to measure and improve cache effectiveness. We will be using the unstable-inputs-example project to illustrate the steps. We will run several scenarios in order to find potential causes of cache misses. We recommend that you run your project through the same set of scenarios before rolling out the cache in your organization. This will ensure a high cache hit ratio from the start.
Let’s start off by running the build for the first time:
mvn clean verify
When a goal is executed, the extension first checks the local build cache for stored build results that may be reused. If no result is found in the local build cache, the remote build cache is queried. If neither provides a result, the goal is executed and the outputs of all cacheable goals are stored in the local build cache. Since we just activated the build cache for the project, the local build cache as well as the remote build cache are empty and all goals are executed. This is reflected in the corresponding build scan’s goal execution page.
By clicking on the executed goals, we can get more details about them in the timeline view.
When we run the build a second time with a populated local cache, the build results of cacheable goals should be retrieved from the cache. However, some supported goals were not cacheable. By taking a closer look at the timeline, we can see the reason was undeclared inputs.
The extension automatically checks all command line arguments of cacheable goals for well-known paths that represent undeclared inputs and outputs. For example, in the above build scan, all executions of the
surefire:test goal are not cacheable because they pass the
src/test/samples directory using a system property.
In order to remedy the situation, you should declare the directory as an additional input for executions of the maven-surefire-plugin.
<plugin>
  <groupId>com.gradle</groupId>
  <artifactId>gradle-enterprise-maven-extension</artifactId>
  <version>1.0.8</version>
  <configuration>
    <gradleEnterprise>
      <plugins>
        <plugin>
          <artifactId>maven-surefire-plugin</artifactId>
          <inputs>
            <fileSets>
              <fileSet>
                <name>samples</name>
                <paths>
                  <path>src/test/samples</path>
                </paths>
              </fileSet>
            </fileSets>
          </inputs>
        </plugin>
      </plugins>
    </gradleEnterprise>
  </configuration>
</plugin>
When we run the build another time, the build results of cacheable goals should be retrieved from the cache. However, some cacheable goals were executed again, telling us that they must have unstable inputs. We’ll need to find and fix those.
In order to identify which inputs changed between builds, we can use the
mvn -q -Dorg.slf4j.simpleLogger.log.gradle.goal.fingerprint=trace clean verify | tee log1.txt
mvn -q -Dorg.slf4j.simpleLogger.log.gradle.goal.fingerprint=trace clean verify | tee log2.txt
This writes all the normalized inputs to files that we can then compare using the
diff log1.txt log2.txt
67c67
< [TRACE] Cache key: a7dafb9be698e4b2c3978690cf9a1853
---
> [TRACE] Cache key: 22822d129631bde1bef50f3bb8dc1338
97,98c97,98
< [TRACE] Fingerprint for input file property classesDirectory, using CLASSPATH strategy: 534e939040586d21febbc60f263159e6
< [TRACE] - <projectDir>/target/classes/build.properties (normalized to 'build.properties'): abdb0f3a64d2dc7ca14752b3b6ff5875
---
> [TRACE] Fingerprint for input file property classesDirectory, using CLASSPATH strategy: 82e047d840f88737d2fd0b45ff09c921
> [TRACE] - <projectDir>/target/classes/build.properties (normalized to 'build.properties'): 66f666ebbbda85cf856c6473ee9e4f3d
The diff shows us that the
build.properties file is the culprit. We have several options to stabilize this input. We could decide to completely remove the timestamp property, since it is probably serving no important purpose in our application. We could move the timestamp generation to a Maven profile that is only used on release builds, so the timestamp no longer affects day-to-day development. Or we can use Normalization to ignore the changing file for the purposes of cache key calculation.
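For larger builds, reading a raw diff can become tedious. As a rough sketch, the two trace logs could also be compared with a short script that reports only the input files whose hashes changed. The regular expression is modelled on the file lines shown in the diff output above; the rest of the log format, the `Api.class` line, and the function names are assumptions.

```python
import re

# Matches lines such as:
# [TRACE] - <projectDir>/target/classes/build.properties (normalized to 'build.properties'): abdb...
FILE_LINE = re.compile(
    r"\[TRACE\] - (?P<path>.+?) \(normalized to '(?P<normalized>.+?)'\): (?P<hash>[0-9a-f]+)"
)


def file_hashes(log_text):
    """Map each fingerprinted file path to the set of hashes logged for it."""
    hashes = {}
    for match in FILE_LINE.finditer(log_text):
        hashes.setdefault(match["path"], set()).add(match["hash"])
    return hashes


def changed_inputs(first_log, second_log):
    """Return the paths whose logged hashes differ between two trace logs."""
    first, second = file_hashes(first_log), file_hashes(second_log)
    paths = first.keys() | second.keys()
    return sorted(path for path in paths if first.get(path) != second.get(path))


# Shortened sample logs; the Api.class line is invented for the example.
log1 = """[TRACE] Cache key: a7dafb9be698e4b2c3978690cf9a1853
[TRACE] - <projectDir>/target/classes/build.properties (normalized to 'build.properties'): abdb0f3a64d2dc7ca14752b3b6ff5875
[TRACE] - <projectDir>/target/classes/Api.class (normalized to 'Api.class'): 0123456789abcdef0123456789abcdef
"""
log2 = """[TRACE] Cache key: 22822d129631bde1bef50f3bb8dc1338
[TRACE] - <projectDir>/target/classes/build.properties (normalized to 'build.properties'): 66f666ebbbda85cf856c6473ee9e4f3d
[TRACE] - <projectDir>/target/classes/Api.class (normalized to 'Api.class'): 0123456789abcdef0123456789abcdef
"""

print(changed_inputs(log1, log2))  # ['<projectDir>/target/classes/build.properties']
```

With real logs, the contents of `log1.txt` and `log2.txt` would be read from disk and passed to `changed_inputs` in place of the sample strings.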
After employing one of these fixes, we get the expected number of cache hits when running
mvn clean verify again.
We have compiled a list of common causes for cache misses and their solutions in the reference manual.
Next, let’s do a small implementation change in the
api project by making the
getTheAnswer method return
43 instead of
42. When we run the build, the compile goal for the
api project is rerun, but the
impl project is not recompiled. This is thanks to the Compile avoidance feature explained earlier. The tests of both
impl are rerun, since they could be affected by the change in behavior. The
unrelated project gets all its outputs from the local cache, as it does not depend on
api. The build fails as expected, since the changed behavior no longer matches the test expectations.
If we add a new public method to the
Api class, both the
impl project are recompiled and retested. The
unrelated project on the other hand gets its outputs from the local cache again, as it does not depend on
Last but not least, we need to ensure that the cache works even across machine boundaries. First, we’ll need to push some outputs to the remote cache, so we can use them from another machine:
mvn clean verify -Dgradle.cache.local.enabled=false -Dgradle.cache.remote.store.enabled=true
Now we can log into another machine, maybe even one using another operating system, and check out the same commit of our project.
Building it using
mvn clean verify -Dgradle.cache.local.enabled=false
should retrieve all goal outputs from the remote cache. For our example project, this works well after the fixes we did earlier. If your project does not retrieve its outputs from the remote cache, follow the steps above to find the changing inputs. Likely candidates include the Java version and Maven version used on each machine. Both are inputs to every goal.
This chapter will show you how you can adjust the extension’s settings to do a safe, staged roll-out of caching throughout your organization.
Once you’ve verified cache effectiveness on your own machine, you’ll probably want to allow a few other colleagues to try it out, without affecting everyone else on the team. You can do this by disabling the cache in the project’s
<gradleEnterprise>
  <buildCache>
    <local>
      <enabled>false</enabled>
    </local>
    <remote>
      <enabled>false</enabled>
    </remote>
  </buildCache>
</gradleEnterprise>
and then letting your early adopters re-enable it in their user home
<gradleEnterprise>
  <buildCache>
    <local>
      <enabled>true</enabled>
    </local>
    <remote>
      <enabled>true</enabled>
    </remote>
  </buildCache>
</gradleEnterprise>
Make sure that your early adopters are seeing the same local build cache hit ratio that you had in your own experiments.
For your CI builds, changing settings in the user home is probably not an option, as that may affect other projects. Instead, you can put your CI settings into a custom file in your project, e.g.
.mvn/gradle-enterprise-ci.xml. You can use the
command line argument to enable that custom configuration for just the builds that you want. At first you may want to do this for a dedicated test pipeline, until you have convinced yourself that caching works well enough to roll it out to your main pipeline.
You can also use this configuration file to enable storing in the remote build cache, so that later builds can benefit from the outputs that your CI agents created.
<gradleEnterprise>
  <buildCache>
    <local>
      <enabled>true</enabled>
    </local>
    <remote>
      <enabled>true</enabled>
      <storeEnabled>true</storeEnabled>
    </remote>
  </buildCache>
</gradleEnterprise>
We strongly recommend letting local developers only load from the remote cache and letting your CI servers store results in the remote cache. For this reason, storing in the remote build cache is disabled by default and has to be explicitly enabled.
Your CI builds will now populate the remote build cache. Your local builds should now get cache hits whenever they execute a goal that has already been executed on CI with the same inputs. Make sure this works well for all your developers.
Many projects have a pipeline with multiple stages, with many steps running in parallel. In order to get the most out of the build cache, we recommend running
mvn clean package -Dmaven.test.skip.exec=true
as your first pipeline stage, so that all subsequent stages can reuse the compiled production and test code.
If you are using ephemeral CI agents, the local build cache will not give you any benefit, since it disappears together with the build agent. You can disable it to save some build time in this case.
The effectiveness of using a remote build cache is largely dictated by the network latency between the build and the cache. Gradle Enterprise provides a built-in cache node at
https://gradle.company.com/cache. This is the node where outputs will be stored and loaded by default. You can install additional nodes and connect them with Gradle Enterprise. See the Build Cache Node User Manual for more details. Using a build cache node that is closer to where the builds are run can significantly reduce build times.
Each team member should configure the closest node in their user home
<gradleEnterprise>
  <server>
    <url>https://gradle.company.com</url>
  </server>
  <buildCache>
    <remote>
      <server>
        <url>https://my-cache/cache/</url>
      </server>
    </remote>
  </buildCache>
</gradleEnterprise>
This guide has introduced the Gradle Enterprise build cache for Maven and explained the underlying concepts. You should now have the knowledge to adapt your own build so it can make effective use of the build cache. In addition, you have learned how to roll out the build cache in your organization. Please refer to the extension user manual for a reference of all available configuration options.
Be aware that your journey does not end here. As with any performance optimization, it’s an ongoing process, not an event. You should invest in keeping your build well behaved and check regularly that you are still making effective use of the build cache. Build scans are an essential tool for keeping builds fast. You can learn more about them in the getting started guide.