Build Scan Storage


Build Scan data storage management is critical for ensuring the long-term availability of build data while controlling disk usage.

Build Scans can be stored in the Develocity database, or in an object store. By default, Build Scan data is stored in the database that Develocity is configured to connect to.

If you decide to store Build Scan data in an object store, Develocity must be connected to an external object store. Build Scan data can’t be stored in the Develocity-provided embedded object storage.

Disk Space Management

In most installations, storage space usage is dominated by the storage of Build Scan data. The amount of space used is dependent on how many scans are being published and how much information is being recorded for each build. When Build Scan data is stored in the database, compression and deduplication techniques are used. This means that data growth is non-linear; storing data for twice as many builds doesn’t mean that twice the space will be required. When Build Scan data is stored in an object store, each scan is stored as a compressed, self-contained object. This makes estimation of required storage easier: twice as many scans results in roughly twice as much storage being used.
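As a rough illustration of the linear behavior described for object storage, the sketch below multiplies scan volume, average scan size, and retention period. The function name and the sample figures (5,000 scans per day, 2 MiB per scan, 90 days) are assumptions for illustration only, not measured values. The estimate doesn't apply to database storage, where compression and deduplication make growth non-linear.

```python
def estimate_object_storage_bytes(
    scans_per_day: int, avg_scan_bytes: int, retention_days: int
) -> int:
    """Rough estimate: every retained scan is one self-contained object."""
    if min(scans_per_day, avg_scan_bytes, retention_days) < 0:
        raise ValueError("inputs must be non-negative")
    return scans_per_day * avg_scan_bytes * retention_days


MIB = 1024 ** 2
GIB = 1024 ** 3

# Hypothetical workload figures; substitute values observed in your installation.
baseline = estimate_object_storage_bytes(5_000, 2 * MIB, 90)
doubled = estimate_object_storage_bytes(10_000, 2 * MIB, 90)

print(f"baseline: {baseline / GIB:.1f} GiB")
print(f"doubled:  {doubled / GIB:.1f} GiB")
```

Doubling the number of scans doubles the estimate, which is the property that makes object storage easier to size.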

To control the amount of storage space used, we recommend that you configure Develocity to remove Build Scan data based on its age. To avoid running out of disk space, configure automatic deletion of old Build Scan data when the amount of free space drops below a specified percentage, as well as automatic rejection of incoming data when free space drops below a specified percentage. Additionally, the system can be configured to send warning emails when free space drops below a specified percentage.

These settings can be changed by going to “Administration” via the top right user menu, then “Build Scans”. They can also be configured through unattended configuration or inline in your Helm values. See the configuration reference for examples.

A configuration that maintains a predictable Build Scan retention period is:

  1. Specify a maximum Build Scan age

  2. Send a warning email when there is less than 10% of space free

  3. Reject incoming data when there is less than 5% of space free

When storing scans in the database, an alternative configuration that stores as much Build Scan history as space permits is:

  1. Don’t specify a maximum Build Scan age

  2. Automatically delete old Build Scans when there is less than 15% of space free

  3. Send a warning email when there is less than 10% of space free

  4. Reject incoming data when there is less than 5% of space free
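The two configurations above can be compared with a small model of how the thresholds interact. This is a sketch only: the class, the function, and the action names are hypothetical and don't correspond to Develocity settings or internals. It assumes each action triggers when the free-space percentage is strictly below its threshold.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DiskSpacePolicy:
    """Free-space thresholds in percent; None means the action is disabled."""
    delete_old_below: Optional[float]
    warn_below: Optional[float]
    reject_below: Optional[float]


def triggered_actions(policy: DiskSpacePolicy, free_percent: float) -> List[str]:
    """Return the actions whose threshold is above the current free space."""
    checks = [
        ("delete_old_build_scans", policy.delete_old_below),
        ("send_warning_email", policy.warn_below),
        ("reject_incoming_data", policy.reject_below),
    ]
    return [
        name
        for name, threshold in checks
        if threshold is not None and free_percent < threshold
    ]


# Predictable retention: age-based deletion only, so no space-based deletion.
PREDICTABLE_RETENTION = DiskSpacePolicy(None, 10, 5)
# Maximum history: no maximum age, delete old scans when space runs low.
MAXIMUM_HISTORY = DiskSpacePolicy(15, 10, 5)

for free in (20, 12, 8, 4):
    print(free, triggered_actions(MAXIMUM_HISTORY, free))
```

Ordering the thresholds so that deletion triggers first, then the warning, then rejection gives automatic cleanup a chance to recover space before incoming data is refused.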

When enabling automatic deletion of old Build Scans when disk space is low, be mindful that if another process fills the volume that Develocity is using, all Build Scan data will be deleted.

If backups are created on the same volume, make sure to leave enough room for them in your thresholds. For example, if your backups take up 40% of the disk in total, raise each of the recommended thresholds above by 40 percentage points, to 55%, 50%, and 45%. Storing backups on a separate volume simplifies space management.
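The backup adjustment is plain addition, sketched below. The function and the threshold names are hypothetical labels for illustration; only the percentages come from the text above.

```python
from typing import Dict


def shift_thresholds_for_backups(
    thresholds: Dict[str, float], backup_percent: float
) -> Dict[str, float]:
    """Raise every free-space threshold by the share of disk used by backups."""
    if not 0 <= backup_percent < 100:
        raise ValueError("backup_percent must be in [0, 100)")
    shifted = {name: value + backup_percent for name, value in thresholds.items()}
    too_high = [name for name, value in shifted.items() if value >= 100]
    if too_high:
        # A threshold at or above 100% would always be triggered.
        raise ValueError(f"thresholds out of range after shifting: {too_high}")
    return shifted


recommended = {"delete_old": 15, "warn": 10, "reject": 5}
adjusted = shift_thresholds_for_backups(recommended, 40)
print(adjusted)
```

With backups occupying 40% of the disk, this reproduces the 55%, 50%, and 45% values given above.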

Configuring disk space management percentage thresholds is currently only supported when using the embedded database. Please consult your database provider for monitoring and alerts on database disk space when using a user-managed database.

When storing Build Scan data in object storage, the space used in the object store isn’t taken into account in the above calculations. This means that if your installation is running low on database space, the likely cause won’t be Build Scan data, so automatic deletion of Build Scan data won’t recover much space. For this reason, we don’t recommend setting an auto-deletion threshold when storing Build Scan data in an object store (see Build Scan Object Storage below).

Build Scan Object Storage

It’s possible to store Build Scan data in an object store, such as Amazon S3, Google Cloud Storage, or Azure Blob Storage. Using object storage is a trade-off. There are several benefits:

  • High traffic installations typically see an improvement in Build Scan processing and cleanup performance.

  • Cloud-based object storage services are often highly scalable and fault-tolerant, typically more so than an individual database.

  • It’s usually cheaper to store more Build Scans.

  • With the vast majority of Build Scan data in object storage, the main Develocity database typically requires much less storage, and the installation (for embedded databases) or the user-managed database can be provisioned with fewer resources.

  • Database backup management is easier when the database is much smaller, and backups themselves take up less space and are cheaper to store.

However, there are also downsides and limitations:

  • Installation is more complex.

  • More memory must be allocated to the Develocity application (we recommend an increase of 2Gi).

  • The persistent data of Develocity is no longer fully contained within a database backup, because Build Scan data is no longer stored in the database. This makes it more complicated to clone a Develocity instance.

  • Your Develocity installation should be hosted as close as possible to your object storage service. Host your Develocity instance on AWS if you’re using Amazon S3, GCP if you’re using Google Cloud Storage, or Azure if you’re using Azure Blob Storage, etc. Within the same cloud provider’s services, it should also be hosted within the same geographic region.

See the Kubernetes Helm Chart Configuration Guide or Standalone Helm Chart Configuration Guide for object storage configuration details.

There are scenarios where you must specify a custom endpoint URL of your object storage service:

  • If you need to connect to S3 directly from a VPC using a gateway VPC endpoint

  • If your object storage service isn’t S3 but provides an S3-compatible interface

Build Scan Storage Configuration

Once object storage is configured in your Helm chart, perform the following steps to store incoming Build Scan data in the store:

  • Go to Administration → Build Scans

  • Select Store incoming Build Scans in object storage

  • Click Save

  • When prompted, restart Develocity so that the configuration changes are applied.

When using Build Scan object storage, Develocity requires more memory than when using the database. You should increase Develocity’s memory requests and limits; see the configuration reference.

If you manually write a Develocity configuration file to do an unattended installation, see the configuration reference for examples.

If you are using EKS service account credentials, ensure that you have configured Helm with the necessary service account annotation and redeployed.

Once Develocity has restarted, all new Build Scan data will be stored in the configured object store. Existing Build Scans will continue to be loaded from the database and will eventually be evicted according to the configured age-based retention period. If one of your reasons for adopting object storage was to reduce the size of your database, note that the database will only start to shrink once Build Scan data starts aging out after the retention window. To reduce the size of the database, Build Scan data needs to be deleted and the space it took up needs to be reclaimed. Reclamation happens regularly, both daily and weekly.
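To set expectations for when the database will stop holding Build Scan data, the sketch below adds the retention period to the date of the switch. It assumes an age-based retention period is configured and that the last scan written to the database was published just before the switch. The function name and the dates are hypothetical, and the daily and weekly space reclamation runs are not modeled, so the actual reduction in size may lag behind this date.

```python
from datetime import date, timedelta


def last_database_scan_expiry(switch_date: date, retention_days: int) -> date:
    """Date on which the newest database-stored scan exceeds the retention period."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return switch_date + timedelta(days=retention_days)


# Hypothetical example: switched to object storage on 1 January 2024
# with a 90-day retention period.
expiry = last_database_scan_expiry(date(2024, 1, 1), 90)
print(f"Database-stored scans have all aged out by {expiry.isoformat()}")
```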

Build Scans Object Storage Develocity Configuration Reference

To configure Develocity to store incoming Build Scan data in object storage, add the following snippet to your configuration file:

unattended-config.yaml
buildScans:
  incomingStorageType: objectStorage

You should increase Develocity’s memory requests and limits by 2Gi. The default is 6Gi, so the example below sets both to 8Gi:

values.yaml
enterprise:
  resources:
    requests:
      memory: 8Gi # (1)
    limits:
      memory: 8Gi # (1)
(1) If you have already set a custom value here, increase it by 2Gi instead.
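If you script your values.yaml changes, the 2Gi increase can be computed from the current setting. The helper below is a hypothetical sketch, not part of any Develocity or Kubernetes tooling, and it only handles whole-number quantities with the Gi suffix, such as the 6Gi default mentioned above.

```python
import re


def add_gib(quantity: str, extra_gib: int = 2) -> str:
    """Add extra_gib to a memory quantity written as '<integer>Gi'."""
    match = re.fullmatch(r"(\d+)Gi", quantity)
    if match is None:
        # Other Kubernetes quantity formats (Mi, G, decimals) are out of scope here.
        raise ValueError(f"unsupported memory quantity: {quantity!r}")
    return f"{int(match.group(1)) + extra_gib}Gi"


print(add_gib("6Gi"))
print(add_gib("10Gi"))
```

Apply the same result to both the request and the limit so that they stay equal, as in the example above.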

If you are using EKS service account based credentials, ensure that you have configured Helm with the necessary service account annotation and redeployed.

Deprecated S3 Bucket Prefix Configuration

Releases prior to 2023.2 allowed specifying a prefix to use within a bucket when storing Build Scan data. This setting is now deprecated, and all Build Scan data is stored under the prefix build-scans, which was previously the default when no prefix was specified.

If your installation stores a significant amount of Build Scan data under a custom prefix, you can keep using that prefix by adding it as an advanced app parameter in your configuration file:

unattended-config.yaml
advanced:
  app:
    params:
      buildScanStorage.prefix: my/prefix

This override will be removed in a future release. Administrators using a custom prefix are encouraged to migrate their data to the standard prefix and remove the override. See Migrating Build Scan Data to the Standard Prefix for the migration procedure.