BACKUP

On this page

Warning:

As of May 16, 2023, CockroachDB v21.2 is no longer supported. For more details, refer to the Release Support Policy.

CockroachDB's BACKUP statement allows you to create full or incremental backups of your cluster's schema and data that are consistent as of a given timestamp.

You can backup a full cluster, which includes:

Relevant system tables
All databases
All tables (which automatically includes their indexes)
All views
All scheduled jobs

You can also backup:

An individual database, which includes all of its tables and views.
An individual table, which includes its indexes and views.

BACKUP only backs up entire tables; it does not support backing up subsets of a table.

Because CockroachDB is designed with high fault tolerance, these backups are designed primarily for disaster recovery (i.e., if your cluster loses a majority of its nodes) through RESTORE. Isolated issues (such as small-scale node outages) do not require any intervention.

Note:

The BACKUP ... TO and RESTORE ... FROM syntax is deprecated as of v22.1 and will be removed in a future release.

We recommend using the BACKUP ... INTO {collectionURI} syntax, which creates or adds to a backup collection in your storage location. For restoring backups, we recommend using RESTORE FROM {backup} IN {collectionURI} with {backup} being LATEST or a specific subdirectory.

For guidance on the syntax for backups and restores, see the BACKUP and RESTORE examples.

Considerations

Core users can only take full backups. To use the other backup features, you need an Enterprise license. You can also use CockroachDB Dedicated, which runs full backups daily and incremental backups hourly.
BACKUP is a blocking statement. To run a backup job asynchronously, use the DETACHED option. See the options below.
Backups will export Enterprise license keys during a full cluster backup. When you restore a full cluster with an Enterprise license, it will restore the Enterprise license of the cluster you are restoring from.
Zone configurations present on the destination cluster prior to a restore will be overwritten during a cluster restore with the zone configurations from the backed up cluster. If there were no customized zone configurations on the cluster when the backup was taken, then after the restore the destination cluster will use the zone configuration from the RANGE DEFAULT configuration.
You cannot restore a backup of a multi-region database into a single-region database.
While Cockroach Labs actively tests Amazon S3, Google Cloud Storage, and Azure Storage, we do not test S3-compatible services (e.g., MinIO, Red Hat Ceph).

Tip:

To view the contents of a backup created with the BACKUP statement, use SHOW BACKUP.

Required privileges

Full cluster backups can only be run by members of the admin role. By default, the root user belongs to the admin role.
For all other backups, the user must have read access on all objects being backed up. Database backups require CONNECT privileges, and table backups require SELECT privileges. Backups of user-defined schemas, or backups containing user-defined types, require USAGE privileges.
BACKUP requires full read and write (including delete and overwrite) permissions to its target destination.

Destination privileges

The destination file URL does not require the admin role in the following scenarios:

S3 and GS using SPECIFIED (and not IMPLICIT) credentials. Azure is always SPECIFIED by default.
Userfile

The destination file URL does require the admin role in the following scenarios:

S3 or GS using IMPLICIT credentials
Use of a custom endpoint on S3
Nodelocal

We recommend using cloud storage for bulk operations.

Warning:

While Cockroach Labs actively tests Amazon S3, Google Cloud Storage, and Azure Storage, we do not test S3-compatible services (e.g., MinIO, Red Hat Ceph).

Synopsis

Parameters

Parameter	Description
`targets`	Back up the listed targets.
`subdirectory`	The name of the specific backup (e.g., `2021/03/23-213101.37`) in the collection to which you want to add an incremental backup. To view available backup subdirectories, use `SHOW BACKUPS IN destination`. If the backup `subdirectory` is not provided, a full backup will be created in the collection using a date-based naming scheme (i.e., `<year>/<month>/<day>-<timestamp>`). Warning: If you use an arbitrary `STRING` as the subdirectory, a new full backup will be created, but it will never be shown in `SHOW BACKUPS IN`. We do not recommend using arbitrary strings as subdirectory names.
`LATEST`	Append an incremental backup to the latest completed full backup's subdirectory.
`destination`	The URL where you want to store the backup. For information about this URL structure, see Backup File URLs.
`timestamp`	Back up data as it existed as of `timestamp`. The `timestamp` must be more recent than your data's garbage collection TTL (which is controlled by the `gc.ttlseconds` replication zone variable).
`backup_options`	Control the backup behavior with a comma-separated list of these options.

Targets

Target	Description
N/A	Backup the cluster. For an example of a full cluster backup, see Backup a cluster.
`DATABASE {database_name} [, ...]`	The name of the database(s) you want to backup (i.e., create backups of all tables and views in the database). For an example of backing up a database, see Backup a database.
`TABLE {table_name} [, ...]`	The name of the table(s) or view(s) you want to backup. For an example of backing up a table or view, see Backup a table or view.

Options

Option	Value	Description
`revision_history`	N/A	Create a backup with full revision history, which records every change made to the cluster within the garbage collection period leading up to and including the given timestamp.
`encryption_passphrase`	`STRING`	The passphrase used to encrypt the files (`BACKUP` manifest and data files) that the `BACKUP` statement generates. This same passphrase is needed to decrypt the file when it is used to restore and to list the contents of the backup when using `SHOW BACKUP`. There is no practical limit on the length of the passphrase.
`DETACHED`	N/A	When a backup runs in `DETACHED` mode, it will execute asynchronously. The job ID will be returned after the backup job creation completes. Note that with `DETACHED` specified, further job information and the job completion status will not be returned. For more on the differences between the returned job data, see the example below. To check on the job status, use the `SHOW JOBS` statement. To run a backup within a transaction, use the `DETACHED` option.
`kms`	`STRING`	The key management service (KMS) URI (or a comma-separated list of URIs) used to encrypt the files (`BACKUP` manifest and data files) that the `BACKUP` statement generates. This same KMS URI is needed to decrypt the file when it is used to restore and to list the contents of the backup when using `SHOW BACKUP`. Currently, only AWS KMS is supported.

Backup file URLs

CockroachDB uses the URL provided to construct a secure API call to the service you specify. The URL structure depends on the type of file storage you are using. For more information, see the following:

Tip:

Backups support Amazon S3 storage classes. For more detail, see Additional cloud storage feature support.

Functional details

Object dependencies

Dependent objects must be backed up at the same time as the objects they depend on.

Object	Depends On
Table with foreign key constraints	The table it `REFERENCES`; however, this dependency can be removed during the restore.
Table with a sequence	The sequence it uses; however, this dependency can be removed during the restore.
Views	The tables used in the view's `SELECT` statement.

Users and privileges

The system.users table stores your users and their passwords. To restore your users and privilege grants, do a cluster backup and restore the cluster to a fresh cluster with no user data. You can also backup the system.users table, and then use this procedure.

Performance

The BACKUP process minimizes its impact to the cluster's performance by distributing work to all nodes. Each node backs up only a specific subset of the data it stores (those for which it serves writes; more details about this architectural concept forthcoming), with no two nodes backing up the same data.

BACKUP, like any read, cannot export a range if the range contains an unresolved intent. While you typically will want bulk, background jobs like BACKUP to have as little impact on your foreground traffic as possible, it's more important for backups to actually complete (which maintains your recovery point objective (RPO)). Unlike a normal read transaction that will block until any uncommitted writes it encounters are resolved, BACKUP will block only for a configurable duration before invoking priority to ensure it can complete on-time.

We recommend always starting backups with a specific timestamp at least 10 seconds in the past. For example:

> BACKUP...AS OF SYSTEM TIME '-10s';

This improves performance by decreasing the likelihood that the BACKUP will be retried because it contends with other statements/transactions. However, because AS OF SYSTEM TIME returns historical data, your reads might be stale. Taking backups with AS OF SYSTEM TIME '-10s' is a good best practice to reduce the number of still-running transactions you may encounter, since the backup will take priority and will force still-running transactions to restart after the backup is finished.

BACKUP will initially ask individual ranges to backup but to skip if they encounter an intent. Any range that is skipped is placed at the end of the queue. When BACKUP has completed its initial pass and is revisiting ranges, it will ask any range that did not resolve within the given time limit (default 1 minute) to attempt to resolve any intents that it encounters and to not skip. Additionally, the backup's transaction priority is then set to high, which causes other transactions to abort until the intents are resolved and the backup is finished.

Note:

To set a target for the amount of backup data written to each backup file, use the bulkio.backup.file_size cluster setting.

See the SET CLUSTER SETTING page for more details on using cluster settings.

Viewing and controlling backups jobs

After CockroachDB successfully initiates a backup, it registers the backup as a job, and you can do the following:

Action	SQL Statement
View the backup status	`SHOW JOBS`
Pause the backup	`PAUSE JOB`
Resume the backup	`RESUME JOB`
Cancel the backup	`CANCEL JOB`

You can also visit the Jobs page of the DB Console to view job details. The BACKUP statement will return when the backup is finished or if it encounters an error.

Note:

The presence of the BACKUP MANIFEST file in the backup subdirectory is an indicator that the backup job completed successfully.

Examples

Per our guidance in the Performance section, we recommend starting backups from a time at least 10 seconds in the past using AS OF SYSTEM TIME.

The following examples make use of:

Amazon S3 connection strings. For guidance on connecting to other storage options or using other authentication parameters instead, read Use Cloud Storage for Bulk Operations.
The default AUTH=specified parameter. For guidance on using AUTH=implicit authentication with Amazon S3 buckets instead, read Use Cloud Storage for Bulk Operations — Authentication.

Note:

The BACKUP ... TO and RESTORE ... FROM syntax is deprecated as of v22.1 and will be removed in a future release.

For guidance on the syntax for backups and restores, see the BACKUP and RESTORE examples.

Backup a cluster

To take a full backup of a cluster:

> BACKUP INTO \
's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';

Backup a database

To take a full backup of a single database:

> BACKUP DATABASE bank \
INTO 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';

To take a full backup of multiple databases:

> BACKUP DATABASE bank, employees \
INTO 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';

Backup a table or view

To take a full backup of a single table or view:

> BACKUP bank.customers \
INTO 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';

To take a full backup of multiple tables:

> BACKUP bank.customers, bank.accounts \
INTO 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';

Specify a subdirectory for backups

To store the backup in a specific subdirectory in the storage location:

BACKUP DATABASE bank INTO 'subdirectory' IN 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';

Backup all tables in a schema

New in v21.2: To back up all tables in a specified schema, use a wildcard with the schema name:

> BACKUP test_schema.*
INTO 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';

Alternatively, use a fully qualified name: database.schema.*.

With this syntax, schemas will be resolved before databases. test_object.* will resolve to a schema of test_object within the set current database before matching to a database of test_object.

If a database and schema have the same name, such as bank.bank, running BACKUP bank.* will result in the schema resolving first. All the tables within that schema will be backed up. However, if this were to be run from a different database that does not have a bank schema, all tables in the bank database will be backed up.

See Name Resolution for more details on how naming hierarchy and name resolution work in CockroachDB.

Create incremental backups

If you backup to a destination already containing a full backup, an incremental backup will be appended to the full backup's path with a date-based name (e.g., 20210324):

> BACKUP INTO LATEST IN \
's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';

Run a backup asynchronously

Use the DETACHED option to execute the backup job asynchronously:

> BACKUP INTO \
's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s'
WITH DETACHED;

The job ID is returned after the backup job creation completes:

        job_id
----------------------
  592786066399264769
(1 row)

Without the DETACHED option, BACKUP will block the SQL connection until the job completes. Once finished, the job status and more detailed job data is returned:

job_id             |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+--------
652471804772712449 | succeeded |                  1 |   50 |             0 |  4911
(1 row)

Back up with an S3 storage class

New in v21.2.6: To associate your backup objects with a specific storage class in your Amazon S3 bucket, use the S3_STORAGE_CLASS parameter with the class. For example, the following S3 connection URI specifies the INTELLIGENT_TIERING storage class:

BACKUP DATABASE movr INTO 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}&S3_STORAGE_CLASS=INTELLIGENT_TIERING' AS OF SYSTEM TIME '-10s';

Use the parameter to set one of these storage classes listed in Amazon's documentation. For more general usage information, see Amazon's Using Amazon S3 storage classes documentation.

Warning:

Incremental backups are not compatible with the S3 Glacier Flexible Retrieval or Glacier Deep Archive storage classes. Incremental backups require ad-hoc reading of previous backups. The Glacier Flexible Retrieval or Glacier Deep Archive storage classes do not allow immediate access to S3 objects without first restoring the objects. See Amazon's documentation on Restoring an archived object for more detail.

Advanced examples

For examples of advanced BACKUP and RESTORE use cases, see:

Cockroach
University

Docs Hub

BACKUP

Considerations

Required privileges

Destination privileges

Synopsis

Parameters

Targets

Options

Backup file URLs

Functional details

Object dependencies

Users and privileges

Performance

Viewing and controlling backups jobs

Examples

Backup a cluster

Backup a database

Backup a table or view

Specify a subdirectory for backups

Backup all tables in a schema

Create incremental backups

Run a backup asynchronously

Back up with an S3 storage class

Advanced examples

See also

Cockroach University

Docs Hub

Cockroach University

Docs Hub

BACKUP

Considerations

Required privileges

Destination privileges

Synopsis

Parameters

Targets

Options

Backup file URLs

Functional details

Object dependencies

Users and privileges

Performance

Viewing and controlling backups jobs

Examples

Backup a cluster

Backup a database

Backup a table or view

Specify a subdirectory for backups

Backup all tables in a schema

Create incremental backups

Run a backup asynchronously

Back up with an S3 storage class

Advanced examples

See also

Cockroach
University

Cockroach
University