Because CockroachDB is designed with high fault tolerance, backups are primarily needed for disaster recovery (i.e., if your cluster loses a majority of its nodes). Isolated issues (such as small-scale node outages) do not require any intervention. However, as an operational best practice, we recommend taking regular backups of your data.
There are two main types of backups: full backups and incremental backups.
Perform backup and restore
You can use the BACKUP statement to efficiently back up your cluster's schemas and data to storage such as AWS S3, Google Cloud Storage, or NFS, and the RESTORE statement to efficiently restore schemas and data as necessary. For more information, see Use Cloud Storage for Bulk Operations.
New in v20.2: You can create schedules for periodic backups in CockroachDB. We recommend using scheduled backups to automate daily backups of your cluster.
Full backups
New in v20.2: Full backups are now available to both core and Enterprise users.
In most cases, it's recommended to take full nightly backups of your cluster. A cluster backup allows you to do the following:
- Restore table(s) from the cluster
- Restore database(s) from the cluster
- Restore a full cluster
To do a cluster backup, use the BACKUP statement:
> BACKUP TO '<backup_location>';
If it's ever necessary, you can use the RESTORE statement to restore a table:
> RESTORE TABLE bank.customers FROM '<backup_location>';
Or to restore a database:
> RESTORE DATABASE bank FROM '<backup_location>';
Or to restore your full cluster:
> RESTORE FROM '<backup_location>';
A full cluster restore can only be run on a target cluster that has never had user-created databases or tables.
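Before attempting a full cluster restore, it can help to confirm that the target cluster holds no user-created databases. The following is a minimal sanity check, not a guarantee: it assumes a freshly started cluster, where only the databases CockroachDB creates by default (typically defaultdb, postgres, and system) should be listed, and it cannot reveal databases that were created and later dropped.

> SHOW DATABASES; -- run on the target cluster before RESTORE FROM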
Incremental backups
If your cluster grows too large for nightly full backups, you can take less frequent full backups (e.g., weekly) with nightly incremental backups. For larger clusters, incremental backups use less storage and complete faster than full backups.
To take incremental backups, you need an Enterprise license.
Periodically run the BACKUP statement to take a full backup of your cluster:
> BACKUP TO '<backup_location>';
Then, create nightly incremental backups based on the full backups you've already created. If you back up to a destination that already contains a full backup, an incremental backup is appended to the full backup in a subdirectory:
> BACKUP TO '<backup_location>';
For an example of how to specify the destination of an incremental backup, see Take Full and Incremental Backups.
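To check what a destination holds after a backup run, one option is the SHOW BACKUP statement, which lists the objects in the backup together with start and end timestamps. This is a sketch that reuses the '<backup_location>' placeholder from the statements above; the exact output columns depend on your CockroachDB version.

> SHOW BACKUP '<backup_location>'; -- same placeholder destination as the BACKUP statements above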
If it's ever necessary, you can then use the RESTORE statement to restore your cluster, database(s), and/or table(s). Restoring from incremental backups requires the previous full and incremental backups. To restore from a destination containing the full backup and the automatically appended incremental backups (stored as subdirectories, as in the example above):
> RESTORE FROM '<backup_location>';
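The same destination can also be used to restore a single database or table instead of the full cluster. The statements below are alternatives, not a sequence; they reuse the bank database and bank.customers table from the full backup examples purely as placeholders.

> RESTORE DATABASE bank FROM '<backup_location>'; -- restore one database, including its incremental layers
> RESTORE TABLE bank.customers FROM '<backup_location>'; -- or restore a single table instead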
RESTORE will re-validate indexes when incremental backups were created by an older version but are restored by a newer version.
Incremental backups created by v20.2.2 and prior v20.2.x releases or v20.1.4 and prior v20.1.x releases may include incomplete data for indexes that were in the process of being created. Therefore, when incremental backups taken by these versions are restored by v20.2.8+, any indexes created during those incremental backups will be re-validated by RESTORE.
Incremental backups with explicitly specified destinations
To explicitly control where your incremental backups go, use the INCREMENTAL FROM syntax:
> BACKUP DATABASE bank \
TO 'gs://acme-co-backup/db/bank/2017-03-29-nightly' \
AS OF SYSTEM TIME '-10s' \
INCREMENTAL FROM 'gs://acme-co-backup/database-bank-2017-03-27-weekly', 'gs://acme-co-backup/database-bank-2017-03-28-nightly' WITH revision_history;
To take incremental backups, you need an Enterprise license.
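When incremental backups are written to explicitly specified destinations, a restore needs the full backup followed by each incremental backup, listed from oldest to newest. A sketch that reuses the locations from the BACKUP example above (the paths are the example's own and are not verified against a real bucket):

> RESTORE DATABASE bank \
FROM 'gs://acme-co-backup/database-bank-2017-03-27-weekly', 'gs://acme-co-backup/database-bank-2017-03-28-nightly', 'gs://acme-co-backup/db/bank/2017-03-29-nightly'; -- full backup first, then incrementals in order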
Examples
Automated full backups
Both core and Enterprise users can use backup scheduling for full backups of clusters, databases, or tables. To create schedules that only take full backups, include the FULL BACKUP ALWAYS clause. For example, to create a schedule for taking full cluster backups:
> CREATE SCHEDULE core_schedule_label
FOR BACKUP INTO 's3://test/schedule-test-core?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x'
RECURRING '@daily'
FULL BACKUP ALWAYS
WITH SCHEDULE OPTIONS first_run = 'now';
schedule_id | name | status | first_run | schedule | backup_stmt
---------------------+---------------------+--------+---------------------------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
588799238330220545 | core_schedule_label | ACTIVE | 2020-09-11 00:00:00+00:00 | @daily | BACKUP INTO 's3://test/schedule-test-core?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH detached
(1 row)
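Once a schedule exists, it can be inspected and managed by its schedule_id. The statements below are a sketch that uses the ID from the sample output above; substitute the ID returned on your own cluster.

> SHOW SCHEDULES; -- list all schedules and their status
> PAUSE SCHEDULE 588799238330220545; -- temporarily stop the schedule from running
> RESUME SCHEDULE 588799238330220545; -- start it again
> DROP SCHEDULE 588799238330220545; -- remove the schedule; existing backups are not deleted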
For more examples of how to schedule full and incremental backups, see CREATE SCHEDULE FOR BACKUP.
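As an illustration of a schedule that combines both backup types, the sketch below takes a full backup weekly and incremental backups daily. The label enterprise_schedule_label and the bucket path are hypothetical, and because the schedule takes incremental backups it requires an Enterprise license.

> CREATE SCHEDULE enterprise_schedule_label
FOR BACKUP INTO 's3://test/schedule-test-enterprise?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x'
RECURRING '@daily'
FULL BACKUP '@weekly'
WITH SCHEDULE OPTIONS first_run = 'now';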
Advanced examples
For examples of advanced BACKUP and RESTORE use cases, see:
- Incremental backups with a specified destination
- Backup with revision history and point-in-time restore
- Locality-aware backup and restore
- Encrypted backup and restore
- Restore into a different database
- Remove the foreign key before restore
- Restoring users from system.users backup
To take incremental backups, backups with revision history, locality-aware backups, and encrypted backups, you need an Enterprise license.