This page describes how to take and restore customer-owned backups from CockroachDB Serverless and CockroachDB Dedicated clusters.
The examples on this page provide a quick overview of the backup features you can run to your own storage bucket. For more technical detail on the complete list of backup features, refer to:
- Full backup
- Incremental backup
- Scheduled backup
- Backups with revision history
- Point-in-time restore
- Encrypted backup and restore
- Locality-aware backup and restore
- Locality-restricted backup execution
Examples
Before you begin, connect to your cluster. For guidance on connecting to your CockroachDB Cloud cluster, visit Connect to a CockroachDB Serverless Cluster or Connect to Your CockroachDB Dedicated Cluster.
The examples on this page demonstrate how to back up and restore from your own storage bucket.
Full backup
To take a full backup of a cluster:
BACKUP INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';
To take a full backup of a single database:
BACKUP DATABASE bank INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';
To take a full backup of a single table or view:
BACKUP bank.customers INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';
Backup subdirectories
BACKUP ... INTO
adds a backup to a backup collection location. To view the backup paths in a given collection location (your storage bucket), use SHOW BACKUPS
:
SHOW BACKUPS IN 's3://bucket/path?AUTH=implicit';
path
-------------------------
/2023/12/14-190909.83
/2023/12/20-155249.37
/2023/12/21-142943.73
(3 rows)
When you want to restore a specific backup, add the backup's subdirectory path (e.g., /2023/12/21-142943.73
) to the RESTORE
statement.
Restore
To restore from the most recent backup (full or incremental) in the collection's location, use the LATEST
syntax:
RESTORE FROM LATEST IN 's3://bucket/path?AUTH=implicit';
If you are restoring an incremental backup, the storage location must contain a full backup.
You cannot restore a backup of a multi-region database into a single-region database.
To restore a specific full or incremental backup, specify that backup's subdirectory in the RESTORE
statement. To view the available subdirectories, use SHOW BACKUPS
. If you are restoring an incremental backup, the URI must point to the storage location that contains the full backup:
RESTORE FROM '2023/03/23-213101.37' IN 's3://bucket/path?AUTH=implicit';
Incremental backup
When a BACKUP
statement specifies an existing subdirectory in the collection, explicitly or via the LATEST
keyword, an incremental backup will be added to the default /incrementals
directory at the root of the collection storage location.
To take an incremental backup using the LATEST
keyword:
BACKUP INTO LATEST IN 'external://backup_s3' AS OF SYSTEM TIME '-10s';
To store the backup in an existing subdirectory in the collection:
BACKUP INTO {'subdirectory'} IN 'external://backup_s3' AS OF SYSTEM TIME '-10s';
If you intend to take a full backup, we recommend running BACKUP INTO {collectionURI}
without specifying a subdirectory.
To explicitly control where you store your incremental backups, use the incremental_location
option. For more detail, see this example demonstrating the incremental_location
option.
Scheduled backup
This example creates a schedule for a cluster backup with revision history that is taken every day at midnight:
CREATE SCHEDULE schedule_label
FOR BACKUP INTO 's3://test/backups/schedule_test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x'
WITH revision_history
RECURRING '@daily';
schedule_id | name | status | first_run | schedule | backup_stmt
---------------------+----------------+------------------------------------------------+----------------------------------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------
588796190000218113 | schedule_label | PAUSED: Waiting for initial backup to complete | NULL | @daily | BACKUP INTO LATEST IN 's3://test/schedule-test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached
588796190012702721 | schedule_label | ACTIVE | 2020-09-10 16:52:17.280821+00:00 | @weekly | BACKUP INTO 's3://test/schedule-test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached
(2 rows)
Because the FULL BACKUP
clause is not included, CockroachDB also scheduled a full backup to run @weekly
. This is the default cadence for incremental backups RECURRING
> 1 hour but <= 1 day.
Encrypted backup
You can take and restore encrypted backups in the following ways:
- Using AWS Key Management Service (KMS)
- Using Google Cloud Key Management Service (KMS)
- Using Azure Key Vault
- Using a passphrase
Refer to the Take and Restore Encrypted Backups page for technical detail and a full list of examples.
For example, you can run a backup with AWS KMS with the BACKUP
statement's kms
option:
BACKUP INTO 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}'
WITH kms = 'aws:///{key}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}®ION=us-east-1';
Locality-aware backup
CockroachDB Serverless clusters operate with a different architecture compared to CockroachDB Self-Hosted and CockroachDB Dedicated clusters. These architectural differences have implications for how locality-aware backups can run. Serverless clusters will scale resources depending on whether they are actively in use, which means that it is less likely to have a SQL pod available in every locality. As a result, your Serverless cluster may not have a SQL pod in the locality where the data resides, which can lead to the cluster uploading that data to a storage bucket in a locality where you do have active SQL pods. You should consider this as you plan a backup strategy that must comply with data domiciling requirements.
For example, to create a locality-aware backup where nodes with the locality region=us-west
write backup files to s3://us-west-bucket
, and all other nodes write to s3://us-east-bucket
by default, run:
BACKUP INTO
('s3://us-east-bucket?COCKROACH_LOCALITY=default', 's3://us-west-bucket?COCKROACH_LOCALITY=region%3Dus-west');
When you run the BACKUP
statement for a locality-aware backup, check the following:
- The locality query string parameters must be URL-encoded.
- If you are creating an external connection with
BACKUP
query parameters, you must pass them in uppercase otherwise you will receive anunknown query parameters
error. - A successful locality-aware backup job requires that each node in the cluster has access to each storage location. This is because any node in the cluster can claim the job and become the coordinator node.
You can restore the backup by running:
RESTORE FROM LATEST IN ('s3://us-east-bucket', 's3://us-west-bucket');
Note that the first URI in the list has to be the URI specified as the default
URI when the backup was created. If you have moved your backups to a different location since the backup was originally taken, the first URI must be the new location of the files originally written to the default
location.
To restore from a specific backup, use RESTORE FROM {subdirectory} IN ...
Backup and restore data from userfile
storage
To put files on your CockroachDB cluster without external servers, use userfile
, a per-user file storage.
For information on userfile
commands, visit the following pages:
We recommend starting backups from a time at least 10 seconds in the past using AS OF SYSTEM TIME
. Read our guidance in the Performance section on the BACKUP
page.
Only database and table-level backups are possible when using userfile
as storage. Restoring cluster-level backups will not work because userfile
data is stored in the defaultdb
database, and you cannot restore a cluster with existing table data.
When working on the same cluster, userfile
storage allows for database and table-level backups.
First, run the following statement to backup a database to a directory in the default userfile
space:
BACKUP DATABASE bank INTO 'userfile://defaultdb.public.userfiles_$user/bank-backup' AS OF SYSTEM TIME '-10s';
This directory will hold the files that make up a backup; including the manifest file and data files.
When backing up from a cluster and restoring a database or table that is stored in your userfile
space to a different cluster, you can run cockroach userfile get
to download the backup files to a local machine and cockroach userfile upload --url {CONNECTION STRING}
to upload to the userfile
of the alternate cluster.
BACKUP ... INTO
adds a backup to a collection within the backup destination. The path to the backup is created using a date-based naming scheme by default, unless an explicit subdirectory is passed with the BACKUP
statement. To view the backup paths in a given destination, use SHOW BACKUPS
:
> SHOW BACKUPS IN 'userfile://defaultdb.public.userfiles_$user/bank-backup';
path
------------------------
2021/03/23-213101.37
2021/03/24-172553.85
2021/03/24-210532.53
(3 rows)
Only database and table-level backups are possible when using userfile
as storage. Restoring cluster-level backups will not work because userfile
data is stored in the defaultdb
database, and you cannot restore a cluster with existing table data.
In cases when you need to restore a specific backup, add the backup subdirectory to the RESTORE
statement:
RESTORE DATABASE bank FROM '2021/03/24-210532.53' IN 'userfile://defaultdb.public.userfiles_$user/bank-backup';
It is also possible to run userfile:///bank-backup
as userfile:///
refers to the default path userfile://defaultdb.public.userfiles_$user/
.
To restore from the most recent backup, use RESTORE FROM LATEST IN ...
:
RESTORE FROM LATEST IN 'userfile://defaultdb.public.userfiles_$user/bank-backup';
Once the backup data is no longer needed, delete from the userfile
storage:
cockroach userfile delete bank-backup --url {CONNECTION STRING}
If you use cockroach userfile delete {file}
, it will take as long as the garbage collection to be removed from disk.
To resolve database or table naming conflicts during a restore, see Troubleshooting naming conflicts.