The RESTORE
statement restores your cluster's schemas and data from a BACKUP
stored on services such as AWS S3, Google Cloud Storage, or NFS.
Because CockroachDB is designed with high fault tolerance, restores are designed primarily for disaster recovery, i.e., restarting your cluster if it loses a majority of its nodes. Isolated issues (such as small-scale node outages) do not require any intervention.
You can restore:
- A full cluster
- Databases
- Tables
For details on restoring across versions of CockroachDB, see Restoring Backups Across Versions.
The BACKUP ... TO
and RESTORE ... FROM
syntax is deprecated as of v22.1 and will be removed in a future release.
We recommend using the BACKUP ... INTO {collectionURI}
syntax, which creates or adds to a backup collection in your storage location. For restoring backups, we recommend using RESTORE FROM {backup} IN {collectionURI}
with {backup}
being LATEST
or a specific subdirectory.
For guidance on the syntax for backups and restores, see the BACKUP
and RESTORE
examples.
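For example, a minimal sketch of the recommended collection-based syntax, reusing the placeholder bucket URI from the examples later on this page:

-- Create (or add to) a backup collection, then restore the most recent backup in it.
BACKUP INTO 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';
RESTORE FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';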
Considerations
- RESTORE cannot restore backups made by newer versions of CockroachDB.
- RESTORE is a blocking statement. To run a restore job asynchronously, use the DETACHED option.
- RESTORE no longer requires an Enterprise license, regardless of the options passed to it or to the backup it is restoring.
- Zone configurations present on the destination cluster prior to a restore will be overwritten during a cluster restore with the zone configurations from the backed-up cluster. If there were no customized zone configurations on the cluster when the backup was taken, then after the restore the destination cluster will use the zone configuration from the RANGE DEFAULT configuration.
- You cannot restore a backup of a multi-region database into a single-region database.
- When the exclude_data_from_backup parameter is set on a table, the table will not contain row data when restored.
Required privileges
- Full cluster restores can only be run by members of the admin role. By default, the root user belongs to the admin role.
- For all other restores, the user must have write access (CREATE or INSERT) on all objects affected.
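For instance, a minimal sketch of granting that access before a table restore, assuming a hypothetical target database newdb and user restore_user:

-- Allow restore_user to create the restored tables in the target database.
GRANT CREATE ON DATABASE newdb TO restore_user;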
Source privileges
The source file URL does not require the admin role in the following scenarios:

- S3 and GS using SPECIFIED (and not IMPLICIT) credentials. Azure is always SPECIFIED by default.
- Userfile

The source file URL does require the admin role in the following scenarios:

- S3 or GS using IMPLICIT credentials
- Use of a custom endpoint on S3
- Nodelocal
We recommend using cloud storage for bulk operations.
While Cockroach Labs actively tests Amazon S3, Google Cloud Storage, and Azure Storage, we do not test S3-compatible services (e.g., MinIO, Red Hat Ceph).
Synopsis
Parameters
Parameter | Description
---|---
table_pattern | The table or view you want to restore.
database_name | The name of the database you want to restore (i.e., restore all tables and views in the database). You can restore an entire database only if you had backed up the entire database.
collectionURI | The collection URI where the full backup (and appended incremental backups, if applicable) is stored. For information about this URL structure, see Backup File URLs.
LATEST | Restore the most recent backup in the given collection URI. See the Restore from the most recent backup example.
subdirectory | Restore from a specific subdirectory in the given collection URI. See the Restore a specific backup example.
localityURI | The URI where a locality-aware backup is stored. When restoring from an incremental locality-aware backup, you need to include every locality ever used, even if it was only used once. For information about this URL structure, see Backup File URLs.
AS OF SYSTEM TIME timestamp | Restore data as it existed as of timestamp. You can restore point-in-time data if you took a full or incremental backup with revision history. If the backup was not taken with revision_history, you can use SHOW BACKUP to restore to a time that the backup covers (either in the full or an incremental backup). See the example.
restore_options_list | Control the restore's behavior with these options.
Options
You can control RESTORE behavior using any of the following in the restore_options_list. To set multiple RESTORE options, use a comma-separated list (see the example following the table):
Option | Value | Description
---|---|---
into_db | Database name | Use to change the target database for table restores. The target database must exist before a restore with into_db. (Does not apply to database or cluster restores.) Example: WITH into_db = 'newdb'
new_db_name | Database name | Rename a database during a restore. The existing backed-up database can remain active while the same database is restored with a different name. Example: RESTORE DATABASE movr ... WITH new_db_name = 'new_movr'
skip_missing_foreign_keys | N/A | Use to remove the missing foreign key constraints before restoring. Example: WITH skip_missing_foreign_keys
skip_missing_sequences | N/A | Use to ignore sequence dependencies (i.e., the DEFAULT expression that uses the sequence). Example: WITH skip_missing_sequences
skip_missing_sequence_owners | N/A | Must be used when restoring either a table that was previously a sequence owner or a sequence that was previously owned by a table. Example: WITH skip_missing_sequence_owners
skip_missing_views | N/A | Use to skip restoring views that cannot be restored because their dependencies are not being restored at the same time. Example: WITH skip_missing_views
skip_localities_check | N/A | Use to skip checking localities of a cluster before a restore when there are mismatched cluster regions between the backup's cluster and the target cluster. Example: WITH skip_localities_check
encryption_passphrase | Passphrase used to create the encrypted backup | The passphrase used to decrypt the file(s) that were encrypted by the BACKUP statement.
DETACHED | N/A | When RESTORE runs with DETACHED, the job executes asynchronously and the job ID is returned once the restore job is created. With DETACHED specified, further job information and the job completion status are not returned; to check on the job status, use the SHOW JOBS statement. To run a restore within a transaction, use the DETACHED option. For more on the differences in the returned job data, see the example below.
debug_pause_on | "error" | Use to have a RESTORE job self-pause when it encounters an error. The RESTORE job can then be resumed after the error has been fixed, or canceled to roll back the job. Example: WITH debug_pause_on = 'error'
incremental_location | STRING | Restore an incremental backup from the alternate collection URI the backup was originally taken with. See Restore incremental backups for more detail.
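For instance, a hedged sketch of a table restore that combines several options in one comma-separated list; the table, database, and bucket names are placeholders:

RESTORE TABLE bank.customers FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}'
    WITH into_db = 'newdb', skip_missing_foreign_keys, DETACHED;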
Backup file URLs
CockroachDB uses the URL provided to construct a secure API call to the service you specify. The URL structure depends on the type of file storage you are using. For more information, see the following:
Use Cloud Storage for Bulk Operations
Note: HTTP storage is not supported for BACKUP and RESTORE.
Functional details
You can restore:
- A full cluster
- Databases
- Tables
RESTORE
will only restore the latest data in an object (table, database, cluster), or the latest data as per an AS OF SYSTEM TIME
restore. That is, a restore will not include historical data even if you ran your backup with revision_history
. This means that if you issue an AS OF SYSTEM TIME
query on a restored object, the query will fail or the response will be incorrect because there is no historical data to query. For example, if you restore a table at 2022-07-13 10:38:00
, it is not then possible to read or back up that table at 2022-07-13 10:37:00
or earlier. This is also the case for backups with revision_history
that might try to initiate a revision start time earlier than 2022-07-13 10:38:00
.
You can exclude a table's row data from a backup using the exclude_data_from_backup
parameter. With this parameter set, a table will be empty when restored.
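For example, a minimal sketch, assuming a hypothetical table movr.user_promo_codes whose row data should be omitted from backups:

-- Exclude the table's row data from future backups; after a restore, the table
-- exists with its schema but contains no rows.
ALTER TABLE movr.user_promo_codes SET (exclude_data_from_backup = true);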
Full cluster
A full cluster restore can only be run on a target cluster with no user-created databases or tables. Restoring a full cluster includes:
- All user tables
- Relevant system tables
- All databases
- All tables (which automatically includes their indexes)
- All views
New in v22.1: Also, a full cluster restore will:
- Restore temporary tables to their original database during a full cluster restore.
- Drop the cluster's defaultdb and postgres pre-loaded databases before the restore begins. You can only restore defaultdb and postgres if they are present in the original backup.
When you restore a full cluster with an Enterprise license, it will restore the Enterprise license of the cluster you are restoring from. If you want to use a different license in the new cluster, make sure to update the license after the restore is complete.
Databases
Restoring a database will create a new database and restore all of its tables and views. The created database will have the name of the database in the backup.
RESTORE DATABASE backup_database_name FROM LATEST IN 'your_backup_collection_URI';
New in v22.1:
To restore a database that already exists in a cluster, use the new_db_name
option with RESTORE
to provide a new name for the database. See the Rename a database on restore example.
If dropping or renaming an existing database is not an option, you can use table restore to restore all tables into the existing database by using the WITH into_db
option.
Tables
You can also restore individual tables (which automatically includes their indexes) or views from a backup. This process uses the data stored in the backup to create entirely new tables or views in the target database.
By default, tables and views are restored into a target database matching the name of the database from which they were backed up. If the target database does not exist, you must create it. You can choose to change the target database with the into_db
option.
The target database must not have tables or views with the same name as the tables or views you're restoring. If any of the restore target's names are being used, you can:
- Drop the conflicting objects with DROP TABLE, DROP VIEW, or DROP SEQUENCE, and then restore them (see the sketch after this list). Note that a sequence cannot be dropped while it is being used in a column's DEFAULT expression, so those expressions must be dropped before the sequence is dropped, and recreated after the sequence is recreated. The setval function can be used to set the value of the sequence to what it was previously.
- Restore the table or view into a different database.
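A minimal sketch of the drop-and-restore approach, assuming a hypothetical table bank.customers and sequence bank.customer_seq:

-- Drop the conflicting table, then restore it from the most recent backup.
DROP TABLE bank.customers;
RESTORE TABLE bank.customers FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';
-- If a dropped sequence had to be recreated, reset it to its previous value.
SELECT setval('bank.customer_seq', 101, true);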
RESTORE
only offers table-level granularity; it does not support restoring subsets of a table.
When restoring an individual table that references a user-defined type (e.g., ENUM
), CockroachDB will first check to see if the type already exists. The restore will attempt the following for each user-defined type within a table backup:
- If there is not an existing type in the cluster with the same name, CockroachDB will create the user-defined type as it exists in the backup.
- If there is an existing type in the cluster with the same name that is compatible with the type in the backup, CockroachDB will map the type in the backup to the type in the cluster.
- If there is an existing type in the cluster with the same name but it is not compatible with the type in the backup, the restore will not succeed and you will be asked to resolve the naming conflict. You can do this by either dropping or renaming the existing user-defined type.
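For example, a hedged sketch of resolving such a conflict by renaming (or dropping) the existing type before the restore; the type and table names are hypothetical:

-- Keep the existing type under a different name so the restored type can be created.
ALTER TYPE movr.status RENAME TO status_old;
-- Alternatively, if nothing references the existing type: DROP TYPE movr.status;
RESTORE TABLE movr.rides FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';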
In general, two types are compatible if they are the same kind (e.g., an enum is only compatible with other enums). Additionally, enums are only compatible if they have the same ordered set of elements that have also been created in the same way. For example:
- CREATE TYPE t1 AS ENUM ('yes', 'no') and CREATE TYPE t2 AS ENUM ('yes', 'no') are compatible.
- CREATE TYPE t1 AS ENUM ('yes', 'no') and CREATE TYPE t2 AS ENUM ('no', 'yes') are not compatible.
- CREATE TYPE t1 AS ENUM ('yes', 'no') and CREATE TYPE t2 AS ENUM ('yes'); ALTER TYPE t2 ADD VALUE 'no' are not compatible because they were not created in the same way.
Object dependencies
Dependent objects must be restored at the same time as the objects they depend on.
Object | Depends On
---|---
Table with foreign key constraints | The table it REFERENCES (however, this dependency can be removed during the restore).
Table with a sequence | The sequence.
Views | The tables used in the view's SELECT statement.
Users and privileges
The owner of restored objects will be the user running the restore job. To restore your users and privilege grants, you can do a cluster backup and restore the cluster to a fresh cluster with no user data.
If you are not doing a full cluster restore, the table-level privileges need to be granted to the users after the restore is complete. (By default, the user restoring will become the owner of the restored objects.) To grant table-level privileges after a restore, back up the system.users table, restore users and their passwords, and then grant the table-level privileges.
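For example, a hedged sketch of re-granting table-level privileges once a table restore completes; the table and user names are hypothetical:

-- Re-grant application access on the restored table.
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLE bank.customers TO app_user;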
Restore types
You can either restore from a full backup or from a full backup with incremental backups, based on the backup files you include:
Restore Type | Parameters
---|---
Full backup | Include the path to the full backup destination and the subdirectory of the backup. See the Examples section for syntax of cluster, database, and table restores.
Full backup + incremental backups | Include the path that contains the backup collection and the subdirectory containing the incremental backup. See Restore from incremental backups for an example.
CockroachDB does not support incremental-only restores.
Performance
- The RESTORE process minimizes its impact on the cluster's performance by distributing work to all nodes. Subsets of the restored data (known as ranges) are evenly distributed among randomly selected nodes, with each range initially restored to only one node. Once the range is restored, the node begins replicating it to other nodes.
- When a RESTORE fails or is canceled, partially restored data is properly cleaned up. This can have a minor, temporary impact on cluster performance.
- New in v22.1: A restore job will pause if a node in the cluster runs out of disk space. See Viewing and controlling restore jobs for information on resuming and showing the progress of restore jobs.
- New in v22.1: A restore job will pause instead of entering a failed state if it continues to encounter transient errors once it has retried a maximum number of times. Once the restore has paused, you can either resume or cancel it.
Restoring to multi-region databases
Restoring to a multi-region database is supported with some limitations. This section outlines details and settings that should be considered when restoring into multi-region databases:

- A cluster's regions will be checked before a restore. Mismatched regions between the backup cluster and the restore cluster will be flagged before the restore begins, which allows you to either update the cluster localities or restore with the skip_localities_check option to continue with the restore regardless.
- A database that is restored with the sql.defaults.primary_region cluster setting will have the PRIMARY REGION from this cluster setting assigned to the target database.
- RESTORE supports restoring non-multi-region tables into a multi-region database and sets the table locality as REGIONAL BY TABLE to the primary region of the target database.
- Restoring tables from multi-region databases with table localities set to REGIONAL BY ROW, REGIONAL BY TABLE, REGIONAL BY TABLE IN PRIMARY REGION, and GLOBAL to another multi-region database is supported.
- When restoring a REGIONAL BY TABLE IN PRIMARY REGION table, if the primary region differs between the source and target databases, the table's region will be implicitly changed on restore.
- Restoring a partition of a REGIONAL BY ROW table is not supported.
- REGIONAL BY TABLE and REGIONAL BY ROW tables can be restored only if the regions of the backed-up table match those of the target database. All of the following must be true for RESTORE to be successful:
  - The source database and the destination database have the same set of regions.
  - The regions were added to each of the databases in the same order.
  - The databases have the same primary region.
The following example would have mismatched regions, because the database regions were not added in the same order and the primary regions do not match.

Running on the source database:

ALTER DATABASE source_database SET PRIMARY REGION "us-east1";
ALTER DATABASE source_database ADD REGION "us-west1";

Running on the destination database:

ALTER DATABASE destination_database SET PRIMARY REGION "us-west1";
ALTER DATABASE destination_database ADD REGION "us-east1";

In addition, the following scenario has mismatched regions between the databases, because the regions were not added to the databases in the same order.

Running on the source database:

ALTER DATABASE source_database SET PRIMARY REGION "us-east1";
ALTER DATABASE source_database ADD REGION "us-west1";

Running on the destination database:

ALTER DATABASE destination_database SET PRIMARY REGION "us-west1";
ALTER DATABASE destination_database ADD REGION "us-east1";
ALTER DATABASE destination_database SET PRIMARY REGION "us-east1";
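By contrast, a hedged sketch of a destination database that would match the source above, because the primary region matches and the regions are added in the same order:

ALTER DATABASE destination_database SET PRIMARY REGION "us-east1";
ALTER DATABASE destination_database ADD REGION "us-west1";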
The ordering of regions and how region matching is determined is a known limitation. See the Known Limitations section for the tracking issues on limitations around RESTORE
and multi-region support.
For more on multi-region databases, see the Multi-Region Capabilities Overview.
Viewing and controlling restore jobs
After CockroachDB successfully initiates a restore, it registers the restore as a job, which you can view with SHOW JOBS
.
After the restore has been initiated, you can control it with PAUSE JOB
, RESUME JOB
, and CANCEL JOB
.
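For example, a minimal sketch of controlling a restore job; the job ID shown is hypothetical (use the ID returned by SHOW JOBS or by RESTORE ... WITH DETACHED):

-- Find the restore job, then pause, resume, or cancel it as needed.
SHOW JOBS;
PAUSE JOB 592786066399264769;
RESUME JOB 592786066399264769;
CANCEL JOB 592786066399264769;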
If initiated correctly, the statement returns when the restore is finished or if it encounters an error. In some cases, the restore can continue after an error has been returned (the error message will tell you that the restore has resumed in the background).
Examples
The following examples make use of:
- Amazon S3 connection strings. For guidance on connecting to other storage options or using other authentication parameters instead, read Use Cloud Storage for Bulk Operations.
- The default
AUTH=specified
parameter. For guidance on usingAUTH=implicit
authentication with Amazon S3 buckets instead, read Use Cloud Storage for Bulk Operations — Authentication.
There are two ways to specify a backup to restore:
- Restore the most recent backup by using LATEST.
- Restore a specific backup by using the backup's subdirectory.
The examples in this section demonstrate restoring from the most recent backup using the LATEST
syntax.
View the backup subdirectories
BACKUP ... INTO
adds a backup to a backup collection location. To view the backup paths in a given collection location, use SHOW BACKUPS
:
> SHOW BACKUPS IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';
path
-------------------------
/2021/12/14-190909.83
/2021/12/20-155249.37
/2021/12/21-142943.73
(3 rows)
When you want to restore a specific backup, add the backup's subdirectory path (e.g., /2021/12/21-142943.73
) to the RESTORE
statement. For details on viewing the most recent backup, see SHOW BACKUP FROM {subdirectory} in {collectionURI}
.
Restore the most recent backup
New in v22.1:
To restore from the most recent backup in the collection's location, use the LATEST
syntax:
RESTORE FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';
Restore a specific backup
To restore a specific backup, use the backup's subdirectory in the collection's location:
RESTORE FROM '2021/03/23-213101.37' IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';
To view the available subdirectories, use SHOW BACKUPS
.
Restore a cluster
To restore a full cluster:
RESTORE FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';
To view the available subdirectories, use SHOW BACKUPS
.
Restore a database
To restore a database:
RESTORE DATABASE bank FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';
To view the available subdirectories, use SHOW BACKUPS
.
RESTORE DATABASE
can only be used if the entire database was backed up.
Restore a table
To restore a single table:
> RESTORE TABLE bank.customers FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';
To restore multiple tables:
> RESTORE TABLE bank.customers, bank.accounts FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';
To view the available subdirectories, use SHOW BACKUPS
.
Restore from incremental backups
To restore the most recent incremental backup from a location containing the full and incremental backup:
RESTORE DATABASE bank FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';
When you restore from an incremental backup, you're restoring the entire table, database, or cluster. CockroachDB uses both the latest (or a specific) incremental backup and the full backup during this process. You cannot restore an incremental backup without a full backup. Furthermore, it is not possible to restore over a table, database, or cluster with existing data. See Restore types for detail on the types of backups you can restore.
RESTORE
will re-validate indexes when incremental backups are created from an older version (v20.2.2 and earlier or v20.1.4 and earlier), but restored by a newer version (v21.1.0+). These earlier releases may have included incomplete data for indexes that were in the process of being created.
Restore with AS OF SYSTEM TIME
Running a backup with revision history captures every change made within the garbage collection period leading up to and including the given timestamp, which allows you to restore to an arbitrary point-in-time within the revision history.
If you ran a backup without revision_history
, it is still possible to use AS OF SYSTEM TIME
with RESTORE
to target a particular time for the restore. However, your restore will be limited to the times of the full backup and each incremental backup in the chain. In this case, use the following example to restore to a particular time.
First, find the times that are available for a point-in-time-restore by listing the available backup directories in your storage location:
SHOW BACKUPS IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';
path
------------------------
2023/01/18-141753.98
2023/01/23-184816.10
2023/01/23-185448.11
(3 rows)
From the output, choose the required date directory and run the following to get the details of the backup:
SHOW BACKUP '2023/01/23-185448.11' IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}';
database_name | parent_schema_name | object_name | object_type | backup_type | start_time | end_time | size_bytes | rows | is_full_cluster
----------------+--------------------+----------------------------+-------------+-------------+----------------------------+----------------------------+------------+------+------------------
movr | public | vehicle_location_histories | table | full | NULL | 2023-01-23 18:54:48.116975 | 85430 | 1092 | t
movr | public | promo_codes | table | full | NULL | 2023-01-23 18:54:48.116975 | 225775 | 1003 | t
movr | public | user_promo_codes | table | full | NULL | 2023-01-23 18:54:48.116975 | 1009 | 11 | t
NULL | NULL | system | database | incremental | 2023-01-23 18:54:48.116975 | 2023-01-24 00:00:00 | NULL | NULL | t
system | public | users | table | incremental | 2023-01-23 18:54:48.116975 | 2023-01-24 00:00:00 | 0 | 0 | t
system | public | zones | table | incremental | 2023-01-23 18:54:48.116975 | 2023-01-24 00:00:00 | 0 | 0 |
Finally, use the start_time
and end_time
detail to define the required time as part of the AS OF SYSTEM TIME
clause. Run the restore, passing the directory and the timestamp:
RESTORE DATABASE movr FROM '2023/01/23-185448.11' IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}' AS OF SYSTEM TIME '2023-01-23 18:56:48';
Restore a backup asynchronously
Use the DETACHED
option to execute the restore job asynchronously:
> RESTORE FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}'
WITH DETACHED;
The job ID is returned after the restore job creation completes:
job_id
----------------------
592786066399264769
(1 row)
Without the DETACHED
option, RESTORE
will block the SQL connection until the job completes. Once finished, the job status and more detailed job data are returned:
job_id | status | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+--------
652471804772712449 | succeeded | 1 | 50 | 0 | 4911
(1 row)
Other restore usages
Restore tables into a different database
By default, tables and views are restored to the database they originally belonged to. However, using the into_db
option, you can control the target database. Note that the target database must exist prior to the restore.
First, create the new database that you'll restore the table or view into:
> CREATE DATABASE newdb;
Next, restore the table into the newly created database with into_db
:
> RESTORE bank.customers FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}'
WITH into_db = 'newdb';
Rename a database on restore
New in v22.1:
To rename a database on restore, use the new_db_name
option:
RESTORE DATABASE bank FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}'
WITH new_db_name = 'new_bank';
When you run RESTORE
with new_db_name
, the existing database that was originally backed up can remain active:
database_name
--------------+
defaultdb
bank
new_bank
postgres
system
Remove the foreign key before restore
By default, tables with foreign key constraints must be restored at the same time as the tables they reference. However, using the skip_missing_foreign_keys option, you can remove the foreign key constraint from the table and then restore it.
> RESTORE bank.accounts FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}'
WITH skip_missing_foreign_keys;
Restoring users from system.users backup
The system.users
table stores your cluster's usernames and their hashed passwords. To restore them, you must restore the system.users
table into a new database because you cannot drop the existing system.users
table.
After it's restored into a new database, you can write the restored users
table data to the cluster's existing system.users
table.
First, create the new database that you'll restore the system.users
table into:
> CREATE DATABASE newdb;
> RESTORE system.users FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}'
WITH into_db = 'newdb';
After the restore completes, add the users
to the existing system.users
table:
> INSERT INTO system.users SELECT * FROM newdb.users;
> DROP TABLE newdb.users;
Restore from incremental backups in a different location
New in v22.1:
To restore an incremental backup that was taken using the incremental_location
option, you must run the RESTORE
statement with both:
- the collection URI of the full backup
- the
incremental_location
option referencing the incremental backup's collection URI, as passed in the originalBACKUP
statement
RESTORE TABLE movr.users FROM LATEST IN 's3://{bucket_name}?AWS_ACCESS_KEY_ID={key_id}&AWS_SECRET_ACCESS_KEY={access_key}' WITH incremental_location = '{incremental_backup_URI}';
For more detail on using this option with BACKUP
, see Incremental backups with explicitly specified destinations.
Known limitations
- To successfully restore a table into a multi-region database, the regions, and the order in which they were added, must match between the source and destination databases. See the Known Limitations page for detail on ordering and matching regions. Tracking GitHub Issue
- Restoring GLOBAL and REGIONAL BY TABLE tables into a non-multi-region database is not supported. Tracking GitHub Issue