This page describes basic considerations and provides a basic example of migrating data from PostgreSQL to CockroachDB. The information on this page assumes that you have read Migration Overview, which describes the broad phases and considerations of migrating a database to CockroachDB.
The PostgreSQL migration example on this page demonstrates how to use MOLT tooling to update the PostgreSQL schema, perform an initial load of data, and validate the data. These steps are essential when preparing for a full migration.
If you need help migrating to CockroachDB, contact our sales team.
Syntax differences
CockroachDB supports the PostgreSQL wire protocol and is largely compatible with PostgreSQL syntax.
For syntax differences, refer to Features that differ from PostgreSQL.
Unsupported features
The following PostgreSQL features do not yet exist in CockroachDB:
- Triggers. These must be implemented in your application logic.
- Events.
Drop primary key.
Note:Each table must have a primary key associated with it. You can drop and add a primary key constraint within a single transaction.
XML functions.
Column-level privileges.
XA syntax.
Creating a database from a template.
Foreign data wrappers.
Advisory Lock Functions (although some functions are defined with no-op implementations).
Load PostgreSQL data
You can use one of the following methods to migrate PostgreSQL data to CockroachDB:
Use
IMPORT INTO
to migrate CSV, TSV, or Avro data stored via userfile or cloud storage into pre-existing tables on CockroachDB. This option achieves the highest throughput, but requires taking the tables offline to achieve its import speed.Tip:For best practices for optimizing import performance in CockroachDB, see Import Performance Best Practices.Use a third-party data migration tool (e.g., AWS DMS, Qlik, Striim) to load the data.
When migrating from PostgreSQL, you can use
COPY FROM
to copy CSV or tab-delimited data to your CockroachDB tables. This option enables your tables to remain online and accessible. However, it is slower than usingIMPORT INTO
.
The following example uses IMPORT INTO
to perform the initial data load.
Example: Migrate frenchtowns
to CockroachDB
The following steps demonstrate converting a schema, performing an initial load of data, and validating data consistency during a migration.
In the context of a full migration, these steps ensure that PostgreSQL data can be properly migrated to CockroachDB and your application queries tested against the cluster. For details, see the Migration Overview.
Before you begin
The example uses a modified version of the PostgreSQL french-towns-communes-francais
data set and demonstrates how to migrate the schema and data to a Serverless cluster. To follow along with these steps:
Download the
frenchtowns
data set:curl -O https://cockroachdb-migration-examples.s3.us-east-1.amazonaws.com/postgresql/frenchtowns/frenchtowns.sql
Create a
frenchtowns
database on your PostgreSQL instance:createdb frenchtowns
Load the
frenchtowns
data into PostgreSQL, specifying the path of the downloaded file:psql frenchtowns -a -f frenchtowns.sql
Create a free Cloud account, which is used to access the Schema Conversion Tool and create the Serverless cluster.
If you are migrating to a CockroachDB Self-Hosted database, you can export the converted schema from the Schema Conversion Tool and execute the statements in cockroach sql
, or use a third-party schema migration tool such as Alembic, Flyway, or Liquibase.
Step 1. Convert the PostgreSQL schema
Use the Schema Conversion Tool to convert the frenchtowns
schema for compatibility with CockroachDB. The schema has three tables: regions
, departments
, and towns
.
Dump the PostgreSQL
frenchtowns
schema with the followingpg_dump
command:pg_dump --schema-only frenchtowns > frenchtowns_schema.sql
Open the Schema Conversion Tool in the Cloud Console and add a new PostgreSQL schema.
After conversion is complete, review the results. The Summary Report shows that there are errors under Required Fixes. You must resolve these in order to migrate the schema to CockroachDB.
Tip:You can also add your PostgreSQL database credentials to have the Schema Conversion Tool obtain the schema directly from the PostgreSQL database.Missing user: postgres
errors indicate that the SQL userpostgres
is missing from CockroachDB. Click Add User to create the user.Miscellaneous Errors
includes aSELECT pg_catalog.set_config('search_path', '', false)
statement that can safely be removed. Click Delete to remove the statement from the schema.Review the
CREATE SEQUENCE
statements listed under Suggestions. Cockroach Labs does not recommend using a sequence to define a primary key column. For more information, see Unique ID best practices.For this example, Acknowledge the suggestion without making further changes. In practice, after conducting the full migration to CockroachDB, you would modify your CockroachDB schema to use unique and non-sequential primary keys.
Click Retry Migration. The Summary Report now shows that there are no errors. This means that the schema is ready to migrate to CockroachDB.
This example migrates directly to a Serverless cluster. If you are migrating to a CockroachDB Self-Hosted database, you can export the converted schema from the Schema Conversion Tool and execute the statements in
cockroach sql
, or use a third-party schema migration tool such as Alembic, Flyway, or Liquibase.Click Migrate Schema to create a new Serverless cluster with the converted schema. Name the database
frenchtowns
.You can view this database on the Databases page of the Cloud Console.
Step 2. Load the PostgreSQL data
Load the frenchtowns
data into CockroachDB using IMPORT INTO
with CSV-formatted data. IMPORT INTO
requires that you export one file per table with the following attributes:
- Files must be in valid CSV (comma-separated values) or TSV (tab-separated values) format.
- The delimiter must be a single character. Use the
delimiter
option to set a character other than a comma (such as a tab, for TSV format). - Files must be UTF-8 encoded.
- If one of the following characters appears in a field, the field must be enclosed by double quotes:
- Delimiter (
,
by default). - Double quote (
"
). Because the field will be enclosed by double quotes, escape a double quote inside a field by preceding it with another double quote. For example:"aaa","b""bb","ccc"
. - Newline (
\n
). - Carriage return (
\r
).
- Delimiter (
- If a column is of type
BYTES
, it can either be a valid UTF-8 string or a hex-encoded byte literal beginning with\x
. For example, a field whose value should be the bytes1
,2
would be written as\x0102
.
By default, IMPORT INTO
invalidates all foreign key constraints on the target table.
Dump each table in the PostgreSQL
frenchtowns
database to a CSV-formatted file:psql frenchtowns -c "COPY regions TO stdout DELIMITER ',' CSV;" > regions.csv
psql frenchtowns -c "COPY departments TO stdout DELIMITER ',' CSV;" > departments.csv
psql frenchtowns -c "COPY towns TO stdout DELIMITER ',' CSV;" > towns.csv
Host the files where the CockroachDB cluster can access them.
Each node in the CockroachDB cluster needs to have access to the files being imported. There are several ways for the cluster to access the data; for more information on the types of storage
IMPORT INTO
can pull from, see the following:Cloud storage such as Amazon S3 or Google Cloud is highly recommended for hosting the data files you want to import.
The dump files generated in the preceding step are already hosted on a public S3 bucket created for this example.
Open a SQL shell to the CockroachDB
frenchtowns
cluster. To find the command, open the Connect dialog in the Cloud Console and select thefrenchtowns
database and CockroachDB Client option. It will look like:cockroach sql --url "postgresql://{username}@{hostname}:{port}/frenchtowns?sslmode=verify-full"
Use
IMPORT INTO
to import each PostgreSQL dump file into the corresponding table in thefrenchtowns
database.The following commands point to a public S3 bucket where the
frenchtowns
data dump files are hosted for this example.Tip:You can add therow_limit
option to specify the number of rows to import. For example,row_limit = '10'
will import the first 10 rows of the table. This option is useful for finding errors quickly before executing a more time- and resource-consuming import.IMPORT INTO regions CSV DATA ( 'https://cockroachdb-migration-examples.s3.us-east-1.amazonaws.com/postgresql/frenchtowns/regions.csv' );
job_id | status | fraction_completed | rows | index_entries | bytes ---------------------+-----------+--------------------+------+---------------+-------- 893753132185026561 | succeeded | 1 | 26 | 52 | 2338
IMPORT INTO departments CSV DATA ( 'https://cockroachdb-migration-examples.s3.us-east-1.amazonaws.com/postgresql/frenchtowns/departments.csv' );
job_id | status | fraction_completed | rows | index_entries | bytes ---------------------+-----------+--------------------+------+---------------+-------- 893753147892465665 | succeeded | 1 | 100 | 300 | 11166
IMPORT INTO towns CSV DATA ( 'https://cockroachdb-migration-examples.s3.us-east-1.amazonaws.com/postgresql/frenchtowns/towns.csv' );
job_id | status | fraction_completed | rows | index_entries | bytes ---------------------+-----------+--------------------+-------+---------------+---------- 893753162225680385 | succeeded | 1 | 36684 | 36684 | 2485007
Recall that
IMPORT INTO
invalidates all foreign key constraints on the target table. View the constraints that are defined ondepartments
andtowns
:SHOW CONSTRAINTS FROM departments;
table_name | constraint_name | constraint_type | details | validated --------------+-------------------------+-----------------+---------------------------------------------------------+------------ departments | departments_capital_key | UNIQUE | UNIQUE (capital ASC) | t departments | departments_code_key | UNIQUE | UNIQUE (code ASC) | t departments | departments_name_key | UNIQUE | UNIQUE (name ASC) | t departments | departments_pkey | PRIMARY KEY | PRIMARY KEY (id ASC) | t departments | departments_region_fkey | FOREIGN KEY | FOREIGN KEY (region) REFERENCES regions(code) NOT VALID | f
SHOW CONSTRAINTS FROM towns;
table_name | constraint_name | constraint_type | details | validated -------------+---------------------------+-----------------+-----------------------------------------------------------------+------------ towns | towns_code_department_key | UNIQUE | UNIQUE (code ASC, department ASC) | t towns | towns_department_fkey | FOREIGN KEY | FOREIGN KEY (department) REFERENCES departments(code) NOT VALID | f towns | towns_pkey | PRIMARY KEY | PRIMARY KEY (id ASC) | t
To validate the foreign keys, issue an
ALTER TABLE ... VALIDATE CONSTRAINT
statement for each table:ALTER TABLE departments VALIDATE CONSTRAINT departments_region_fkey;
ALTER TABLE towns VALIDATE CONSTRAINT towns_department_fkey;
Step 3. Validate the migrated data
Use MOLT Verify to check that the data on PostgreSQL and CockroachDB are consistent.
In the directory where you installed MOLT Verify, use the following command to compare the two databases, specifying the PostgreSQL connection string with
--source
and the CockroachDB connection string with--target
:Tip:To find the CockroachDB connection string, open the Connect dialog in the Cloud Console and select thefrenchtowns
database and the General connection string option../molt verify --source 'postgresql://{username}:{password}@{host}:{port}/frenchtowns' --target 'postgresql://{user}:{password}@{host}:{port}/frenchtowns?sslmode=verify-full'
You will see the initial output:
<nil> INF verification in progress
The following output indicates that MOLT Verify has completed verification:
<nil> INF finished row verification on public.regions (shard 1/1): truth rows seen: 26, success: 26, missing: 0, mismatch: 0, extraneous: 0, live_retry: 0 <nil> INF finished row verification on public.departments (shard 1/1): truth rows seen: 100, success: 100, missing: 0, mismatch: 0, extraneous: 0, live_retry: 0 <nil> INF progress on public.towns (shard 1/1): truth rows seen: 10000, success: 10000, missing: 0, mismatch: 0, extraneous: 0, live_retry: 0 <nil> INF progress on public.towns (shard 1/1): truth rows seen: 20000, success: 20000, missing: 0, mismatch: 0, extraneous: 0, live_retry: 0 <nil> INF progress on public.towns (shard 1/1): truth rows seen: 30000, success: 30000, missing: 0, mismatch: 0, extraneous: 0, live_retry: 0 <nil> INF finished row verification on public.towns (shard 1/1): truth rows seen: 36684, success: 36684, missing: 0, mismatch: 0, extraneous: 0, live_retry: 0 <nil> INF verification complete
With the schema migrated and the initial data load verified, the next steps in a real-world migration are to ensure that you have made any necessary application changes, validate application queries, and perform a dry run before conducting the full migration.
To learn more, see the Migration Overview.