CockroachDB supports importing data from CSV/TSV or SQL dump files.
Import from Tabular Data (CSV)
If you have data exported in a tabular format (e.g., CSV or TSV), you can use the IMPORT
statement.
To use this statement, though, you must also have some kind of remote file server (such as Amazon S3 or a custom file server) that all your nodes can access.
Import from Generic SQL Dump
You can execute batches of INSERT
statements stored in .sql
files (including those generated by cockroach dump
) from the command line, importing data into your cluster.
$ cockroach sql --database=[database name] < statements.sql
INSERT
statement to include approximately 500-10,000 rows will provide the best performance. The number of rows depends on row size, column families, number of indexes; smaller rows and less complex schemas can benefit from larger groups of INSERTS
, while larger rows and more complex schemas benefit from smaller groups.Import from PostgreSQL Dump
If you're importing data from a PostgreSQL deployment, you can import the .sql
file generated by the pg_dump
command to more quickly import data.
.sql
files generated by pg_dump
provide better performance because they use the COPY
statement instead of bulk INSERT
statements.Create PostgreSQL SQL File
Which pg_dump
command you want to use depends on whether you want to import your entire database or only specific tables:
Entire database:
$ pg_dump [database] > [filename].sql
Specific tables:
$ pg_dump -t [table] [table's schema] > [filename].sql
For more details, see PostgreSQL's documentation on pg_dump
.
Reformat SQL File
After generating the .sql
file, you need to perform a few editing steps before importing it:
- Remove all statements from the file besides the
CREATE TABLE
andCOPY
statements. - Manually add the table's
PRIMARY KEY
constraint to theCREATE TABLE
statement. This has to be done manually because PostgreSQL attempts to add the primary key after creating the table, but CockroachDB requires the primary key be defined upon table creation. - Review any other constraints to ensure they're properly listed on the table.
- Remove any unsupported elements.
Import Data
After reformatting the file, you can import it through psql
:
$ psql -p [port] -h [node host] -d [database] -U [user] < [file name].sql
For reference, CockroachDB uses these defaults:
[port]
: 26257[user]
: root