The `IMPORT` statement imports tabular data (e.g., CSVs) into a single table.
Requirements
Before using `IMPORT`, you should have:
- The schema of the table you want to import.
- The tabular data you want to import (e.g., CSV), preferably hosted on cloud storage. This location must be accessible to all nodes using the same address; this means you cannot use a node's local file storage.
For ease of use, we recommend using cloud storage. However, if that isn't readily available to you, we also have a guide on easily creating your own file server.
Details
Import Targets
Imported tables must not exist and must be created in the `IMPORT` statement. If the table you want to import already exists, you must drop it with `DROP TABLE`.
You can only import a single table at a time.
You can specify the target database in the table name in the `IMPORT` statement. If it's not specified there, the active database in the SQL session is used.
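For example, a minimal sketch that drops a pre-existing table and imports into a specific database (the database name `mydb` is hypothetical; the file URL is reused from the examples below):

> DROP TABLE IF EXISTS mydb.customers;

> IMPORT TABLE mydb.customers (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT
)
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
;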
Create Table
Your `IMPORT` statement must include a `CREATE TABLE` statement (representing the schema of the data you want to import) using one of the following methods:
- A reference to a file that contains a `CREATE TABLE` statement
- An inline `CREATE TABLE` statement
We also recommend specifying all secondary indexes you want to use in the `CREATE TABLE` statement. It is possible to add secondary indexes later, but it is significantly faster to specify them during import.
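For the file-based method, the referenced file needs to contain only the `CREATE TABLE` statement itself. A minimal sketch of what a file such as `customer-create-table.sql` (used in the examples below) might contain:

CREATE TABLE customers (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT,
    INDEX name_idx (name)
);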
CSV Data
The tabular data you want to import must consist of valid CSV files, with the caveat that the comma delimiter can be set to another single character. In particular:
- Files must be UTF-8 encoded.
- If the delimiter (`,` by default), a double quote (`"`), newline (`\n`), or carriage return (`\r`) appears in a field, the field must be enclosed by double quotes.
- If double quotes are used to enclose fields, then a double quote appearing inside a field must be escaped by preceding it with another double quote. For example: `"aaa","b""bb","ccc"`
CockroachDB-specific requirements:
- If a column is of type `BYTES`, it can either be a valid UTF-8 string or a string literal beginning with the two characters `\`, `x`. For example, a field whose value should be the bytes `1`, `2` would be written as `\x0102` (see the sample rows below).
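For instance, assuming a hypothetical two-column table (`name TEXT, data BYTES`), the following rows are valid: the first field of the first row is quoted because it contains double quotes (escaped by doubling them), and the `BYTES` values are written as `\x`-prefixed hex literals (the second is hex for the ASCII string `hello`):

"Anna ""Ann"" Smith",\x0102
Bob,\x68656c6c6f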
Object Dependencies
When importing tables, you must be mindful of the following rules because `IMPORT` only creates single tables which must not already exist:
- Objects that the imported table depends on must already exist
- Objects that depend on the imported table can only be created after the import completes (see the sketch below)
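As a sketch of this ordering (the `orders` table and its columns are hypothetical): import the table first, then add the dependent object once the job has finished.

> IMPORT TABLE customers (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT
)
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
;

-- Only after the import completes:
> ALTER TABLE orders ADD CONSTRAINT fk_customer FOREIGN KEY (customer_id) REFERENCES customers (id);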
Available Storage Requirements
Each node in the cluster is assigned an equal part of the converted CSV data, and so must have enough temp space to store it. In addition, data is persisted as a normal table, and so there must also be enough space to hold the final, replicated data. The node's first-listed/default `store` directory must have enough available storage to hold its portion of the data.
On `cockroach start`, if you set `--max-disk-temp-storage`, it must also be greater than the portion of the data a node will store in temp space.
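For example, a node might be started with an explicit temp-storage cap; the values below are placeholders, not recommendations:

$ cockroach start --insecure --store=/mnt/ssd01/cockroach-data --max-disk-temp-storage=32GiB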
Import File Location
You can store the tabular data you want to import using remote cloud storage (Amazon S3, Google Cloud Platform, etc.). Alternatively, you can use an HTTP server accessible from all nodes.
For simplicity's sake, it's strongly recommended to use cloud/remote storage for the data you want to import. Local files are supported; however, they must be accessible identically from all nodes in the cluster.
Table Users and Privileges
Imported tables are treated as new tables, so you must `GRANT` privileges to them.
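For example, to give a hypothetical `maxroach` user read and write access to the newly imported table:

> GRANT SELECT, INSERT ON TABLE customers TO maxroach;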
Performance
All nodes are used during the conversion of tabular data into key-value data, which means all nodes' CPU and RAM will be partially consumed by the `IMPORT` task in addition to serving normal traffic.
Viewing and Controlling Import Jobs
After CockroachDB successfully initiates an import, it registers the import as a job, which you can view with `SHOW JOBS`.
After the import has been initiated, you can control it with `PAUSE JOB`, `RESUME JOB`, and `CANCEL JOB`.
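For example (the job ID below is a placeholder; use the ID reported by `SHOW JOBS`):

> SHOW JOBS;

> PAUSE JOB 27536791415282;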
Pausing and then resuming an `IMPORT` job will cause it to restart from the beginning.
Synopsis
Required Privileges
Only the `root` user can run `IMPORT`.
Parameters
Parameter | Description
---|---
`table_name` | The name of the table you want to import/create.
`create_table_file` | The URL of a plain text file containing the `CREATE TABLE` statement you want to use (see this example for syntax).
`table_elem_list` | The table definition you want to use (see this example for syntax).
`file_to_import` | The URL of the file you want to import.
`WITH kv_option` | Control your import's behavior with these options.
Import File URLs
URLs for the files you want to import must use the following format:
`[scheme]://[host]/[path]?[parameters]`
Location | scheme | host | parameters
---|---|---|---
Amazon S3 | `s3` | Bucket name | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`
Azure | `azure` | Container name | `AZURE_ACCOUNT_KEY`, `AZURE_ACCOUNT_NAME`
Google Cloud [1] | `gs` | Bucket name | `AUTH` (optional): can be `default` or `implicit`
HTTP [2] | `http` | Remote host | N/A
NFS/Local [3] | `nodelocal` | File system location | N/A
S3-compatible services [4] | `s3` | Bucket name | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`, `AWS_ENDPOINT`
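For example, an Amazon S3 import URL might look like the following; the bucket, path, and bracketed credential values are placeholders:

s3://acme-co/customer-import-data.csv?AWS_ACCESS_KEY_ID=[key ID]&AWS_SECRET_ACCESS_KEY=[secret key]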
Considerations
[1] If the `AUTH` parameter is `implicit`, all GCS connections use Google's default authentication strategy. If the `AUTH` parameter is `default`, the `cloudstorage.gs.default.key` cluster setting must be set to the contents of a service account file, which will be used during authentication. If the `AUTH` parameter is not specified, the `cloudstorage.gs.default.key` setting will be used if it is non-empty; otherwise, the `implicit` behavior is used.

[2] You can easily create your own HTTP server with Caddy or nginx. A custom root CA can be appended to the system's default CAs by setting the `cloudstorage.http.custom_ca` cluster setting, which will be used when verifying certificates from HTTPS URLs.

[3] The file system backup location on the NFS drive is relative to the path specified by the `--external-io-dir` flag set while starting the node. If the flag is set to `disabled`, then imports from local directories and NFS drives are disabled.

[4] A custom root CA can be appended to the system's default CAs by setting the `cloudstorage.http.custom_ca` cluster setting, which will be used when verifying certificates from an S3-compatible service.

The location parameters often contain special characters that need to be URI-encoded. Use JavaScript's `encodeURIComponent` function or Go's `url.QueryEscape` function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters.
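As a concrete illustration (the key below is made up), a secret key containing `/`, `+`, and `=` must have those characters encoded as `%2F`, `%2B`, and `%3D` before it is placed in the URL:

Raw secret key:   abc/def+ghi=
URI-encoded form: abc%2Fdef%2Bghi%3D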
Import Options
You can control the `IMPORT` process's behavior using any of the following key-value pairs as a `kv_option`.
delimiter
If you are not using a comma as your column delimiter, you can specify another single Unicode character as the delimiter.
Required? | No
---|---
Key | `delimiter`
Value | The Unicode character that delimits columns in your rows
Example | To use tab-delimited values: `WITH delimiter = e'\t'`
comment
Do not import rows that begin with this character.
Required? | No
---|---
Key | `comment`
Value | The Unicode character that identifies rows to skip
Example | `WITH comment = '#'`
nullif
Convert values to SQL NULL if they match the specified string.
Required? | No
---|---
Key | `nullif`
Value | The string that should be converted to `NULL`
Example | To use empty columns as `NULL`: `WITH nullif = ''`
Examples
Use Create Table Statement from a File
> IMPORT TABLE customers
CREATE USING 'azure://acme-co/customer-create-table.sql?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
;
Use Create Table Statement from a Statement
> IMPORT TABLE customers (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
;
Import a Tab-Separated File
> IMPORT TABLE customers (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.tsv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
delimiter = e'\t'
;
Skip Commented Lines
> IMPORT TABLE customers (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
comment = '#'
;
Use Blank Characters as NULL
> IMPORT TABLE customers (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
nullif = ''
;
Known Limitation
`IMPORT` can sometimes fail with a "context canceled" error, or can restart itself many times without ever finishing. If this is happening, it is likely due to a high amount of disk contention. This can be mitigated by setting the `kv.bulk_io_write.max_rate` cluster setting to a value below your max disk write speed. For example, to set it to 10MB/s, execute:
> SET CLUSTER SETTING kv.bulk_io_write.max_rate = '10MB';