The CREATE INDEX
statement creates an index for a table. Indexes improve your database's performance by helping SQL locate data without having to look through every row of a table.
Indexes are automatically created for a table's PRIMARY KEY
and UNIQUE
columns. When querying a table, CockroachDB uses the fastest index. For more information about that process, see Index Selection in CockroachDB.
The following types cannot be included in an index key, but can be stored (and used in a covered query) using the STORING
or COVERING
clause:
To create an index on the schemaless data in a JSONB
column or on the data in an ARRAY
, use a GIN index.
The CREATE INDEX
statement performs a schema change. For more information about how online schema changes work in CockroachDB, see Online Schema Changes.
Required privileges
The user must have the CREATE
privilege on the table.
Synopsis
Standard index
GIN index
Parameters
Parameter | Description |
---|---|
UNIQUE |
Apply the UNIQUE constraint to the indexed columns.This causes the system to check for existing duplicate values on index creation. It also applies the UNIQUE constraint at the table level, so the system checks for duplicate values when inserting or updating data. |
INVERTED |
Create a GIN index on the schemaless data in the specified JSONB column.You can also use the PostgreSQL-compatible syntax USING GIN . For more details, see GIN Indexes. |
IF NOT EXISTS |
Create a new index only if an index of the same name does not already exist; if one does exist, do not return an error. |
opt_index_name index_name |
The name of the index to create, which must be unique to its table and follow these identifier rules. If you do not specify a name, CockroachDB uses the format <table>_<columns>_key/idx . key indicates the index applies the UNIQUE constraint; idx indicates it does not. Example: accounts_balance_idx |
table_name |
The name of the table you want to create the index on. |
USING name |
An optional clause for compatibility with third-party tools. Accepted values for name are btree , gin , and gist , with btree for a standard secondary index, gin as the PostgreSQL-compatible syntax for a GIN index, and gist for a spatial index. |
name |
The name of the column you want to index. For multi-region tables, you can use the crdb_region column within the index in the event the original index may contain non-unique entries across multiple, unique regions. |
ASC or DESC |
Sort the column in ascending (ASC ) or descending (DESC ) order in the index. How columns are sorted affects query results, particularly when using LIMIT .Default: ASC |
STORING ... |
Store (but do not sort) each column whose name you include. For information on when to use STORING , see Store Columns. Note that columns that are part of a table's PRIMARY KEY cannot be specified as STORING columns in secondary indexes on the table.COVERING and INCLUDE are aliases for STORING and work identically. |
opt_partition_by |
An Enterprise-only option that lets you define index partitions at the row level. As of CockroachDB v21.1 and later, most users should use REGIONAL BY ROW tables. Indexes against regional by row tables are automatically partitioned, so explicit index partitioning is not required. |
opt_where_clause |
An optional WHERE clause that defines the predicate boolean expression of a partial index. |
USING HASH |
Creates a hash-sharded index. |
WITH storage_parameter |
A comma-separated list of spatial index tuning parameters. Supported parameters include fillfactor , s2_max_level , s2_level_mod , s2_max_cells , geometry_min_x , geometry_max_x , geometry_min_y , and geometry_max_y . The fillfactor parameter is a no-op, allowed for PostgreSQL-compatibility.For details, see Spatial index tuning parameters. For an example, see Create a spatial index that uses all of the tuning parameters. |
CONCURRENTLY |
Optional, no-op syntax for PostgreSQL compatibility. All indexes are created concurrently in CockroachDB. |
Viewing schema changes
This schema change statement is registered as a job. You can view long-running jobs with SHOW JOBS
.
Examples
Setup
The following examples use MovR, a fictional vehicle-sharing application, to demonstrate CockroachDB SQL statements. For more information about the MovR example application and dataset, see MovR: A Global Vehicle-sharing App.
To follow along, run cockroach demo
to start a temporary, in-memory cluster with the movr
dataset preloaded:
$ cockroach demo
Create standard indexes
To create the most efficient indexes, we recommend reviewing:
Single-column indexes
Single-column indexes sort the values of a single column.
> CREATE INDEX ON users (name);
Because each query can only use one index, single-column indexes are not typically as useful as multiple-column indexes.
Multiple-column indexes
Multiple-column indexes sort columns in the order you list them.
> CREATE INDEX ON users (name, city);
To create the most useful multiple-column indexes, we recommend reviewing our best practices.
Unique indexes
Unique indexes do not allow duplicate values among their columns.
> CREATE UNIQUE INDEX ON users (name, id);
This also applies the UNIQUE
constraint at the table level, similar to ALTER TABLE
. The preceding example is equivalent to:
> ALTER TABLE users ADD CONSTRAINT users_name_id_key UNIQUE (name, id);
Primary key columns that are not specified within a unique index are automatically marked as STORING
in the information_schema.statistics
table and in SHOW INDEX
.
Create GIN indexes
You can create GIN indexes on schemaless data in a JSONB
column.
> CREATE INVERTED INDEX ON promo_codes (rules);
The preceding example is equivalent to the following PostgreSQL-compatible syntax:
> CREATE INDEX ON promo_codes USING GIN (rules);
Create spatial indexes
You can create spatial indexes on GEOMETRY
and GEOGRAPHY
columns. Spatial indexes are a special type of GIN index.
To create a spatial index on a GEOMETRY
column:
CREATE INDEX geom_idx_1 ON some_spatial_table USING GIST(geom);
Unlike GIN indexes, spatial indexes do not support an alternate CREATE INVERTED INDEX ...
syntax. Only the syntax shown here is supported.
For advanced users, there are a number of spatial index tuning parameters that can be passed in using the syntax WITH (var1=val1, var2=val2)
as follows:
CREATE INDEX geom_idx_2
ON some_spatial_table USING GIST(geom)
WITH (s2_max_cells = 20, s2_max_level = 12, s2_level_mod = 3);
Most users should not change the default spatial index settings. There is a risk that you will get worse performance by changing the default settings. For more information , see Spatial indexes.
Store columns
Storing a column improves the performance of queries that retrieve (but do not filter) its values.
> CREATE INDEX ON users (city) STORING (name);
However, to use stored columns, queries must filter another column in the same index. For example, SQL can retrieve name
values from the above index only when a query's WHERE
clause filters city
.
An index that stores all the columns needed by a query is also known as a covering index for that query. When a query has a covering index, CockroachDB can use that index directly instead of doing an "index join" with the primary index, which is likely to be slower.
Change column sort order
To sort columns in descending order, you must explicitly set the option when creating the index. (Ascending order is the default.)
> CREATE INDEX ON users (city DESC, name);
How a column is ordered in the index will affect the ordering of the index keys, and may affect the efficiency of queries that include an ORDER BY
clause.
Query specific indexes
Normally, CockroachDB selects the index that it calculates will scan the fewest rows. However, you can override that selection and specify the name of the index you want to use. To find the name, use SHOW INDEX
.
> SHOW INDEX FROM users;
table_name | index_name | non_unique | seq_in_index | column_name | direction | storing | implicit
+------------+---------------------+------------+--------------+-------------+-----------+---------+----------+
users | users_pkey | false | 1 | city | ASC | false | false
users | users_pkey | false | 2 | id | ASC | false | false
users | users_pkey | false | 3 | name | N/A | true | false
users | users_pkey | false | 4 | address | N/A | true | false
users | users_pkey | false | 5 | credit_card | N/A | true | false
users | users_city_name_idx | true | 1 | city | DESC | false | false
users | users_city_name_idx | true | 2 | name | ASC | false | false
users | users_city_name_idx | true | 3 | id | ASC | false | true
(8 rows)
> SELECT name FROM users@users_name_idx WHERE city='new york';
name
+------------------+
Catherine Nelson
Devin Jordan
James Hamilton
Judy White
Robert Murphy
(5 rows)
You can use the @primary
alias to use the table's primary key in your query if no secondary index explicitly named primary
exists on that table.
Create a hash-sharded secondary index
For performance reasons, we discourage indexing on sequential keys. If, however, you are working with a table that must be indexed on sequential keys, you should use hash-sharded indexes. Hash-sharded indexes distribute sequential traffic uniformly across ranges, eliminating single-range hot spots and improving write performance on sequentially-keyed indexes at a small cost to read performance.
Let's assume the events
table already exists:
> CREATE TABLE events (
product_id INT8,
owner UUID,
serial_number VARCHAR,
event_id UUID,
ts TIMESTAMP,
data JSONB,
PRIMARY KEY (product_id, owner, serial_number, ts, event_id)
);
You can create a hash-sharded index on an existing table:
> CREATE INDEX ON events(ts) USING HASH;
> SHOW INDEX FROM events;
table_name | index_name | non_unique | seq_in_index | column_name | direction | storing | implicit
-------------+---------------+------------+--------------+---------------------------+-----------+---------+-----------
events | events_pkey | false | 1 | product_id | ASC | false | false
events | events_pkey | false | 2 | owner | ASC | false | false
events | events_pkey | false | 3 | serial_number | ASC | false | false
events | events_pkey | false | 4 | ts | ASC | false | false
events | events_pkey | false | 5 | event_id | ASC | false | false
events | events_pkey | false | 6 | data | N/A | true | false
events | events_ts_idx | true | 1 | crdb_internal_ts_shard_16 | ASC | false | true
events | events_ts_idx | true | 2 | ts | ASC | false | false
events | events_ts_idx | true | 3 | product_id | ASC | false | true
events | events_ts_idx | true | 4 | owner | ASC | false | true
events | events_ts_idx | true | 5 | serial_number | ASC | false | true
events | events_ts_idx | true | 6 | event_id | ASC | false | true
(12 rows)
> SHOW COLUMNS FROM events;
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
----------------------------+-----------+-------------+----------------+---------------------------------------------------+-----------------------------+------------
product_id | INT8 | false | NULL | | {events_pkey,events_ts_idx} | false
owner | UUID | false | NULL | | {events_pkey,events_ts_idx} | false
serial_number | VARCHAR | false | NULL | | {events_pkey,events_ts_idx} | false
event_id | UUID | false | NULL | | {events_pkey,events_ts_idx} | false
ts | TIMESTAMP | false | NULL | | {events_pkey,events_ts_idx} | false
data | JSONB | true | NULL | | {events_pkey} | false
crdb_internal_ts_shard_16 | INT8 | false | NULL | mod(fnv32(crdb_internal.datums_to_bytes(ts)), 16) | {events_ts_idx} | true
(7 rows)