This page describes newly identified limitations in the CockroachDB v2.0.7 release as well as unresolved limitations identified in earlier releases.
New Limitations
Changes to the default replication zone are not applied to existing replication zones
Changes to the .default cluster-wide replication zone are not automatically applied to existing replication zones, including those for important internal data. For the cluster as a whole to remain available, the "system ranges" for this internal data must always retain a majority of their replicas. Therefore, if you increase the default replication factor, be sure to also increase the replication factor for important internal data.
Silent validation error with DECIMAL values
Under the following conditions, the value received by CockroachDB will be different than that sent by the client and may cause incorrect data to be inserted or read from the database, without a visible error message:
- A query uses placeholders (e.g., $1) to pass values to the server.
- A value of type DECIMAL is passed.
- The decimal value is encoded using the binary format.
Most client drivers and frameworks use the text format to pass placeholder values and are thus unaffected by this limitation. However, we know that the Ecto framework for Elixir is affected, and others may be as well. If in doubt, use SQL statement logging to check how CockroachDB receives decimal values from your client.
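For example, if your build exposes the sql.trace.log_statement_execute cluster setting (an assumption, not stated on this page), a minimal sketch of enabling statement logging looks like this:

-- Sketch only: log every executed statement to the node logs so you can
-- inspect the decimal values the server actually received.
> SET CLUSTER SETTING sql.trace.log_statement_execute = true;

-- Turn it off again once you have verified the values.
> SET CLUSTER SETTING sql.trace.log_statement_execute = false;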
Enterprise backup/restore during rolling upgrades
In the upgrade process, after upgrading all binaries to v2.0, it's recommended to monitor the cluster's stability and performance for at least one day and only then finalize the upgrade by increasing the version cluster setting. However, in the window during which binaries are running v2.0 but the cluster version has not yet been increased, it is not possible to run enterprise BACKUP and RESTORE jobs.
Write and update limits for a single statement
A single statement can perform at most 64MiB of combined updates. When a statement exceeds this limit, its transaction gets aborted. Currently, INSERT INTO ... SELECT FROM and CREATE TABLE AS SELECT queries may encounter this limit.
To increase this limit, you can update the cluster-wide setting kv.raft.command.max_size, but note that increasing this setting can affect the memory utilization of nodes in the cluster. For INSERT INTO ... SELECT FROM queries in particular, another workaround is to manually page through the data you want to insert using separate transactions.
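For example, a hedged sketch of raising the limit; the value shown is illustrative only, and should be chosen with your workload and available memory in mind:

-- Sketch only: raise the per-statement limit (illustrative value).
> SET CLUSTER SETTING kv.raft.command.max_size = '128MiB';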
In the v1.1 release, the limit referred to a whole transaction (i.e., the sum of changes done by all statements) and capped both the number and the size of updates. In this release, there is only a size limit, and it applies independently to each statement. Note that even though large transactions are no longer directly restricted, they can still have performance implications for the cluster.
Memory flags with non-integer values and a unit suffix
The --cache and --max-sql-memory flags of the cockroach start command do not support non-integer values with a unit suffix, for example, --cache=1.5GiB.
As a workaround, use integer values or a percentage, for example, --cache=1536MiB.
Import with a high amount of disk contention
IMPORT can sometimes fail with a "context canceled" error, or can restart itself many times without ever finishing. If this is happening, it is likely due to a high amount of disk contention. This can be mitigated by setting the kv.bulk_io_write.max_rate cluster setting to a value below your max disk write speed. For example, to set it to 10MB/s, execute:
> SET CLUSTER SETTING kv.bulk_io_write.max_rate = '10MB';
Check constraints with INSERT ... ON CONFLICT
CHECK constraints are not properly enforced on updated values resulting from INSERT ... ON CONFLICT statements. Consider the following example:
> CREATE TABLE ab (a INT PRIMARY KEY, b INT, CHECK (b < 1));
A simple INSERT statement that violates the CHECK constraint fails as it should:
> INSERT INTO ab (a,b) VALUES (1, 12312);
pq: failed to satisfy CHECK constraint (b < 1)
However, the same statement with INSERT ... ON CONFLICT incorrectly succeeds and results in a row that fails the constraint:
> INSERT INTO ab (a, b) VALUES (1,0); -- create some initial valid value
> INSERT INTO ab (a, b) VALUES (1,0) ON CONFLICT (a) DO UPDATE SET b = 123132;
> SELECT * FROM ab;
+---+--------+
| a | b |
+---+--------+
| 1 | 123132 |
+---+--------+
(1 row)
Referring to a CTE by name more than once
It is currently not possible to refer to a common table expression by name more than once.
For example, the following query is invalid because the CTE a is referred to twice:
> WITH a AS (VALUES (1), (2), (3))
SELECT * FROM a, a;
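As a possible workaround (a sketch, not from the official docs), you can define the same expression under two different CTE names and reference each once:

> WITH a  AS (VALUES (1), (2), (3)),
       a2 AS (VALUES (1), (2), (3))  -- duplicate definition under a second name
  SELECT * FROM a, a2;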
Using CTEs with data-modifying statements
Resolved as of v2.1.
If a common table expression containing a data-modifying statement is not referred to by the top-level query, either directly or indirectly, the data-modifying statement will not be executed at all.
For example, the following query does not insert any row, because the CTE a is not used:
> WITH a AS (INSERT INTO t(x) VALUES (1), (2), (3))
SELECT * FROM b;
Also, the following query does not insert any row, even though the CTE a is used, because the other CTE that uses a (b) is itself not used:
> WITH a AS (INSERT INTO t(x) VALUES (1), (2), (3)),
b AS (SELECT * FROM a)
SELECT * FROM c;
To determine whether a modification will effectively take place, use EXPLAIN and check whether the desired data modification is part of the final plan for the overall query.
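For example, a minimal sketch of this check against the hypothetical tables t and b used above; if no insert step appears in the resulting plan, the INSERT inside the CTE will not be executed:

> EXPLAIN WITH a AS (INSERT INTO t(x) VALUES (1), (2), (3))
  SELECT * FROM b;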
Using CTEs with views
Resolved as of v2.1.
It is not yet possible to use a common table expression inside the selection query used to define a view.
Using CTEs with VALUES clauses
Resolved as of v2.1.
It is not yet possible to use a common table expression defined outside of a VALUES clause in a subquery inside the VALUES clause, for example:
> WITH a AS (...) VALUES ((SELECT * FROM a));
Using CTEs with Set Operations
Resolved as of v2.1.
It is not yet possible to use a common table expression defined outside of a set expression in the right operand of a set operator, for example:
> WITH a AS (SELECT 1)
SELECT * FROM users UNION SELECT * FROM a; -- "a" used on the right, not yet supported.
For UNION, you can work around this limitation by swapping the operands. For the other set operators, you can inline the definition of the CTE inside the right operand.
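For example, a sketch of the UNION workaround, reusing the users table from the example above, with the CTE moved to the left operand:

> WITH a AS (SELECT 1)
  SELECT * FROM a UNION SELECT * FROM users; -- "a" used on the left, which is supported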
Assigning latitude/longitude for the Node Map
You cannot assign latitude/longitude coordinates to localities if the components of your localities have the same name. For example, consider the following partial configuration:
| Node  | Region  | Datacenter   |
|-------|---------|--------------|
| Node1 | us-east | datacenter-1 |
| Node2 | us-west | datacenter-1 |
In this case, if you try to set the latitude/longitude coordinates to the datacenter level of the localities, you will get the "primary key exists" error and the Node Map will not be displayed. You can, however, set the latitude/longitude coordinates to the region components of the localities, and the Node Map will be displayed.
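For example, a hedged sketch of assigning coordinates at the region level; it assumes the system.locations table schema ("localityKey", "localityValue", latitude, longitude), and the coordinates shown are illustrative:

-- Sketch only: assign coordinates to the region tier instead of the
-- duplicated datacenter tier.
> UPSERT INTO system.locations ("localityKey", "localityValue", latitude, longitude)
  VALUES ('region', 'us-east', 37.478397, -76.453077),
         ('region', 'us-west', 43.804133, -120.554201);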
Placeholders in PARTITION BY
When defining a table partition, either during table creation or table alteration, it is not possible to use placeholders in the PARTITION BY clause.
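For example, a sketch (table partitioning is an enterprise feature; the table and partition names here are hypothetical) showing that partition values must be written as literals rather than placeholders such as $1:

> CREATE TABLE customers (
    country STRING,
    id INT,
    PRIMARY KEY (country, id)
  ) PARTITION BY LIST (country) (
    PARTITION europe VALUES IN ('DE', 'FR'), -- literal values work
    PARTITION other  VALUES IN (DEFAULT)     -- a placeholder such as $1 would not
  );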
Adding a column with sequence-based DEFAULT values
It is currently not possible to add a column to a table when the column uses a sequence as the DEFAULT value, for example:
> CREATE TABLE t (x INT);
> INSERT INTO t(x) VALUES (1), (2), (3);
> CREATE SEQUENCE s;
> ALTER TABLE t ADD COLUMN y INT DEFAULT nextval('s');
ERROR: nextval(): unimplemented: cannot evaluate scalar expressions containing sequence operations in this context
SQLSTATE: 0A000
Unresolved Limitations
Database and table renames are not transactional
Database and table renames using RENAME DATABASE and RENAME TABLE are not transactional.
Specifically, when run inside a BEGIN ... COMMIT block, it's possible for a rename to be half-done: not persisted in storage, but visible to other nodes or other transactions. For more information, see Table renaming considerations. For an issue tracking this limitation, see cockroach#12123.
Available capacity metric in the Admin UI
If you are running multiple nodes on a single machine (not recommended in production) and didn't specify the maximum allocated storage capacity for each node using the --store flag, the capacity metrics in the Admin UI are incorrect. This is because when multiple nodes are running on a single machine, the machine's hard disk is treated as an available store for each node, while in reality, only one hard disk is available for all nodes. The total available capacity is then calculated as the hard disk size multiplied by the number of nodes on the machine.
Schema changes within transactions
Within a single transaction:
- DDL statements cannot be mixed with DML statements. As a workaround, you can split the statements into separate transactions.
- A CREATE TABLE statement containing FOREIGN KEY or INTERLEAVE clauses cannot be followed by statements that reference the new table.
- A table cannot be dropped and then recreated with the same name. This is not possible within a single transaction because DROP TABLE does not immediately drop the name of the table. As a workaround, split the DROP TABLE and CREATE TABLE statements into separate transactions (see the sketch after this list).
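For example, a minimal sketch of the drop-then-recreate workaround, using a hypothetical users table:

-- First transaction: drop the old table.
> BEGIN;
> DROP TABLE IF EXISTS users;
> COMMIT;

-- Second transaction: recreate it under the same name.
> BEGIN;
> CREATE TABLE users (id INT PRIMARY KEY, name STRING);
> COMMIT;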
Schema changes between executions of prepared statements
When the schema of a table targeted by a prepared statement changes after the prepared statement is created, future executions of the prepared statement could result in an error. For example, adding a column to a table referenced in a prepared statement with a SELECT * clause will result in an error:
CREATE TABLE users (id INT PRIMARY KEY);
PREPARE prep1 AS SELECT * FROM users;
ALTER TABLE users ADD COLUMN name STRING;
INSERT INTO users VALUES (1, 'Max Roach');
EXECUTE prep1;
ERROR: cached plan must not change result type
SQLSTATE: 0A000
It's therefore recommended to explicitly list result columns instead of using SELECT * in prepared statements, when possible.
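For example, a sketch of the recommended form for the users table above (prep2 is a hypothetical statement name); because the result columns are fixed, adding further columns to the table later does not change the statement's result type:

> PREPARE prep2 AS SELECT id, name FROM users; -- explicit columns instead of SELECT *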
INSERT ON CONFLICT vs. UPSERT
When inserting/updating all columns of a table, and the table has no secondary indexes, we recommend using an UPSERT statement instead of the equivalent INSERT ON CONFLICT statement. Whereas INSERT ON CONFLICT always performs a read to determine the necessary writes, the UPSERT statement writes without reading, making it faster.
This issue is particularly relevant when using a simple SQL table of two columns to simulate direct KV access. In this case, be sure to use the UPSERT statement.
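For example, a sketch with a hypothetical two-column kv table; the two write statements below produce the same result, but the UPSERT avoids the extra read:

> CREATE TABLE kv (k INT PRIMARY KEY, v STRING);

-- Blind write; no read is needed because kv has no secondary indexes.
> UPSERT INTO kv (k, v) VALUES (1, 'b');

-- Equivalent result, but performs a read first to determine the writes.
> INSERT INTO kv (k, v) VALUES (1, 'b')
  ON CONFLICT (k) DO UPDATE SET v = excluded.v;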
Using \| to perform a large input in the SQL shell
In the built-in SQL shell, using the \| operator to perform a large number of inputs from a file can cause the server to close the connection. This is because \| sends the entire file as a single query to the server, which can exceed the upper bound on the size of a packet the server can accept from any client (16MB).
As a workaround, execute the file from the command line with cat data.sql | cockroach sql instead of from within the interactive shell.
New values generated by DEFAULT expressions during ALTER TABLE ADD COLUMN
When executing an ALTER TABLE ADD COLUMN statement with a DEFAULT expression, new values generated:
- use the default search path regardless of the search path configured in the current session via SET SEARCH_PATH.
- use the UTC time zone regardless of the time zone configured in the current session via SET TIME ZONE.
- have no default database regardless of the default database configured in the current session via SET DATABASE, so you must specify the database of any tables they reference.
- use the transaction timestamp for the statement_timestamp() function regardless of the time at which the ALTER statement was issued.
Load-based lease rebalancing in uneven latency deployments
When nodes are started with the --locality flag, CockroachDB attempts to place the replica lease holder (the replica that client requests are forwarded to) on the node closest to the source of the request. This means as client requests move geographically, so too does the replica lease holder.
However, you might see increased latency caused by a consistently high rate of lease transfers between datacenters in the following case:
- Your cluster runs in datacenters which are very different distances away from each other.
- Each node was started with a single tier of --locality, e.g., --locality=datacenter=a.
- Most client requests get sent to a single datacenter because that's where all your application traffic is.
To detect if this is happening, open the Admin UI, select the Queues dashboard, hover over the Replication Queue graph, and check the Leases Transferred / second data point. If the value is consistently larger than 0, you should consider stopping and restarting each node with additional tiers of locality to improve request latency.
For example, let's say that latency is 10ms from nodes in datacenter A to nodes in datacenter B but is 100ms from nodes in datacenter A to nodes in datacenter C. To ensure A's and B's relative proximity is factored into lease holder rebalancing, you could restart the nodes in datacenters A and B with a common region, --locality=region=foo,datacenter=a and --locality=region=foo,datacenter=b, while restarting nodes in datacenter C with a different region, --locality=region=bar,datacenter=c.
Overload resolution for collated strings
Many string operations are not properly overloaded for collated strings, for example:
> SELECT 'string1' || 'string2';
+------------------------+
| 'string1' || 'string2' |
+------------------------+
| string1string2 |
+------------------------+
(1 row)
> SELECT ('string1' collate en) || ('string2' collate en);
pq: unsupported binary operator: <collatedstring{en}> || <collatedstring{en}>
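As a possible workaround (a sketch only; it assumes collated strings can be cast back to STRING in your version), drop the collation before concatenating and re-apply it afterwards if needed:

> SELECT (('string1' COLLATE en)::STRING || ('string2' COLLATE en)::STRING) COLLATE en;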
Max size of a single column family
When creating or updating a row, if the combined size of all values in a single column family exceeds the max range size (64MiB by default) for the table, the operation may fail, or cluster performance may suffer.
As a workaround, you can either manually split a table's columns into multiple column families, or you can create a table-specific zone configuration with an increased max range size.
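For example, a sketch of splitting columns into multiple column families at table creation time (the table and family names are hypothetical):

> CREATE TABLE blobs (
    id INT PRIMARY KEY,
    description STRING,
    payload BYTES,
    FAMILY meta (id, description), -- small, frequently updated columns
    FAMILY data (payload)          -- large value kept in its own family
  );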
Simultaneous client connections and running queries on a single node
When a node has both a high number of client connections and running queries, the node may crash due to memory exhaustion. This is due to CockroachDB not accurately limiting the number of clients and queries based on the amount of available RAM on the node.
To prevent memory exhaustion, monitor each node's memory usage and ensure there is some margin between maximum CockroachDB memory usage and available system RAM. For more details about memory usage in CockroachDB, see this blog post.
SQL subexpressions and memory usage
Many SQL subexpressions (e.g., ORDER BY, UNION/INTERSECT/EXCEPT, GROUP BY, subqueries) accumulate intermediate results in RAM on the node processing the query. If the operator attempts to process more rows than can fit into RAM, the node will either crash or report a memory capacity error. For more details about memory usage in CockroachDB, see this blog post.
Query planning for OR expressions
Given a query like SELECT * FROM foo WHERE a > 1 OR b > 2, even if there are appropriate indexes to satisfy both a > 1 and b > 2, the query planner performs a full table or index scan because it cannot use both conditions at once.
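As a possible workaround (a sketch only), rewrite the disjunction as a UNION of two separately indexable queries; UNION also removes duplicate rows, matching the semantics of the original OR filter:

> SELECT * FROM foo WHERE a > 1
  UNION
  SELECT * FROM foo WHERE b > 2;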
Privileges for DELETE and UPDATE
Every DELETE or UPDATE statement constructs a SELECT statement, even when no WHERE clause is involved. As a result, the user executing DELETE or UPDATE requires both the DELETE and SELECT or UPDATE and SELECT privileges on the table.
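For example, a sketch of granting the required privilege pairs to a hypothetical user maxroach on a hypothetical table db.t:

> GRANT DELETE, SELECT ON TABLE db.t TO maxroach;
> GRANT UPDATE, SELECT ON TABLE db.t TO maxroach;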
cockroach dump does not support cyclic foreign key references
The cockroach dump command will successfully create a dump file for a table with a foreign key reference to itself, or for a set of tables with a cyclic foreign key dependency (e.g., a depends on b depends on a). That dump file, however, can only be executed after manually editing the output to remove the foreign key definitions from the CREATE TABLE statements and adding them as ALTER TABLE ... ADD CONSTRAINT statements after the INSERT statements.
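For example, a sketch of the kind of statement you would move to the end of the edited dump file (table, column, and constraint names here are hypothetical):

-- Added after the INSERT statements, once both tables exist and are populated.
> ALTER TABLE a ADD CONSTRAINT fk_a_b FOREIGN KEY (b_id) REFERENCES b (id);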