Troubleshoot SQL Behavior

On this page

Warning:

As of April 30, 2020, CockroachDB v2.1 is no longer supported. For more details, refer to the Release Support Policy.

If a SQL statement returns an unexpected result or takes longer than expected to process, this page will help you troubleshoot the issue.

`SELECT` statement performance issues

The common reasons for a sub-optimal SELECT performance are inefficient scans, full scans, and incorrect use of indexes. To improve the performance of SELECT statements, refer to the following documents:

Query is always slow

To detect whether your cluster has slow queries, check the service latency graph and the CPU graph. If the graphs show latency spikes or CPU usage spikes, it might indicate slow queries in your cluster.

Once you determine that you do have slow queries in your cluster, use the Statements page to identify the high latency SQL statements. To view the Statements page, access the Admin UI and then click Statements on the left.

You can then use the Statements Details page to drill down to individual statements. You can also use EXPLAIN ANALYZE statement, which executes a SQL query and returns a physical query plan with execution statistics. Query plans provide information around SQL execution, which can be used to troubleshoot slow queries by figuring out where time is being spent, how long a processor (i.e., a component that takes streams of input rows and processes them according to a specification) is not doing work, etc.

If you need help interpreting the output of the EXPLAIN ANALYZE statement, contact us.

Query is sometimes slow

If the query performance is irregular:

Run SHOW TRACE for the query twice: once when the query is performing as expected and once when the query is slow.
Contact us to analyze the outputs of the SHOW TRACE command.

Cancelling running queries

See Cancel query

Low throughput

Throughput is affected by the disk I/O, CPU usage, and network latency. Use the Admin UI to check the following metrics:

Disk I/O: Disk IOPS in progress
CPU usage: CPU percent
Network latency: Service latency SQL 99th percentile

Single hot node

A hot node is one that has much higher resource usage than other nodes. To determine if you have a hot node in your cluster, access the Admin UI, click Metrics on the left, and navigate to the following graphs. Hover over each of the following graphs to see the per-node values of the metrics. If one of the nodes has a higher value, you have a hot node in your cluster.

Replication dashboard > Average queries per store graph.
Overview Dashboard > Service Latency graph
Hardware Dashboard > CPU percent graph
SQL Dashboard > SQL Connections graph
Hardware Dashboard > Disk IOPS in Progress graph

Solution:

If you have a small table that fits into one range, then only one of the nodes will be used. This is expected behavior. However, you can split your range to distribute the table across multiple nodes.
If the SQL Connections graph shows that one node has a higher number of SQL connections and other nodes have zero connections, check if your app is set to talk to only one node.
Check load balancer settings.
Check for transaction contention.
If you have a monotonically increasing index column or Primary Key, then your index or Primary Key should be redesigned. See Unique ID best practices for more information.

INSERT/UPDATE statements are slow

Use the Statements page to identify the slow SQL statements. To view the Statements page, access the Admin UI and then click Statements on the left.

Refer to the following documents to improve INSERT / UPDATE performance:

Per-node queries per second (QPS) is high

If a cluster is not idle, it is useful to monitor the per-node queries per second. Cockroach will automatically distribute load throughout the cluster. If one or more nodes is not performing any queries there is likely something to investigate. See exec_success and exec_errors which track operations at the KV layer and sql_{select,insert,update,delete}_count which track operations at the SQL layer.

Increasing number of nodes does not improve performance

See Why would increasing the number of nodes not result in more operations per second?

`bad connection` & `closed` responses

If you receive a response of bad connection or closed, this normally indicates that the node you connected to died. You can check this by connecting to another node in the cluster and running cockroach node status.

Once you find the downed node, you can check its logs (stored in cockroach-data/logs by default).

Because this kind of behavior is entirely unexpected, you should file an issue.

SQL logging

There are several ways to log SQL queries. The type of logging you use will depend on your requirements.

For per-table audit logs, turn on SQL audit logs.
For system troubleshooting and performance optimization, turn on cluster-wide execution logs.
For local testing, turn on per-node execution logs.

SQL audit logs

Warning:

This is an experimental feature. The interface and output are subject to change.

SQL audit logging is useful if you want to log all queries that are run against specific tables.

For a tutorial, see SQL Audit Logging.
For SQL reference documentation, see ALTER TABLE ... EXPERIMENTAL_AUDIT.

Cluster-wide execution logs

For production clusters, the best way to log all queries is to turn on the cluster-wide setting sql.trace.log_statement_execute:

> SET CLUSTER SETTING sql.trace.log_statement_execute = true;

With this setting on, each node of the cluster writes all SQL queries it executes to a separate log file cockroach-sql-exec.log. When you no longer need to log queries, you can turn the setting back off:

> SET CLUSTER SETTING sql.trace.log_statement_execute = false;

Per-node execution logs

Alternatively, if you are testing CockroachDB locally and want to log queries executed just by a specific node, you can either pass a CLI flag at node startup, or execute a SQL function on a running node.

Using the CLI to start a new node, pass the --vmodule flag to the cockroach start command. For example, to start a single node locally and log all SQL queries it executes, you'd run:

$ cockroach start --insecure --listen-addr=localhost --vmodule=exec_log=2

From the SQL prompt on a running node, execute the crdb_internal.set_vmodule() function:

> SELECT crdb_internal.set_vmodule('exec_log=2');

This will result in the following output:

+---------------------------+
| crdb_internal.set_vmodule |
+---------------------------+
|                         0 |
+---------------------------+
(1 row)

Once the logging is enabled, all of the node's queries will be written to the CockroachDB log file as follows:

I180402 19:12:28.112957 394661 sql/exec_log.go:173  [n1,client=127.0.0.1:50155,user=root] exec "psql" {} "SELECT version()" {} 0.795 1 ""

Something else?

Try searching the rest of our docs for answers or using our other support resources, including:

Was this helpful?

Cockroach
University

Docs Hub

Troubleshoot SQL Behavior

`SELECT` statement performance issues

Query is always slow

Query is sometimes slow

Cancelling running queries

Low throughput

Single hot node

INSERT/UPDATE statements are slow

Per-node queries per second (QPS) is high

Increasing number of nodes does not improve performance

`bad connection` & `closed` responses

SQL logging

SQL audit logs

Cluster-wide execution logs

Per-node execution logs

Something else?

Cockroach University

Docs Hub

Cockroach University

Docs Hub

Troubleshoot SQL Behavior

SELECT statement performance issues

Query is always slow

Query is sometimes slow

Cancelling running queries

Low throughput

Single hot node

INSERT/UPDATE statements are slow

Per-node queries per second (QPS) is high

Increasing number of nodes does not improve performance

bad connection & closed responses

SQL logging

SQL audit logs

Cluster-wide execution logs

Per-node execution logs

Something else?

Cockroach
University

Cockroach
University

`SELECT` statement performance issues

`bad connection` & `closed` responses