This page walks you through TPC-C performance benchmarking on CockroachDB. It measures tpmC (new order transactions/minute) on two TPC-C datasets:
- 1,000 warehouses (for a total dataset size of 200GB) on 3 nodes
- 10,000 warehouses (for a total dataset size of 2TB) on 30 nodes
These two points on the spectrum show how CockroachDB scales from modest-sized production workloads to larger-scale deployments. This demonstrates how CockroachDB achieves high OLTP performance of over 128,000 tpmC on a TPC-C dataset over 2TB in size.
Benchmark a small cluster
Step 1. Create 3 Google Cloud Platform GCE instances
Create 3 instances for your CockroachDB nodes. While creating each instance:
Use the
n1-highcpu-16
machine type.For our TPC-C benchmarking, we use
n1-highcpu-16
machines. Currently, we believe this (or higher vCPU count machines) is the best configuration for CockroachDB under high traffic scenarios.Create and mount a local SSD using a SCSI interface.
We attach a single local SSD to each virtual machine. Local SSDs are low latency disks attached to each VM, which maximizes performance. We chose this configuration because it best resembles what a bare metal deployment would look like, with machines directly connected to one physical disk each. We do not recommend using network-attached block storage.
Optimize the local SSD for write performance (see the Disable write cache flushing section).
To apply the Admin UI firewall rule you created earlier, click Management, disk, networking, SSH keys, select the Networking tab, and then enter
cockroachdb
in the Network tags field.
Note the internal IP address of each
n1-highcpu-16
instance. You'll need these addresses when starting the CockroachDB nodes.Create a fourth instance for running the TPC-C benchmark.
This configuration is intended for performance benchmarking only. For production deployments, there are other important considerations, such as ensuring that data is balanced across at least three availability zones for resiliency. See the Production Checklist for more details.
Step 2. Start a 3-node cluster
SSH to the first
n1-highcpu-16
instance.Download the CockroachDB archive for Linux, extract the binary, and copy it into the
PATH
:$ curl https://binaries.cockroachdb.com/cockroach-v19.1.11.linux-amd64.tgz \ | tar -xz
$ cp -i cockroach-v19.1.11.linux-amd64/cockroach /usr/local/bin/
If you get a permissions error, prefix the command with
sudo
.Run the
cockroach start
command:$ cockroach start \ --insecure \ --advertise-addr=<node1 internal address> \ --join=<node1 internal address>,<node2 internal address>,<node3 internal address> \ --cache=.25 \ --max-sql-memory=.25 \ --background
Repeat steps 1 - 3 for the other two
n1-highcpu-16
instances. Be sure to adjust the--advertise-addr
flag each time.From the fourth
n1-highcpu-16
instance, run thecockroach init
command:$ cockroach init --insecure --host=<address of any node>
Each node then prints helpful details to the standard output, such as the CockroachDB version, the URL for the Web UI, and the SQL URL for clients.
Step 3. Load data for the benchmark
CockroachDB offers a pre-built workload
binary for Linux that includes several load generators for simulating client traffic against your cluster. This step features CockroachDB's version of the TPC-C workload.
SSH to the fourth instance (the one not running a CockroachDB node), download
workload
, and make it executable:$ wget https://edge-binaries.cockroachdb.com/cockroach/workload.LATEST ; chmod 755 workload.LATEST
Rename and copy
workload
into thePATH
:$ cp -i workload.LATEST /usr/local/bin/workload
Start the TPC-C workload, pointing it at the connection string of a node and including any connection parameters:
$ ./workload.LATEST fixtures load tpcc \ --warehouses=1000 \ "postgres://root@<node1 address>?sslmode=disable"
This command runs the TPC-C workload against the cluster. This will take about an hour and loads 1,000 "warehouses" of data.
Tip:For more
tpcc
options, useworkload run tpcc --help
. For details about other load generators included inworkload
, useworkload run --help
.To monitor the load generator's progress, follow along with the process on the Admin UI > Jobs table.
Open the Admin UI by pointing a browser to the address in the
admin
field in the standard output of any node on startup. Follow along with the process on the Admin UI > Jobs table.
Step 4. Run the benchmark
Still on the fourth instance, run workload
for five minutes against the other 3 instances:
$ ./workload.LATEST run tpcc \
--ramp=30s \
--warehouses=1000 \
--duration=300s \
--split \
--scatter \
"postgres://root@<node1 address>?sslmode=disable postgres://root@<node2 address>?sslmode=disable postgres://root@<node3 address>?sslmode=disable"
Step 5. Interpret the results
Once the workload
has finished running, you should see a final output line:
_elapsed_______tpmC____efc__avg(ms)__p50(ms)__p90(ms)__p95(ms)__p99(ms)_pMax(ms)
298.9s 13154.0 102.3% 75.1 71.3 113.2 130.0 184.5 436.2
You will also see some audit checks and latency statistics for each individual query. For this run, some of those checks might indicate that they were SKIPPED
due to insufficient data. For a more comprehensive test, run workload
for a longer duration (e.g., two hours). The tpmC
(new order transactions/minute) number is the headline number and efc
("efficiency") tells you how close CockroachDB gets to theoretical maximum tpmC
.
The TPC-C specification has p90 latency requirements in the order of seconds, but as you see here, CockroachDB far surpasses that requirement with p90 latencies in the hundreds of milliseconds.
Benchmark a large cluster
The methodology for reproducing CockroachDB's 30-node, 10,000 warehouse TPC-C result is similar to that for the 3-node, 1,000 warehouse example. The only difference (besides the larger node count and dataset) is that you will use CockroachDB's partitioning feature to ensure replicas for any given section of data are located on the same nodes that will be queried by the load generator for that section of data. Partitioning helps distribute the workload evenly across the cluster.
Before you start
Benchmarking a large cluster uses partitioning. You must have a valid enterprise license to use partitioning features. For details about requesting and setting a trial or full enterprise license, see Enterprise Licensing.
Step 1. Create 30 Google Cloud Platform GCE instances
Create 30 instances for your CockroachDB nodes. While creating each instance:
Use the
n1-highcpu-16
machine type.For our TPC-C benchmarking, we use
n1-highcpu-16
machines. Currently, we believe this (or higher vCPU count machines) is the best configuration for CockroachDB under high traffic scenarios.-
We attach a single local SSD to each virtual machine. Local SSDs are low latency disks attached to each VM, which maximizes performance. We chose this configuration because it best resembles what a bare metal deployment would look like, with machines directly connected to one physical disk each. We do not recommend using network-attached block storage.
Optimize the local SSD for write performance (see the Disable write cache flushing section).
To apply the Admin UI firewall rule you created earlier, click Management, disk, networking, SSH keys, select the Networking tab, and then enter
cockroachdb
in the Network tags field.
Note the internal IP address of each
n1-highcpu-16
instance. You'll need these addresses when starting the CockroachDB nodes.Create a 31st instance for running the TPC-C benchmark.
This configuration is intended for performance benchmarking only. For production deployments, there are other important considerations, such as ensuring that data is balanced across at least three availability zones for resiliency. See the Production Checklist for more details.
Step 2. Start a 30-node cluster
SSH to the first
n1-highcpu-16
instance.Download the CockroachDB archive for Linux, extract the binary, and copy it into the
PATH
:$ curl https://binaries.cockroachdb.com/cockroach-v19.1.11.linux-amd64.tgz \ | tar -xz
$ cp -i cockroach-v19.1.11.linux-amd64/cockroach /usr/local/bin/
If you get a permissions error, prefix the command with
sudo
.Run the
cockroach start
command:$ cockroach start \ --insecure \ --advertise-addr=<node1 internal address> \ --join=<node1 internal address>,<node2 internal address>,<node3 internal address>, [...] \ --cache=.25 \ --max-sql-memory=.25 \ --locality=rack=1 \ --background
Each node will start with a locality that includes an artificial "rack number" (e.g.,
--locality=rack=1
). Use 10 racks for 30 nodes so that every tenth node is part of the same rack (e.g.,--locality=rack=2
,--locality=rack=3
, ...).Repeat steps 1 - 3 for the other 29
n1-highcpu-16
instances.From the 31st
n1-highcpu-16
instance, run thecockroach init
command:$ cockroach init --insecure --host=<address of any node>
Each node then prints helpful details to the standard output, such as the CockroachDB version, the URL for the Web UI, and the SQL URL for clients.
Step 3. Add an enterprise license
For this benchmark, you will use partitioning, which is an enterprise feature. For details about requesting and setting a trial or full enterprise license, see Enterprise Licensing.
To add an enterprise license to your cluster once it is started, use the built-in SQL client as follows:
SSH to the 31st instance (the one not running a CockroachDB node) and launch the built-in SQL client:
$ cockroach sql --insecure --host=localhost
Add your enterprise license:
> SET CLUSTER SETTING enterprise.license = '<secret>';
Exit the interactive shell, using
\q
orctrl-d
.
Step 4. Load data for the benchmark
CockroachDB offers a pre-built workload
binary for Linux that includes several load generators for simulating client traffic against your cluster. This step features CockroachDB's version of the TPC-C workload.
Still on the 31st instance (the one not running a CockroachDB node), download
workload
, and make it executable:$ wget https://edge-binaries.cockroachdb.com/cockroach/workload.LATEST ; chmod 755 workload.LATEST
Rename and copy
workload
into thePATH
:$ cp -i workload.LATEST /usr/local/bin/workload
Start the TPC-C workload, pointing it at the connection string of a node and including any connection parameters:
$ ./workload.LATEST fixtures load tpcc \ --warehouses=10000 \ "postgres://root@<node1 address>?sslmode=disable"
This command runs the TPC-C workload against the cluster. This will take at about an hour and loads 10,000 "warehouses" of data.
Tip:For more
tpcc
options, useworkload run tpcc --help
. For details about other load generators included inworkload
, useworkload run --help
.To monitor the load generator's progress, follow along with the process on the Admin UI > Jobs table.
Open the Admin UI by pointing a browser to the address in the
admin
field in the standard output of any node on startup.
Step 5. Increase the snapshot rate
To increase the snapshot rate, which helps speed up this large-scale data movement:
Still on the 31st instance, launch the built-in SQL client:
$ cockroach sql --insecure --host=localhost
Set the cluster setting to increase the snapshot rate:
> SET CLUSTER SETTING kv.snapshot_rebalance.max_rate='64MiB';
Exit the interactive shell, using
\q
orctrl-d
.
Step 6. Partition the database
Next, partition your database to divide all of the TPC-C tables and indexes into ten partitions, one per rack, and then use zone configurations to pin those partitions to a particular rack.
Still on the 31st instance, start the partitioning:
$ ulimit -n 10000 && workload.LATEST run tpcc \ --partitions=10 \ --split \ --scatter \ --warehouses=10000 \ --duration=1s \ "postgres://root@<node31 address>?sslmode=disable"
This command runs the TPC-C workload against the cluster for 1 second, long enough to add the partitions.
Partitioning the data will take at least 12 hours. It takes this long because all of the data (over 2TB replicated for TPC-C-10K) needs to be moved to the right locations.
To watch the progress, follow along with the process on the Admin UI > Metrics > Queues > Replication Queue graph. Change the timeframe to Last 10 Min to view a more granular graph.
Open the Admin UI by pointing a browser to the address in the
admin
field in the standard output of any node on startup.Once the Replication Queue gets to
0
for all actions and stays there, the cluster should be finished rebalancing and is ready for testing.
Step 7. Run the benchmark
Still on the 31st instance, run workload
for five minutes against the other 30 instances:
$ ulimit -n 10000 && ./workload.LATEST run tpcc \
--warehouses=10000 \
--ramp=30s \
--duration=300s \
--split \
--scatter \
"postgres://root@<node1 address>?sslmode=disable postgres://root@<node2 address>?sslmode=disable postgres://root@<node3 address>?sslmode=disable [...space separated list]"
Step 8. Interpret the results
Once the workload
has finished running, you should see a final output line similar to the output in Benchmark a small cluster. The tpmC
should be about 10x higher, reflecting the increase in the number of warehouses:
_elapsed_______tpmC____efc__avg(ms)__p50(ms)__p90(ms)__p95(ms)__p99(ms)_pMax(ms)
291.6s 131109.8 102.0% 115.3 88.1 184.5 268.4 637.5 4295.0
You will also see some audit checks and latency statistics for each individual query. For this run, some of those checks might indicate that they were SKIPPED
due to insufficient data. For a more comprehensive test, run workload
for a longer duration (e.g., two hours). The tpmC
(new order transactions/minute) number is the headline number and efc
("efficiency") tells you how close CockroachDB gets to theoretical maximum tpmC
.
The TPC-C specification has p90 latency requirements in the order of seconds, but as you see here, CockroachDB far surpasses that requirement with p90 latencies in the hundreds of milliseconds.