This page provides important recommendations for production deployments of CockroachDB.
Cluster Topology
For a replicated cluster, each replica will be on a different node and a majority of replicas must remain available for the cluster to make progress. Therefore:
- Use at least 3 nodes to ensure that a majority of replicas (2/3) remains available if a node fails.
- Run each node on a separate machine. Since CockroachDB replicates across nodes, running more than one node per machine increases the risk of data loss if a machine fails. Likewise, if a machine has multiple disks or SSDs, run one node with multiple `--store` flags rather than one node per disk. For more details about stores, see Start a Node.
- Configurations with odd numbers of replicas are more robust than those with even numbers. Configurations with three replicas and configurations with four replicas can each tolerate one node failure and still reach a majority (2/3 and 3/4 respectively), so the fourth replica doesn't add any extra fault tolerance. To survive two simultaneous failures, you must have five replicas.
- When replicating across datacenters, it's recommended to specify which datacenter each node is in using the `--locality` flag to ensure even replication (see this example for more details). If some of your datacenters are much farther apart than others, specifying multiple levels of locality (such as country and region) is recommended (see the sketch after this list).
- When replicating across datacenters, be aware that the round-trip latency between datacenters will have a direct effect on your database's performance, with cross-continent clusters performing noticeably worse than clusters in which all nodes are geographically close together.
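For illustration, a rough sketch of specifying locality at node startup (the addresses and locality values are placeholders, and other required flags are omitted):
$ cockroach start --insecure --locality=datacenter=us-east-1 --join=<node1 address>,<node2 address>,<node3 address> <other start flags>
For widely separated datacenters, locality tiers can be listed from most to least inclusive, for example `--locality=country=us,region=us-east,datacenter=us-east-1`.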
For details about controlling the number and location of replicas, see Configure Replication Zones.
Hardware
Basic Recommendations
Nodes should have sufficient CPU, RAM, network, and storage capacity to handle your workload. It's important to test and tune your hardware setup before deploying to production.
At a bare minimum, each node should have 2 GB of RAM and one entire core. More data, complex workloads, higher concurrency, and faster performance require additional resources.
Warning: Avoid "burstable" or "shared-core" virtual machines that limit the load on a single core.
For best performance:
- Use SSDs over HDDs.
- Use larger/more powerful nodes. Adding more CPU is usually more beneficial than adding more RAM.
For best resilience:
- Use many smaller nodes instead of fewer larger ones. Recovery from a failed node is faster when data is spread across more nodes.
- Use zone configs to increase the replication factor from 3 (the default) to 5, as sketched below. This is especially recommended if you are using local disks rather than a cloud provider's network-attached disks, which are often replicated under the covers, because local disks have a greater risk of failure. You can do this for the entire cluster or for specific databases or tables.
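For example, assuming an insecure cluster and the v1.x `cockroach zone` CLI (the exact syntax may differ in other versions), the cluster-wide replication factor could be raised to 5 with something like:
$ echo 'num_replicas: 5' | cockroach zone set .default --insecure -f -
The same approach works for a specific database or table by replacing `.default` with the database or table name.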
Cloud-Specific Recommendations
Cockroach Labs recommends the following cloud-specific configurations based on our own internal testing. Before using configurations not recommended here, be sure to test them exhaustively.
AWS
- Use `m` (general purpose), `c` (compute-optimized), or `i` (storage-optimized) instances. For example, Cockroach Labs has used `m3.large` instances (2 vCPUs and 7.5 GiB of RAM per instance) for internal testing.
- Do not use "burstable" `t2` instances, which limit the load on a single core.
- Use Provisioned IOPS SSD-backed (io1) EBS volumes or SSD Instance Store volumes.
Azure
- Use storage-optimized Ls-series VMs. For example, Cockroach Labs has used `Standard_L4s` VMs (4 vCPUs and 32 GiB of RAM per VM) for internal testing.
- Use Premium Storage or local SSD storage with a Linux filesystem such as `ext4` (not the Windows `ntfs` filesystem). Note that the size of a Premium Storage disk affects its IOPS.
- If you choose local SSD storage, on reboot, the VM can come back with the `ntfs` filesystem. Be sure your automation monitors for this and reformats the disk to the Linux filesystem you chose initially.
- Do not use "burstable" B-series VMs, which limit the load on a single core. Also, Cockroach Labs has experienced data corruption issues on A-series VMs and irregular disk performance on D-series VMs, so we recommend avoiding those as well.
Digital Ocean
- Use any droplets except standard droplets with only 1 GB of RAM, which is below our minimum requirement. All Digital Ocean droplets use SSD storage.
GCE
- Use `n1-standard` or `n1-highcpu` predefined VMs, or custom VMs. For example, Cockroach Labs has used custom VMs (8 vCPUs and 16 GiB of RAM per VM) for internal testing.
- Do not use `f1` or `g1` shared-core machines, which limit the load on a single core.
- Use Local SSDs or SSD persistent disks. Note that the IOPS of SSD persistent disks depends both on the disk size and the number of CPUs on the machine.
Load Balancing
Each CockroachDB node is an equally suitable SQL gateway to a cluster, but to ensure client performance and reliability, it's important to use load balancing:
Performance: Load balancers spread client traffic across nodes. This prevents any one node from being overwhelmed by requests and improves overall cluster performance (queries per second).
Reliability: Load balancers decouple client health from the health of a single CockroachDB node. In cases where a node fails, the load balancer redirects client traffic to available nodes.
Tip: With a single load balancer, client connections are resilient to node failure, but the load balancer itself is a point of failure. It's therefore best to make load balancing resilient as well by using multiple load balancing instances, with a mechanism like floating IPs or DNS to select load balancers for clients.
For guidance on load balancing, see the tutorial for your deployment environment:
Environment | Featured Approach |
---|---|
On-Premises | Use HAProxy. |
AWS | Use Amazon's managed load balancing service. |
Azure | Use Azure's managed load balancing service. |
Digital Ocean | Use Digital Ocean's managed load balancing service. |
GCE | Use GCE's managed TCP proxy load balancing service. |
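For illustration, a minimal HAProxy configuration along these lines (node addresses are placeholders; `maxconn`, timeouts, and health checks would need tuning for a real deployment) spreads SQL client connections across three nodes:
global
    maxconn 4096

defaults
    mode tcp
    timeout connect 10s
    timeout client 1m
    timeout server 1m

listen psql
    bind :26257
    mode tcp
    balance roundrobin
    server cockroach1 <node1 address>:26257
    server cockroach2 <node2 address>:26257
    server cockroach3 <node3 address>:26257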
Monitoring and Alerting
Despite CockroachDB's various built-in safeguards against failure, it is critical to actively monitor the overall health and performance of a cluster running in production and to create alerting rules that promptly send notifications when there are events that require investigation or intervention.
For details about available monitoring options and the most important events and metrics to alert on, see Monitoring and Alerting.
Clock Synchronization
CockroachDB requires moderate levels of clock synchronization to preserve data consistency. For this reason, when a node detects that its clock is out of sync with at least half of the other nodes in the cluster by 80% of the maximum offset allowed (500ms by default), it spontaneously shuts down. While serializable consistency is maintained regardless of clock skew, skew outside the configured clock offset bounds can result in violations of single-key linearizability between causally dependent transactions. It's therefore important to prevent clocks from drifting too far by running NTP or other clock synchronization software on each node.
The one rare case to note is when a node's clock suddenly jumps beyond the maximum offset before the node detects it. Although extremely unlikely, this could occur, for example, when running CockroachDB inside a VM and the VM hypervisor decides to migrate the VM to different hardware with a different time. In this case, there can be a small window of time between when the node's clock becomes unsynchronized and when the node spontaneously shuts down. During this window, it would be possible for a client to read stale data and write data derived from stale reads.
For guidance on synchronizing clocks, see the tutorial for your deployment environment:
Environment | Featured Approach |
---|---|
On-Premises | Use NTP with Google's external NTP service. |
AWS | Use the Amazon Time Sync Service. |
Azure | Disable Hyper-V time synchronization and use NTP with Google's external NTP service. |
Digital Ocean | Use NTP with Google's external NTP service. |
GCE | Use NTP with Google's internal NTP service. |
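As an example of the NTP-based approaches above, the relevant `/etc/ntp.conf` entries for Google's external NTP service would look roughly like this (remove or comment out any existing `server` or `pool` lines so that all nodes use the same time source):
server time1.google.com iburst
server time2.google.com iburst
server time3.google.com iburst
server time4.google.com iburst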
Cache and SQL Memory Size
Changed in v1.1: By default, each node's cache size and temporary SQL memory size are each `128MiB`. These defaults were chosen to facilitate development and testing, where users are likely to run multiple CockroachDB nodes on a single computer. When running a production cluster with one node per host, however, it's recommended to increase these values:
- Increasing a node's cache size will improve the node's read performance.
- Increasing a node's SQL memory size will increase the number of simultaneous client connections it allows (the `128MiB` default allows a maximum of 6200 simultaneous connections), as well as the node's capacity for in-memory processing of rows when using `ORDER BY`, `GROUP BY`, `DISTINCT`, joins, and window functions.
To manually increase a node's cache size and SQL memory size, start the node using the `--cache` and `--max-sql-memory` flags:
$ cockroach start --cache=25% --max-sql-memory=25% <other start flags>
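Both flags also accept absolute sizes instead of percentages; for example (the exact unit syntax accepted may vary by version, so check `cockroach start --help`):
$ cockroach start --cache=2GiB --max-sql-memory=2GiB <other start flags>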
File Descriptors Limit
CockroachDB can use a large number of open file descriptors, often more than is available by default. Therefore, please note the following recommendations.
For each CockroachDB node:
- At a minimum, the file descriptors limit must be 1956 (1700 per store plus 256 for networking). If the limit is below this threshold, the node will not start.
- It is recommended to set the file descriptors limit to unlimited; otherwise, the recommended limit is at least 15000 (10000 per store plus 5000 for networking). This higher limit ensures performance and accommodates cluster growth.
- When the file descriptors limit is not high enough to allocate the recommended amounts, CockroachDB allocates 10000 per store and the rest for networking; if this would result in networking getting less than 256, CockroachDB instead allocates 256 for networking and evenly splits the rest across stores (see the worked example below).
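To illustrate this allocation logic with hypothetical numbers: on a node with 2 stores and a file descriptors limit of 15000, allocating the recommended 10000 per store would leave fewer than 256 descriptors for networking, so CockroachDB would instead reserve 256 for networking and split the remaining 14744 evenly across the stores (7372 each).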
Increase the File Descriptors Limit
Yosemite and later
To adjust the file descriptors limit for a single process in Mac OS X Yosemite and later, you must create a property list configuration file with the hard limit set to the recommendation mentioned above. Note that CockroachDB always uses the hard limit, so it's not technically necessary to adjust the soft limit, although we do so in the steps below.
For example, for a node with 3 stores, we would set the hard limit to at least 35000 (10000 per store and 5000 for networking) as follows:
Check the current limits:
$ launchctl limit maxfiles
maxfiles    10240          10240
The last two columns are the soft and hard limits, respectively. If `unlimited` is listed as the hard limit, note that the hidden default limit for a single process is actually 10240.
Create `/Library/LaunchDaemons/limit.maxfiles.plist` and add the following contents, with the final strings in the `ProgramArguments` array set to 35000:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>limit.maxfiles</string>
    <key>ProgramArguments</key>
    <array>
      <string>launchctl</string>
      <string>limit</string>
      <string>maxfiles</string>
      <string>35000</string>
      <string>35000</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>ServiceIPC</key>
    <false/>
  </dict>
</plist>
Make sure the plist file is owned by `root:wheel` and has permissions `-rw-r--r--`. These permissions should be in place by default.
Restart the system for the new limits to take effect.
Check the current limits:
$ launchctl limit maxfiles
maxfiles    35000          35000
Older versions
To adjust the file descriptors limit for a single process in OS X versions earlier than Yosemite, edit `/etc/launchd.conf` and increase the hard limit to the recommendation mentioned above. Note that CockroachDB always uses the hard limit, so it's not technically necessary to adjust the soft limit, although we do so in the steps below.
For example, for a node with 3 stores, we would set the hard limit to at least 35000 (10000 per store and 5000 for networking) as follows:
Check the current limits:
$ launchctl limit maxfiles
maxfiles    10240          10240
The last two columns are the soft and hard limits, respectively. If `unlimited` is listed as the hard limit, note that the hidden default limit for a single process is actually 10240.
Edit (or create) `/etc/launchd.conf` and add a line that looks like the following, with the last value set to the new hard limit:
limit maxfiles 35000 35000
Save the file, and restart the system for the new limits to take effect.
Verify the new limits:
$ launchctl limit maxfiles
maxfiles    35000          35000
Per-Process Limit
To adjust the file descriptors limit for a single process on Linux, enable PAM user limits and set the hard limit to the recommendation mentioned above. Note that CockroachDB always uses the hard limit, so it's not technically necessary to adjust the soft limit, although we do so in the steps below.
For example, for a node with 3 stores, we would set the hard limit to at least 35000 (10000 per store and 5000 for networking) as follows:
Make sure the following line is present in both `/etc/pam.d/common-session` and `/etc/pam.d/common-session-noninteractive`:
session    required   pam_limits.so
Edit `/etc/security/limits.conf` and append the following lines to the file:
*              soft     nofile   35000
*              hard     nofile   35000
Note that `*` can be replaced with the username that will be running the CockroachDB server.
Save and close the file.
Restart the system for the new limits to take effect.
Verify the new limits:
$ ulimit -a
Alternatively, if you're using Systemd:
Edit the service definition to configure the maximum number of open files:
[Service]
...
LimitNOFILE=35000
Reload Systemd for the new limit to take effect:
$ systemctl daemon-reload
System-Wide Limit
You should also confirm that the file descriptors limit for the entire Linux system is at least 10 times higher than the per-process limit documented above (e.g., at least 150000).
Check the system-wide limit:
$ cat /proc/sys/fs/file-max
If necessary, increase the system-wide limit in the `proc` file system:
$ echo 150000 > /proc/sys/fs/file-max
CockroachDB does not yet provide a Windows binary. Once that's available, we will also provide documentation on adjusting the file descriptors limit on Windows.
Attributions
This section, "File Descriptors Limit", is in part derivative of the chapter Open File Limits From the Riak LV 2.1.4 documentation, used under Creative Commons Attribution 3.0 Unported License.
Orchestration / Kubernetes
When running CockroachDB on Kubernetes, making the following minimal customizations will result in better, more reliable performance:
- Use SSDs instead of traditional HDDs.
- Configure CPU and memory resource requests and limits (see the sketch below).
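As a minimal sketch of the second customization (the container name, image tag, and resource values here are illustrative assumptions, not recommendations), CPU and memory requests and limits are set on the CockroachDB container in the StatefulSet's pod spec:
containers:
- name: cockroachdb                      # assumes the container is named "cockroachdb"
  image: cockroachdb/cockroach:v1.1.3    # illustrative version tag
  resources:
    requests:
      cpu: "4"
      memory: 8Gi
    limits:
      cpu: "4"
      memory: 8Gi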
For more information and additional customization suggestions, see our full detailed guide to CockroachDB Performance on Kubernetes.