In a multi-region deployment, the Regional Table Locality Pattern is a good choice for tables with the following requirements:
- Read and write latency must be low.
- Rows in the table, and all latency-sensitive queries, can be tied to specific regions.
Tables with the Regional Table Locality Pattern can survive zone or region failures, depending on the database-level survival goal setting.
For new clusters using the multi-region SQL abstractions, we recommend lowering the --max-offset
setting to 250ms
. This is especially helpful for lowering the write latency of global tables. Note that this will require restarting all of the nodes in your cluster at the same time; it cannot be done with a rolling restart.
Regional tables (and the other multi-region capabilities) require an Enterprise license.
Prerequisites
Fundamentals
Multi-region patterns require thinking about the following questions:
- What are my survival goals? Do I need to survive a zone failure? A region failure?
- What are the table localities that will provide the performance characteristics I need for each table's data?
- Do I need low-latency reads and writes from a single region? Do I need that single region to be configurable at the row level? Or will a single optimized region for the entire table suffice?
- Do I have a "read-mostly" table of reference data that is rarely updated, but that must be read with low latency from all regions?
For more information about our multi-region capabilities, review the following pages:
- Multi-region overview
- Choosing a multi-region configuration
- When to use
ZONE
vs.REGION
Survival Goals - When to use
REGIONAL
vs.GLOBAL
Tables
In addition, reviewing the following information will be helpful:
- The concept of locality, which CockroachDB uses to intelligently place and balance data based on how you define the following settings:
- The recommendations in our Production Checklist.
- This page doesn't account for hardware specifications, so be sure to follow our hardware recommendations and perform a POC to size hardware for your use case.
- Finally, adopt these SQL Best Practices to get good performance.
Cluster setup
Each multi-region pattern assumes the following setup:
Hardware
- 3 regions
- Per region, 3+ AZs with 3+ VMs evenly distributed across them
- Region-specific app instances and load balancers
- Each load balancer redirects to CockroachDB nodes in its region.
- When CockroachDB nodes are unavailable in a region, the load balancer redirects to nodes in other regions.
Cluster
Each node is started with the --locality
flag specifying its region and AZ combination. For example, the following command starts a node in the west1
AZ of the us-west
region:
$ cockroach start \
--locality=region=us-west,zone=west1 \
--certs-dir=certs \
--advertise-addr=<node1 internal address> \
--join=<node1 internal address>:26257,<node2 internal address>:26257,<node3 internal address>:26257 \
--cache=.25 \
--max-sql-memory=.25 \
--background
Configuration
Summary
To use this pattern, you tell CockroachDB to set the table locality to either REGIONAL BY TABLE
or REGIONAL BY ROW
.
Regional tables
Regional tables work well when your application requires low-latency reads and writes for an entire table from a single region.
For regional tables, access to the table will be fast in the table's "home region" and slower in other regions. In other words, CockroachDB optimizes access to data in regional tables from a single region. By default, a regional table's home region is the database's primary region, but that can be changed to use any region in the database.
For instructions showing how to set a table's locality to REGIONAL BY TABLE
, see ALTER TABLE ... SET LOCALITY
By default, all tables in a multi-region database are regional tables that use the database's primary region. Unless you know your application needs different performance characteristics than regional tables provide, there is no need to change this setting.
Regional by row tables
In regional by row tables, individual rows are optimized for access from different regions. This setting automatically divides a table and all of its indexes into partitions, with each partition optimized for access from a different region. Like regional tables, regional by row tables are optimized for access from a single region. However, that region is specified at the row level instead of applying to the whole table.
Use regional by row tables when your application requires low-latency reads and writes at a row level where individual rows are primarily accessed from a single region. For example, a users table in a global application may need to keep some users' data in specific regions for better performance.
For an example of a table that can benefit from the regional by row setting in a multi-region deployment, see the users
table from the MovR application.
For instructions showing how to set a table's locality to REGIONAL BY ROW
, see ALTER TABLE ... SET LOCALITY
Steps
By default, all tables in a multi-region database are Regional tables. Therefore, the steps below show how to set up Regional by row tables.
First, create a database and set it as the default database:
CREATE DATABASE test;
USE test;
This cluster is already deployed across three regions. Therefore, to make this database a "multi-region database", you need to issue the following SQL statement that sets the primary region:
ALTER DATABASE test PRIMARY REGION "us-east";
Every multi-region database must have a primary region. For more information, see Database regions.
Next, issue the following ADD REGION
statements to add the remaining regions to the database.
ALTER DATABASE test ADD REGION "us-west";
ALTER DATABASE test ADD REGION "us-central";
Congratulations, test
is now a multi-region database!
Next, create a users
table:
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
city STRING NOT NULL,
first_name STRING NOT NULL,
last_name STRING NOT NULL,
address STRING NOT NULL
);
By default, all tables in a multi-region cluster default to the REGIONAL BY TABLE
locality setting. To verify this, issue a SHOW CREATE
on the users
table you just created:
SHOW CREATE TABLE users;
table_name | create_statement
-------------+------------------------------------------------------------------
users | CREATE TABLE public.users (
| id UUID NOT NULL DEFAULT gen_random_uuid(),
| city STRING NOT NULL,
| first_name STRING NOT NULL,
| last_name STRING NOT NULL,
| address STRING NOT NULL,
| CONSTRAINT "primary" PRIMARY KEY (id ASC),
| FAMILY "primary" (id, city, first_name, last_name, address)
| ) LOCALITY REGIONAL BY TABLE IN PRIMARY REGION
Next, set the table's locality to REGIONAL BY ROW
using the ALTER TABLE ... SET LOCALITY
statement:
ALTER TABLE users SET LOCALITY REGIONAL BY ROW;
NOTICE: LOCALITY changes will be finalized asynchronously; further schema changes on this table may be restricted until the job completes
ALTER TABLE SET LOCALITY
Now that the table is regional by row, we need to tell CockroachDB which rows need to be optimized for access from which regions. We do this by issuing UPDATE
statements that modify the automatically created crdb_region
column.
Issue the statements below to associate each row with a home region that depends on its city
column:
UPDATE users SET crdb_region = 'us-central' WHERE city IN ('chicago', 'milwaukee', 'dallas');
UPDATE users SET crdb_region = 'us-east' WHERE city IN ('washington dc', 'boston', 'new york');
UPDATE users SET crdb_region = 'us-west' WHERE city IN ('los angeles', 'san francisco', 'seattle');
By default, the region column will get auto-assigned on insert; this is also known as "auto-homing". For more information about how the crdb_region
column works, see ALTER TABLE ... SET LOCALITY REGIONAL BY ROW
.
Note that there is a performance benefit for queries that select a single row (e.g., SELECT * FROM users WHERE email = 'anemailaddress@gmail.com'
). If 'anemailaddress@gmail.com'
is found in the local region, there is no need to search remote regions. This feature, whereby the SQL engine will avoid sending requests to nodes in other regions when it can read a value from a unique column that is stored locally, is known as locality optimized search.
A good way to check that your table locality settings are having the expected effect is by monitoring how the performance metrics of a workload change as the settings are applied to a running cluster. For a tutorial showing how table localities can improve performance metrics across a multi-region cluster, see Low Latency Reads and Writes in a Multi-Region Cluster.
Characteristics
Latency
For REGIONAL BY TABLE
tables, you get low latency for single-region writes and reads, as well as multi-region stale reads.
For REGIONAL BY ROW
tables, you get low-latency consistent multi-region reads & writes for rows which are homed in specific regions, and low-latency multi-region stale reads from all other regions.
Resiliency
Because the test
database does not specify a survival goal, it uses the default ZONE
survival goal. With the default settings, an entire availability zone (AZ) can fail without interrupting access to the database.
For more information about how to choose a database survival goal, see When to use ZONE
vs. REGION
survival goals.
Alternatives
- If rows in the table cannot be tied to specific geographies, reads must be up-to-date for business reasons or because the table is referenced by foreign keys, and the table is rarely modified, consider the
GLOBAL
Table Locality Pattern. - If your application can tolerate historical reads in some cases, consider the Follower Reads pattern.
Tutorial
For a step-by-step demonstration showing how CockroachDB's multi-region capabilities (including REGIONAL BY ROW
tables) give you low-latency reads in a distributed cluster, see the tutorial on Low Latency Reads and Writes in a Multi-Region Cluster.
See also
- Multi-Region Capabilities Overview
- Choosing a multi-region configuration
- When to use
ZONE
vs.REGION
survival goals - When to use
REGIONAL
vs.GLOBAL
tables - Multi-region SQL performance
- Migrate to Multi-region SQL
ALTER DATABASE ... SURVIVE {ZONE,REGION} FAILURE
ALTER TABLE ... SET LOCALITY ...
- Topology Patterns Overview
- Single-region
- Multi-region