CockroachDB's online schema changes provide a simple way to update a table schema without imposing any negative consequences on an application — including downtime. The schema change engine is a built-in feature requiring no additional tools, resources, or ad hoc sequencing of operations.
Benefits of online schema changes include:
- Changes to your table schema happen while the database is running.
- The schema change runs as a background job without holding locks on the underlying table data.
- Your application's queries can run normally, with no effect on read/write latency. The schema is cached for performance.
- Your data is kept in a safe, consistent state throughout the entire schema change process.
Support for schema changes within transactions is limited. We recommend doing schema changes outside transactions where possible. When a schema management tool uses transactions on your behalf, we recommend only doing one schema change operation per transaction.
How online schema changes work
At a high level, online schema changes are accomplished by using a bridging strategy involving concurrent uses of multiple versions of the schema. The process is as follows:
A user initiates a schema change by executing
ALTER TABLE
,CREATE INDEX
,TRUNCATE
, etc.The schema change engine converts the original schema to the new schema in discrete steps while ensuring that the underlying table data is always in a consistent state. These changes are executed as a background job.
This approach allows the schema change engine to roll out a new schema while the previous version is still in use. It then backfills or deletes the underlying table data as needed in the background, while the cluster is still running and servicing reads and writes from your application.
During the backfilling process, the schema change engine updates the underlying table data to make sure all instances of the table are stored according to the requirements of the new schema.
Once backfilling is complete, all nodes will switch over to the new schema, and will allow reads and writes of the table using the new schema.
For more technical details, see How online schema changes are possible in CockroachDB.
Examples
For more examples of schema change statements, see the ALTER TABLE
subcommands.
Run schema changes inside a transaction with CREATE TABLE
As noted in Limitations, you cannot run schema changes inside transactions in general.
However, as of version 2.1, you can run schema changes inside the same transaction as a CREATE TABLE
statement. For example:
> BEGIN;
SAVEPOINT cockroach_restart;
CREATE TABLE fruits (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name STRING,
color STRING
);
INSERT INTO fruits (name, color) VALUES ('apple', 'red');
ALTER TABLE fruits ADD COLUMN inventory_count INTEGER DEFAULT 5;
ALTER TABLE fruits ADD CONSTRAINT name CHECK (name IN ('apple', 'banana', 'orange'));
SELECT name, color, inventory_count FROM fruits;
RELEASE SAVEPOINT cockroach_restart;
COMMIT;
The transaction succeeds with the following output:
BEGIN
SAVEPOINT
CREATE TABLE
INSERT 0 1
ALTER TABLE
ALTER TABLE
+-------+-------+-----------------+
| name | color | inventory_count |
+-------+-------+-----------------+
| apple | red | 5 |
+-------+-------+-----------------+
(1 row)
COMMIT
COMMIT
Show all schema change jobs
You can check on the status of the schema change jobs on your system at any time using the SHOW JOBS
statement:
> SELECT * FROM [SHOW JOBS] WHERE job_type = 'SCHEMA CHANGE';
+--------------------+---------------+-----------------------------------------------------------------------------+-----------+-----------+----------------------------+----------------------------+----------------------------+----------------------------+--------------------+-------+----------------+
| job_id | job_type | description | user_name | status | created | started | finished | modified | fraction_completed | error | coordinator_id |
|--------------------+---------------+-----------------------------------------------------------------------------+-----------+-----------+----------------------------+----------------------------+----------------------------+----------------------------+--------------------+-------+----------------|
| 368863345707909121 | SCHEMA CHANGE | ALTER TABLE test.public.fruits ADD COLUMN inventory_count INTEGER DEFAULT 5 | root | succeeded | 2018-07-26 20:55:59.698793 | 2018-07-26 20:55:59.739032 | 2018-07-26 20:55:59.816007 | 2018-07-26 20:55:59.816008 | 1 | | NULL |
| 370556465994989569 | SCHEMA CHANGE | ALTER TABLE test.public.foo ADD COLUMN bar VARCHAR | root | pending | 2018-08-01 20:27:38.708813 | NULL | NULL | 2018-08-01 20:27:38.708813 | 0 | | NULL |
| 370556522386751489 | SCHEMA CHANGE | ALTER TABLE test.public.foo ADD COLUMN bar VARCHAR | root | pending | 2018-08-01 20:27:55.830832 | NULL | NULL | 2018-08-01 20:27:55.830832 | 0 | | NULL |
+--------------------+---------------+-----------------------------------------------------------------------------+-----------+-----------+----------------------------+----------------------------+----------------------------+----------------------------+--------------------+-------+----------------+
(1 row)
Limitations
Overview
Schema changes keep your data consistent at all times, but they do not run inside transactions in the general case. This is necessary so the cluster can remain online and continue to service application reads and writes.
Specifically, this behavior is necessary because making schema changes transactional would mean requiring a given schema change to propagate across all the nodes of a cluster. This would block all user-initiated transactions being run by your application, since the schema change would have to commit before any other transactions could make progress. This would prevent the cluster from servicing reads and writes during the schema change, requiring application downtime.
As of version 2.1, you can run schema changes inside the same transaction as a CREATE TABLE
statement. For more information, see this example.
No schema changes within transactions
Within a single transaction:
- DDL statements cannot be mixed with DML statements. As a workaround, you can split the statements into separate transactions. For more details, see examples of unsupported statements.
- A
CREATE TABLE
statement containingFOREIGN KEY
orINTERLEAVE
clauses cannot be followed by statements that reference the new table. - A table cannot be dropped and then recreated with the same name. This is not possible within a single transaction because
DROP TABLE
does not immediately drop the name of the table. As a workaround, split theDROP TABLE
andCREATE TABLE
statements into separate transactions.
No schema changes between executions of prepared statements
When the schema of a table targeted by a prepared statement changes after the prepared statement is created, future executions of the prepared statement could result in an error. For example, adding a column to a table referenced in a prepared statement with a SELECT *
clause will result in an error:
CREATE TABLE users (id INT PRIMARY KEY);
PREPARE prep1 AS SELECT * FROM users;
ALTER TABLE users ADD COLUMN name STRING;
INSERT INTO users VALUES (1, 'Max Roach');
EXECUTE prep1;
ERROR: cached plan must not change result type
SQLSTATE: 0A000
It's therefore recommended to explicitly list result columns instead of using SELECT *
in prepared statements, when possible.
Examples of statements that fail
The following statements fail due to the no schema changes within transactions limitation.
Create an index and then run a select against that index inside a transaction
> CREATE TABLE foo (id INT PRIMARY KEY, name VARCHAR);
BEGIN;
SAVEPOINT cockroach_restart;
CREATE INDEX foo_idx ON foo (id, name);
SELECT * from foo@foo_idx;
RELEASE SAVEPOINT cockroach_restart;
COMMIT;
CREATE TABLE
BEGIN
SAVEPOINT
CREATE INDEX
ERROR: relation "foo_idx" does not exist
ERROR: current transaction is aborted, commands ignored until end of transaction block
ROLLBACK
Add a column and then add a constraint against that column inside a transaction
> CREATE TABLE foo ();
BEGIN;
SAVEPOINT cockroach_restart;
ALTER TABLE foo ADD COLUMN bar VARCHAR;
ALTER TABLE foo ADD CONSTRAINT bar CHECK (foo IN ('a', 'b', 'c', 'd'));
RELEASE SAVEPOINT cockroach_restart;
COMMIT;
CREATE TABLE
BEGIN
SAVEPOINT
ALTER TABLE
ERROR: column "foo" not found for constraint "foo"
ERROR: current transaction is aborted, commands ignored until end of transaction block
ROLLBACK
Add a column and then select against that column inside a transaction
> CREATE TABLE foo ();
BEGIN;
SAVEPOINT cockroach_restart;
ALTER TABLE foo ADD COLUMN bar VARCHAR;
SELECT bar FROM foo;
RELEASE SAVEPOINT cockroach_restart;
COMMIT;
CREATE TABLE
BEGIN
SAVEPOINT
ALTER TABLE
ERROR: column name "bar" not found
ERROR: current transaction is aborted, commands ignored until end of transaction block
ROLLBACK
See also
- How online schema changes are possible in CockroachDB: Blog post with more technical details about how our schema change engine works.
ALTER DATABASE
ALTER INDEX
ALTER RANGE
ALTER SEQUENCE
ALTER TABLE
ALTER VIEW
CREATE DATABASE
CREATE INDEX
CREATE SEQUENCE
CREATE TABLE
CREATE VIEW
DROP DATABASE
DROP INDEX
DROP SEQUENCE
DROP TABLE
DROP VIEW
TRUNCATE