EXPERIMENTAL CHANGEFEED FOR
is the core implementation of changefeeds. For the Enterprise-only version, see CREATE CHANGEFEED
.
The EXPERIMENTAL CHANGEFEED FOR
statement creates a new core changefeed, which streams row-level changes to the client indefinitely until the underlying connection is closed or the changefeed is canceled. A core changefeed can watch one table or multiple tables in a comma-separated list.
For more information, see Stream Data Out of CockroachDB Using Changefeeds.
This feature is experimental.This feature is subject to change. To share feedback and/or issues, contact Support.
Required privileges
Changefeeds can only be created by superusers, i.e., members of the admin
role. The admin role exists by default with root
as the member.
Considerations
Because core changefeeds return results differently than other SQL statements, they require a dedicated database connection with specific settings around result buffering. In normal operation, CockroachDB improves performance by buffering results server-side before returning them to a client; however, result buffering is automatically turned off for core changefeeds. Core changefeeds also have different cancellation behavior than other queries: they can only be canceled by closing the underlying connection or issuing a
CANCEL QUERY
statement on a separate connection. Combined, these attributes of changefeeds mean that applications should explicitly create dedicated connections to consume changefeed data, instead of using a connection pool as most client drivers do by default.This cancellation behavior (i.e., close the underlying connection to cancel the changefeed) also extends to client driver usage; in particular, when a client driver calls
Rows.Close()
after encountering errors for a stream of rows. The pgwire protocol requires that the rows be consumed before the connection is again usable, but in the case of a core changefeed, the rows are never consumed. It is therefore critical that you close the connection, otherwise the application will be blocked forever onRows.Close()
.In most cases, each version of a row will be emitted once. However, some infrequent conditions (e.g., node failures, network partitions) will cause them to be repeated. This gives our changefeeds an at-least-once delivery guarantee. For more information, see Change Data Capture - Ordering Guarantees.
Synopsis
> EXPERIMENTAL CHANGEFEED FOR table_name [ WITH (option [= value] [, ...]) ];
Parameters
Parameter | Description |
---|---|
table_name |
The name of the table (or tables in a comma separated list) to create a changefeed for. |
option / value |
For a list of available options and their values, see Options below. |
Options
Option | Value | Description |
---|---|---|
updated |
N/A | Include updated timestamps with each row. |
resolved |
INTERVAL |
Emits resolved timestamp events for the changefeed. Resolved timestamp events do not emit until all ranges in the changefeed have progressed to a specific point in time. Set an optional minimal duration between emitting resolved timestamps. Example: resolved='10s' . This option will only emit a resolved timestamp event if the timestamp has advanced and at least the optional duration has elapsed. If unspecified, all resolved timestamps are emitted as the high-water mark advances. |
envelope |
key_only / row |
Use key_only to emit only the key and no value, which is faster if you only want to know when the key changes.Default: envelope=row |
cursor |
Timestamp | Emits any changes after the given timestamp, but does not output the current state of the table first. If cursor is not specified, the changefeed starts by doing a consistent scan of all the watched rows and emits the current value, then moves to emitting any changes that happen after the scan.cursor can be used to start a new changefeed where a previous changefeed ended.Example: CURSOR=1536242855577149065.0000000000 |
mvcc_timestamp |
N/A | New in v21.2: Include the MVCC timestamp for each emitted row in a changefeed. With the mvcc_timestamp option, each emitted row will always contain its MVCC timestamp, even during the changefeed's initial backfill. |
format |
json / avro |
Format of the emitted record. Currently, support for Avro is limited. Default: format=json . |
confluent_schema_registry |
Schema Registry address | The Schema Registry address is required to use avro . |
Avro limitations
Below are clarifications for particular SQL types and values for Avro changefeeds:
- Decimals must have precision specified.
BYTES
(or its aliasesBYTEA
andBLOB
) are often used to store machine-readable data. When you stream these types through a changefeed withformat=avro
, CockroachDB does not encode or change the data. However, Avro clients can often include escape sequences to present the data in a printable format, which can interfere with deserialization. A potential solution is to hex-encodeBYTES
values when initially inserting them into CockroachDB. This will ensure that Avro clients will consistently decode the hexadecimal. Note that hex-encoding values at insertion process will increase record size.BIT
andVARBIT
types are encoded as arrays of 64-bit integers.For efficiency, CockroachDB encodes
BIT
andVARBIT
bitfield types as arrays of 64-bit integers. That is, base-2 (binary format)BIT
andVARBIT
data types are converted to base 10 and stored in arrays. Encoding in CockroachDB is big-endian, therefore the last value may have many trailing zeroes. For this reason, the first value of each array is the number of bits that are used in the last value of the array.For instance, if the bitfield is 129 bits long, there will be 4 integers in the array. The first integer will be
1
; representing the number of bits in the last value, the second integer will be the first 64 bits, the third integer will be bits 65–128, and the last integer will either be0
or9223372036854775808
(i.e., the integer with only the first bit set, or1000000000000000000000000000000000000000000000000000000000000000
when base 2).This example is base-10 encoded into an array as follows:
{"array": [1, <first 64 bits>, <second 64 bits>, 0 or 9223372036854775808]}
For downstream processing, it is necessary to base-2 encode every element in the array (except for the first element). The first number in the array gives you the number of bits to take from the last base-2 number — that is, the most significant bits. So, in the example above this would be
1
. Finally, all the base-2 numbers can be appended together, which will result in the original number of bits, 129.In a different example of this process where the bitfield is 136 bits long, the array would be similar to the following when base-10 encoded:
{"array": [8, 18293058736425533439, 18446744073709551615, 13690942867206307840]}
To then work with this data, you would convert each of the elements in the array to base-2 numbers, besides the first element. For the above array, this would convert to:
[8, 1111110111011011111111111111111111111111111111111111111111111111, 1111111111111111111111111111111111111111111111111111111111111111, 1011111000000000000000000000000000000000000000000000000000000000]
Next, you use the first element in the array to take the number of bits from the last base-2 element,
10111110
. Finally, you append each of the base-2 numbers together — in the above array, the second, third, and truncated last element. This results in 136 bits, the original number of bits.
Examples
Create a changefeed
In this example, you'll set up a core changefeed for a single-node cluster.
In a terminal window, start
cockroach
:$ cockroach start-single-node \ --insecure \ --listen-addr=localhost \ --background
As the
root
user, open the built-in SQL client:$ cockroach sql \ --url="postgresql://root@127.0.0.1:26257?sslmode=disable" \ --format=csv
Note:Because core changefeeds return results differently than other SQL statements, they require a dedicated database connection with specific settings around result buffering. In normal operation, CockroachDB improves performance by buffering results server-side before returning them to a client; however, result buffering is automatically turned off for core changefeeds. Core changefeeds also have different cancellation behavior than other queries: they can only be canceled by closing the underlying connection or issuing a
CANCEL QUERY
statement on a separate connection. Combined, these attributes of changefeeds mean that applications should explicitly create dedicated connections to consume changefeed data, instead of using a connection pool as most client drivers do by default.Note:To determine how wide the columns need to be, the default
table
display format incockroach sql
buffers the results it receives from the server before printing them to the console. When consuming core changefeed data usingcockroach sql
, it's important to use a display format likecsv
that does not buffer its results. To set the display format, use the--format=csv
flag when starting the built-in SQL client, or set the\set display_format=csv
option once the SQL client is open.Enable the
kv.rangefeed.enabled
cluster setting:> SET CLUSTER SETTING kv.rangefeed.enabled = true;
Create table
foo
:> CREATE TABLE foo (a INT PRIMARY KEY);
Insert a row into the table:
> INSERT INTO foo VALUES (0);
Start the core changefeed:
> EXPERIMENTAL CHANGEFEED FOR foo;
table,key,value foo,[0],"{""after"": {""a"": 0}}"
In a new terminal, add another row:
$ cockroach sql --insecure -e "INSERT INTO foo VALUES (1)"
Back in the terminal where the core changefeed is streaming, the following output has appeared:
foo,[1],"{""after"": {""a"": 1}}"
Note that records may take a couple of seconds to display in the core changefeed.
To stop streaming the changefeed, enter CTRL+C into the terminal where the changefeed is running.
To stop
cockroach
:Get the process ID of the node:
ps -ef | grep cockroach | grep -v grep
501 21766 1 0 6:21PM ttys001 0:00.89 cockroach start-single-node --insecure --listen-addr=localhost
Gracefully shut down the node, specifying its process ID:
kill -TERM 21766
initiating graceful shutdown of server server drained and shutdown completed
Create a changefeed with Avro
In this example, you'll set up a core changefeed for a single-node cluster that emits Avro records. CockroachDB's Avro binary encoding convention uses the Confluent Schema Registry to store Avro schemas.
Use the
cockroach start-single-node
command to start a single-node cluster:$ cockroach start-single-node \ --insecure \ --listen-addr=localhost \ --background
Download and extract the Confluent Open Source platform.
Move into the extracted
confluent-<version>
directory and start Confluent:$ ./bin/confluent local services start
Only
zookeeper
,kafka
, andschema-registry
are needed. To troubleshoot Confluent, see their docs and the Quick Start Guide.As the
root
user, open the built-in SQL client:$ cockroach sql --url="postgresql://root@127.0.0.1:26257?sslmode=disable" --format=csv
Note:Because core changefeeds return results differently than other SQL statements, they require a dedicated database connection with specific settings around result buffering. In normal operation, CockroachDB improves performance by buffering results server-side before returning them to a client; however, result buffering is automatically turned off for core changefeeds. Core changefeeds also have different cancellation behavior than other queries: they can only be canceled by closing the underlying connection or issuing a
CANCEL QUERY
statement on a separate connection. Combined, these attributes of changefeeds mean that applications should explicitly create dedicated connections to consume changefeed data, instead of using a connection pool as most client drivers do by default.Note:To determine how wide the columns need to be, the default
table
display format incockroach sql
buffers the results it receives from the server before printing them to the console. When consuming core changefeed data usingcockroach sql
, it's important to use a display format likecsv
that does not buffer its results. To set the display format, use the--format=csv
flag when starting the built-in SQL client, or set the\set display_format=csv
option once the SQL client is open.Enable the
kv.rangefeed.enabled
cluster setting:> SET CLUSTER SETTING kv.rangefeed.enabled = true;
Create table
bar
:> CREATE TABLE bar (a INT PRIMARY KEY);
Insert a row into the table:
> INSERT INTO bar VALUES (0);
Start the core changefeed:
> EXPERIMENTAL CHANGEFEED FOR bar WITH format = avro, confluent_schema_registry = 'http://localhost:8081';
table,key,value bar,\000\000\000\000\001\002\000,\000\000\000\000\002\002\002\000
In a new terminal, add another row:
$ cockroach sql --insecure -e "INSERT INTO bar VALUES (1)"
Back in the terminal where the core changefeed is streaming, the output will appear:
bar,\000\000\000\000\001\002\002,\000\000\000\000\002\002\002\002
Note that records may take a couple of seconds to display in the core changefeed.
To stop streaming the changefeed, enter CTRL+C into the terminal where the changefeed is running.
To stop
cockroach
:Get the process ID of the node:
ps -ef | grep cockroach | grep -v grep
501 21766 1 0 6:21PM ttys001 0:00.89 cockroach start-single-node --insecure --listen-addr=localhost
Gracefully shut down the node, specifying its process ID:
kill -TERM 21766
initiating graceful shutdown of server server drained and shutdown completed
To stop Confluent, move into the extracted
confluent-<version>
directory and stop Confluent:$ ./bin/confluent local services stop
To terminate all Confluent processes, use:
$ ./bin/confluent local destroy