Because of CockroachDB's multi-active availability design, you can perform a "rolling upgrade" of your CockroachDB cluster. This means that you can upgrade nodes one at a time without interrupting the cluster's overall health and operations.
To upgrade within the v1.0.x series, see the v1.0 version of this page.
Step 1. Prepare to upgrade
Before starting the upgrade, complete the following steps.
Make sure your cluster is behind a load balancer, or your clients are configured to talk to multiple nodes. If your application communicates with a single node, stopping that node to upgrade its CockroachDB binary will cause your application to fail.
Verify the cluster's overall health by running the cockroach node status command against any node in the cluster. In the response:
- If any nodes that should be live are not listed, identify why the nodes are offline and restart them before beginning your upgrade.
- Make sure the build field shows the same version of CockroachDB for all nodes. If any nodes are behind, upgrade them to the cluster's current version first, and then start this process over.
- Make sure ranges_unavailable and ranges_underreplicated show 0 for all nodes. If there are unavailable or underreplicated ranges in your cluster, performing a rolling upgrade increases the risk that ranges will lose a majority of their replicas and cause cluster unavailability. Therefore, it's important to identify and resolve the cause of range unavailability and underreplication before beginning your upgrade.
Note: When upgrading within the v1.1.x series, pass the --ranges or --all flag to include these range details in the response.
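The checks in this list can also be scripted. The following is a minimal sketch, not an official tool. It assumes the node status output is available as comma-separated text with a header row containing the build, ranges_unavailable, and ranges_underreplicated column names, and that the first column identifies the node. The function name check_node_status is hypothetical.

```shell
# check_node_status (hypothetical helper): reads comma-separated node status
# from stdin, header row first, and returns non-zero if the builds differ or
# any node reports unavailable or underreplicated ranges.
# Assumption: no field contains an embedded comma.
check_node_status() {
  awk -F, '
    NF == 0 { next }    # skip blank lines
    !have_header {
      # Map each column name to its position.
      for (i = 1; i <= NF; i++) col[$i] = i
      have_header = 1
      n = split("build ranges_unavailable ranges_underreplicated", need, " ")
      for (i = 1; i <= n; i++) {
        if (!(need[i] in col)) {
          print "header is missing column " need[i]
          bad = 1
        }
      }
      if (bad) exit
      next
    }
    {
      build = $(col["build"])
      if (first_build == "") first_build = build
      if (build != first_build) {
        print "node " $1 ": build " build " differs from " first_build
        bad = 1
      }
      if ($(col["ranges_unavailable"]) + 0 != 0) {
        print "node " $1 ": has unavailable ranges"
        bad = 1
      }
      if ($(col["ranges_underreplicated"]) + 0 != 0) {
        print "node " $1 ": has underreplicated ranges"
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  '
}
```

Pipe the node status output into check_node_status once you have it in CSV form. How your version of the CLI emits CSV is not covered here, so check its help output first. A non-zero exit status means at least one of the conditions above does not hold.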
Capture the cluster's current state by running the cockroach debug zip command against any node in the cluster. If the upgrade does not go according to plan, the captured details will help you and Cockroach Labs troubleshoot issues.
Back up the cluster. If the upgrade does not go according to plan, you can use the data to restore your cluster to its previous state.
Step 2. Perform the rolling upgrade
For each node in your cluster, complete the following steps.
Connect to the node.
Terminate the cockroach process.
Without a process manager, use this command:
$ pkill cockroach
Then verify that the process has stopped:
$ ps aux | grep cockroach
Alternatively, you can check the node's logs for the message "server drained and shutdown completed".
Download and install the CockroachDB binary you want to use:
Mac:
$ curl -O https://binaries.cockroachdb.com/cockroach-v1.1.9.darwin-10.9-amd64.tgz
$ tar -xzf cockroach-v1.1.9.darwin-10.9-amd64.tgz
Linux:
$ curl -O https://binaries.cockroachdb.com/cockroach-v1.1.9.linux-amd64.tgz
$ tar -xzf cockroach-v1.1.9.linux-amd64.tgz
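If the same procedure runs on both Mac and Linux hosts, the archive name can be derived from the platform instead of being typed by hand. This is a minimal sketch: tarball_name is a hypothetical helper, and it only reproduces the two archive names shown above. Whether other versions follow the same naming pattern is an assumption.

```shell
# tarball_name VERSION [OS] (hypothetical helper): prints the archive name for
# VERSION on the given OS, defaulting to the OS reported by uname -s.
# Only the two naming patterns shown in this procedure are covered.
tarball_name() {
  version=$1
  os=${2:-$(uname -s)}
  case $os in
    Darwin) echo "cockroach-${version}.darwin-10.9-amd64.tgz" ;;
    Linux)  echo "cockroach-${version}.linux-amd64.tgz" ;;
    *)      echo "unsupported OS: $os" >&2; return 1 ;;
  esac
}
```

The printed name can then be appended to https://binaries.cockroachdb.com/ for the download and passed to tar for extraction.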
If you use cockroach in your $PATH, rename the outdated cockroach binary, and then move the new one into its place:
Mac:
$ i="$(which cockroach)"; mv "$i" "$i"_old
$ cp -i cockroach-v1.1.9.darwin-10.9-amd64/cockroach /usr/local/bin/cockroach
Linux:
$ i="$(which cockroach)"; mv "$i" "$i"_old
$ cp -i cockroach-v1.1.9.linux-amd64/cockroach /usr/local/bin/cockroach
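The rename-then-copy step above can be wrapped in a small function so that it behaves the same way on every node. A minimal sketch follows. The name swap_binary is hypothetical, and the function assumes you have write access to the target directory.

```shell
# swap_binary NEW TARGET (hypothetical helper): keeps the current TARGET as
# TARGET_old, then copies NEW into place and makes it executable.
swap_binary() {
  new=$1
  target=$2
  if [ ! -f "$new" ]; then
    echo "new binary not found: $new" >&2
    return 1
  fi
  # Keep the outdated binary so that reverting only needs a rename.
  if [ -e "$target" ]; then
    mv "$target" "${target}_old" || return 1
  fi
  cp "$new" "$target" && chmod +x "$target"
}
```

For the Linux example above, the call would be swap_binary cockroach-v1.1.9.linux-amd64/cockroach /usr/local/bin/cockroach. Because the old binary is moved aside first, the copy never overwrites an existing file.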
If you're running with a process manager, have the node rejoin the cluster by starting it.
Without a process manager, use this command:
$ cockroach start --join=[IP address of any other node] [other flags]
[other flags] includes any flags you use to start a node, such as --host.
Verify the node has rejoined the cluster through its output to stdout or through the admin UI.
If you use cockroach in your $PATH, you can remove the old binary:
$ rm /usr/local/bin/cockroach_old
If you leave versioned binaries on your servers, you do not need to do anything.
Wait at least one minute after the node has rejoined the cluster, and then repeat these steps for the next node.
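The node-by-node loop described in this step can be sketched as a small driver function. This is an illustration under stated assumptions, not a supported tool: rolling_upgrade and ROLLING_PAUSE_SECONDS are hypothetical names, and the per-node step (stop the process, swap the binary, restart, verify) is a command or function you supply yourself.

```shell
# rolling_upgrade STEP NODE... (hypothetical helper): runs STEP once per node,
# in order, and stops at the first failure so that at most one node is down.
# STEP is your own command or function; it receives the node as its argument.
rolling_upgrade() {
  step=$1
  shift
  # The procedure asks for at least one minute between nodes.
  pause=${ROLLING_PAUSE_SECONDS:-60}
  while [ $# -gt 0 ]; do
    node=$1
    shift
    echo "upgrading $node"
    if ! "$step" "$node"; then
      echo "upgrade failed on $node; stopping" >&2
      return 1
    fi
    # Pause before the next node; no pause is needed after the last one.
    if [ $# -gt 0 ]; then
      sleep "$pause"
    fi
  done
  return 0
}
```

Stopping at the first failure keeps the rest of the cluster on its current binary, which leaves the failed node as the only one to investigate.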
Step 3. Monitor the upgraded cluster
After upgrading all nodes in the cluster, monitor the cluster's stability and performance for at least one day.
Step 4. Finalize or revert the upgrade
Once you have monitored the upgraded cluster for at least one day:
If you are satisfied with the new version, complete the steps under Finalize the upgrade.
If you are experiencing problems, follow the steps under Revert the upgrade.
Finalize the upgrade
Start the cockroach sql shell against any node in the cluster and execute the following query:
> SET CLUSTER SETTING version = '1.1';
Note: This step assumes you've upgraded to at least v1.1.1.
This step enables certain performance improvements and bug fixes that were introduced in v1.1. Note, however, that after completing this step, it will no longer be possible to perform a rolling downgrade to v1.0. In the event of a catastrophic failure or corruption due to usage of new features requiring v1.1, the only option is to start a new cluster using the old binary and then restore from one of the backups created prior to finalizing the upgrade.
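Because finalizing assumes every node runs at least v1.1.1, it can help to compare each node's build field against that minimum before running the query. The following is a minimal sketch. The name version_at_least is hypothetical, it assumes versions of the form vMAJOR.MINOR.PATCH, and it ignores any pre-release suffix.

```shell
# version_at_least HAVE WANT (hypothetical helper): succeeds when version HAVE
# is equal to or newer than WANT. A leading "v" is optional on both arguments.
# Components are compared numerically, so 1.10.0 counts as newer than 1.9.0.
version_at_least() {
  awk -v have="${1#v}" -v want="${2#v}" 'BEGIN {
    split(have, h, ".")
    split(want, w, ".")
    for (i = 1; i <= 3; i++) {
      if (h[i] + 0 > w[i] + 0) exit 0
      if (h[i] + 0 < w[i] + 0) exit 1
    }
    exit 0
  }'
}
```

For example, version_at_least v1.1.9 v1.1.1 succeeds, while version_at_least v1.1.0 v1.1.1 fails.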
Revert the upgrade
Run the cockroach debug zip command against any node in the cluster to capture your cluster's state.
Reach out for support from Cockroach Labs, sharing your debug zip.
If necessary, downgrade the cluster by repeating the rolling upgrade process, but this time switching each node back to the previous version.