Stop a Node

Warning:

As of April 12, 2019, CockroachDB v1.1 is no longer supported. For more details, refer to the Release Support Policy.

This page shows you how to use the cockroach quit command to temporarily stop a node that you plan to restart, for example, during the process of upgrading your cluster's version of CockroachDB.

For information about permanently removing nodes to downsize a cluster or react to hardware failures, see Remove Nodes.

Overview

How It Works

When you stop a node, it performs the following steps:

Cancels all current sessions without waiting.
Transfers all range leases and Raft leadership to other nodes.
Gossips its draining state to the cluster so that no leases are transferred to the draining node. This is best effort, so other nodes may not receive the gossip info in time.
No new ranges are transferred to the draining node, to avoid a possible loss of quorum after the node shuts down.

If the node then stays offline for longer than the server.time_until_store_dead cluster setting (5 minutes by default), the cluster considers the node dead and starts to transfer its range replicas to other nodes as well.

After that, if the node comes back online, its range replicas will determine whether or not they are still valid members of replica groups. If a range replica is still valid and any data in its range has changed, it will receive updates from another replica in the group. If a range replica is no longer valid, it will be removed from the node.

Basic terms:

Range: CockroachDB stores all user data and almost all system data in a giant sorted map of key-value pairs. This keyspace is divided into "ranges", contiguous chunks of the keyspace, so that every key can always be found in a single range.
Range Replica: CockroachDB replicates each range (3 times by default) and stores each replica on a different node.
Range Lease: For each range, one of the replicas holds the "range lease". This replica, referred to as the "leaseholder", is the one that receives and coordinates all read and write requests for the range.

Considerations

As mentioned above, by default, if a node stays offline for more than 5 minutes, the cluster will consider it dead and will rebalance its data to other nodes. Therefore, if you expect any node to be offline for longer than 5 minutes, first set the server.time_until_store_dead cluster setting to a value higher than the 5m0s default, and only then stop the nodes.

For example, let's say you're upgrading system software on a group of servers, and the nodes running on the servers may be offline for up to 15 minutes as a result. Before shutting down the nodes, you would change the server.time_until_store_dead cluster setting as follows:

> SET CLUSTER SETTING server.time_until_store_dead = '15m0s';

After completing the system upgrades and restarting the nodes, you would then change the setting back to its default:

> SET CLUSTER SETTING server.time_until_store_dead = '5m0s';
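
The raise-then-restore sequence above can be scripted so that the default is not forgotten after maintenance. The following is a minimal sketch, not part of CockroachDB: the function names store_dead_sql and set_store_dead_timeout are hypothetical, and it assumes an insecure cluster that you reach with cockroach sql and its -e (execute) flag.

```shell
# Hypothetical helpers; the names are illustrative, not part of the cockroach CLI.

# Print the SQL statement that sets the dead-node timeout to $1 (for example 15m0s).
store_dead_sql() {
  printf "SET CLUSTER SETTING server.time_until_store_dead = '%s';\n" "$1"
}

# Apply the timeout $2 through the node at address $1.
# Assumption: insecure cluster. For a secure cluster, replace --insecure
# with --certs-dir pointing at your client certificates.
set_store_dead_timeout() {
  cockroach sql --insecure --host="$1" -e "$(store_dead_sql "$2")"
}

# Intended use (addresses are placeholders):
#   set_store_dead_timeout 10.0.0.1 15m0s   # before stopping nodes
#   ... stop nodes, perform maintenance, restart nodes ...
#   set_store_dead_timeout 10.0.0.1 5m0s    # restore the default
```

Keeping the statement builder separate from the function that runs it makes it easy to print and review the SQL before applying it to a live cluster.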

Synopsis

# Temporarily stop a node:
$ cockroach quit <flags>

# View help:
$ cockroach quit --help

Flags

The quit command supports the following general-use, client connection, and logging flags.

General

Flag	Description
`--decommission`	If specified, the node will be permanently removed instead of temporarily stopped. See Remove Nodes for more details.

Client Connection

Flag	Description
`--host`	The server host to connect to. This can be the address of any node in the cluster. Env Variable: `COCKROACH_HOST` Default: `localhost`
`--port` `-p`	The server port to connect to. Env Variable: `COCKROACH_PORT` Default: `26257`
`--user` `-u`	The SQL user that will own the client session. Env Variable: `COCKROACH_USER` Default: `root`
`--insecure`	Use an insecure connection. Env Variable: `COCKROACH_INSECURE` Default: `false`
`--certs-dir`	The path to the certificate directory containing the CA and client certificates and client key. Env Variable: `COCKROACH_CERTS_DIR` Default: `${HOME}/.cockroach-certs/`

See Client Connection Parameters for more details.
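
Because each connection flag in the table has an environment variable, the same connection details can be supplied without flags. The sketch below is illustrative only: quit_via_env is a hypothetical wrapper name, and it assumes that COCKROACH_INSECURE accepts the value true, matching the false default listed above.

```shell
# Hypothetical wrapper; the name is illustrative, not part of the cockroach CLI.
# Intended to be equivalent to: cockroach quit --insecure --host=$1
quit_via_env() {
  # The variables are set only for this one invocation.
  COCKROACH_HOST="$1" COCKROACH_INSECURE=true cockroach quit
}
```

Setting the variables on the command line of a single invocation, rather than exporting them, keeps later commands in the same shell from silently targeting the same node.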

Logging

By default, the quit command logs errors to stderr.

If you need to troubleshoot this command's behavior, you can change its logging behavior.

Examples

Stop a Node from the Machine Where It's Running

SSH to the machine where the node is running.
If the node is running in the background and you are using a process manager for automatic restarts, use the process manager to stop the cockroach process without restarting it.

If the node is running in the background and you are not using a process manager, send a termination signal to the cockroach process (pkill sends SIGTERM by default), for example:
```
$ pkill cockroach
```
If the node is running in the foreground, press CTRL-C.
Verify that the cockroach process has stopped:
```
$ ps aux | grep cockroach
```
Alternatively, you can check the node's logs for the message `server drained and shutdown completed`.
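
The log check can also be automated. The following is a minimal sketch under stated assumptions: shutdown_logged and wait_for_shutdown are hypothetical helper names, and the path to the node's log file is something you supply, since it depends on how the node was started.

```shell
# Hypothetical helpers; the names are illustrative, not part of the cockroach CLI.

# Return 0 if the log file $1 contains the shutdown confirmation message.
shutdown_logged() {
  grep -qF "server drained and shutdown completed" "$1" 2>/dev/null
}

# Poll the log file $1 once per second for up to $2 seconds (default 60).
# Return 0 as soon as the message appears, or 1 if the time runs out.
wait_for_shutdown() {
  log_file=$1
  remaining=${2:-60}
  while [ "$remaining" -gt 0 ]; do
    if shutdown_logged "$log_file"; then
      return 0
    fi
    sleep 1
    remaining=$((remaining - 1))
  done
  return 1
}
```

Note that a log file kept across restarts may still contain the message from an earlier shutdown, so this check is most reliable when combined with the process check above.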

Stop a Node from Another Machine

On a secure cluster:

Install the cockroach binary on a machine separate from the node.
Create a certs directory and copy the CA certificate and the client certificate and key for the root user into the directory.

Run the cockroach quit command without the --decommission flag:

$ cockroach quit --certs-dir=certs --host=<address of node to stop>

On an insecure cluster:

Install the cockroach binary on a machine separate from the node.

Run the cockroach quit command without the --decommission flag:

$ cockroach quit --insecure --host=<address of node to stop>
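
The secure and insecure invocations differ only in their connection flags, so they can share one wrapper. This is a minimal sketch: quit_node is a hypothetical function name, and the choice to treat a missing second argument as "insecure cluster" is an assumption made for illustration.

```shell
# Hypothetical wrapper; the name is illustrative, not part of the cockroach CLI.
# $1: address of the node to stop.
# $2: optional certs directory. If omitted, an insecure cluster is assumed.
quit_node() {
  if [ -n "${2:-}" ]; then
    cockroach quit --certs-dir="$2" --host="$1"
  else
    cockroach quit --insecure --host="$1"
  fi
}

# Intended use (addresses are placeholders):
#   quit_node 10.0.0.2 certs   # secure cluster
#   quit_node 10.0.0.2         # insecure cluster
```

Neither branch passes --decommission, so the node is stopped temporarily rather than removed.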
