The Cluster Overview page of the DB Console displays key metrics about your cluster and individual nodes. These include:
- Liveness status
- Replication status
- Uptime
- Hardware usage
If you have an Enterprise license, you can enable the Node Map view for a visual representation of your cluster's geographic layout.
Enter your email in the banner at the top to receive updates on CockroachDB releases.
Cluster Overview panel
Use the Cluster Overview panel to quickly assess the capacity and health of your cluster.
Metric | Description |
---|---|
Capacity Usage |
|
Node Status |
|
Replication Status |
|
Capacity metrics
The Cluster Overview, Node List, and Node Map display Capacity Usage by the CockroachDB store (the directory on each node where CockroachDB reads and writes its data) as a percentage of the disk space that is usable on the cluster, locality, or node.
Due to a known issue in CockroachDB v21.1, the Capacity Usage shown may not be accurate if running multiple CockroachDB nodes locally on a single node. This is fixed in CockroachDB v21.2 and later.
Usable disk space is constrained by the following:
- The maximum store size, which may be specified using the
--store
flag when starting nodes. If no store size has been explicitly set, the actual disk capacity is used as the limit. This value is displayed on the Capacity graph in the Storage dashboard. - Any disk space occupied by non-CockroachDB data. This may include the operating system and other system files, as well as the Cockroach binary itself.
The DB Console thus calculates usable disk space as the sum of empty disk space, up to the value of the maximum store size, and disk space that is already being used by CockroachDB data.
If you are testing your deployment locally with multiple CockroachDB nodes running on a single machine (this is not recommended in production), you must explicitly set the store size per node in order to display the correct capacity. Otherwise, the machine's actual disk capacity will be counted as a separate store for each node, thus inflating the computed capacity.
Node List
The Node List groups nodes by locality. The lowest-level locality tier is used to organize the Node List. Hover over a locality to see all localities for the group of nodes.
We recommend defining --locality
flags when starting nodes. CockroachDB uses locality to distribute replicas and mitigate network latency. Locality is also a prerequisite for enabling the Node Map.
Node status
Each locality and node is displayed with its current operational status.
Locality Status | Description |
---|---|
LIVE |
All nodes in the locality are live. |
WARNING |
Locality has 1 or more SUSPECT , DECOMMISSIONING , or DEAD nodes (red indicates a dead node). |
Node Status | Description |
---|---|
LIVE |
Node is online and updating its liveness record. |
SUSPECT |
Node has an unavailable liveness status. |
DECOMMISSIONING |
Node is in the process of decommissioning. |
DECOMMISSIONED |
Node has completed decommissioning, has been stopped, and has not updated its liveness record for 5 minutes. |
DEAD |
Node has not updated its liveness record for 5 minutes. |
Nodes are considered dead once they have not updated their liveness record for a certain amount of time (5 minutes by default). At this point, the automated repair process starts, wherein CockroachDB rebalances replicas from dead nodes to live nodes, using the unaffected replicas as sources.
Node details
The following details are also shown.
Column | Description |
---|---|
Node Count | Number of nodes in the locality. |
Nodes | Nodes are grouped by locality and displayed with their address and node ID (the ID is the number that is prepended by n ). Click the address to view node statistics. Hover over a row and click Logs to see the node's log. |
Uptime | Amount of time the node has been running. |
Replicas | Number of replicas on the node or in the locality. |
Capacity Usage | Percentage of usable disk space occupied by CockroachDB data on the node or in the locality. See Capacity metrics. |
Memory Usage | Memory used by CockroachDB as a percentage of the total memory on the node or in the locality. |
vCPUs | Number of vCPUs on the machine. |
Version | Build tag of the CockroachDB version installed on the node. |
Decommissioned Nodes
Nodes that have been decommissioned will be listed in the table of Recently Decommissioned Nodes, indicating that they are removed from the cluster. You can see the full history of decommissioned nodes by clicking "View all decommissioned nodes".
When you initiate the decommissioning process on a node, CockroachDB transfers all range replicas and range leases off the node so that it can be safely shut down.
Node Map (Enterprise)
The Node Map is an enterprise feature that visualizes the geographical configuration of your cluster. It requires that --locality
flags have been defined for your nodes.
For guidance on enabling and configuring the node map, see Enable the Node Map.
The Node Map uses the longitude and latitude of each locality to position the components on the map. The map is populated with locality components and node components.
Locality component
A locality component represents capacity, CPU, and QPS metrics for a given locality.
The map shows the components for the highest-level locality tier (e.g., region). You can click on the Node Count of a locality component to view any lower-level localities (e.g., availability zone).
For details on how Capacity Usage is calculated, see Capacity metrics.
On multi-core systems, the displayed CPU usage can be greater than 100%. Full utilization of 1 core is considered as 100% CPU usage. If you have n cores, then CPU usage can range from 0% (indicating an idle system) to (n * 100)% (indicating full utilization).
Node component
A node component represents capacity, CPU, and QPS metrics for a given node.
Node components are accessed by clicking on the Node Count of the lowest-level locality component.
For details on how Capacity Usage is calculated, see Capacity metrics.
On multi-core systems, the displayed CPU usage can be greater than 100%. Full utilization of 1 core is considered as 100% CPU usage. If you have n cores, then CPU usage can range from 0% (indicating an idle system) to (n * 100)% (indicating full utilization).