Connection Pooling

On this page Carat arrow pointing down
Warning:
As of June 5, 2024, CockroachDB v22.2 is no longer supported. For more details, refer to the Release Support Policy.

This page has information on planning, configuring, and using connection pools when using drivers or frameworks with CockroachDB.

Note:

In our sizing and production guidance, 1 vCPU is considered equivalent to 1 core in the underlying hardware platform.

About connection pools

A typical database operation consists of several steps.

  1. The driver uses a configuration to start a connection to the database server.
  2. A network socket is opened on the client that connects to the database server.
  3. Data is read or written through the network socket.
  4. The connection is closed down.
  5. The network socket is closed down and its resources are freed.

For simple database operations, these steps are not expensive, but as an application scales up, the performance of the application will suffer as each connection is created and destroyed. One pattern for improving performance is a connection pool, a group of already configured and established network connections between the client and database that can be reused for data operations within an application.

Each time an application reads or writes data, it will request one of the connections in the pool. After the data operation completes, the connection is returned to the pool so other operations can use it.

Connection pooling can be a enabled as a feature of the driver, a separate library used in conjunction with a driver, a feature of an application server, or a proxy server that acts as a gateway to the database server.

Sizing a connection pool is critical to maximizing application performance. Too few connections and the application will be blocked, waiting for the connection pool to establish a new connection, or for an existing connection to become available. Conversely, a connection pool with too many connections actively executing a query at the same time can also result in high latency as the cluster cycles through every connection concurrently, decreasing efficiency as the cluster incurs a cost every time it context-switches through every query. For example, the time it takes for many threads to complete many queries in parallel is typically higher than that for a smaller number of threads scaled to the number of available vCPUs or IOPs.

Tip:

To read more about connection pooling, see our What is Connection Pooling, and Why Should You Care blog post.

These are Cockroach Labs recommendations for settings common to most connection pooling software. If your application has multiple workers, each worker should be configured based on the kinds of queries required for its workload.

Set the maximum lifetime of connections

The maximum lifetime of a connection should be set to between 5 and 30 minutes. Dedicated and Serverless support 30 minutes as the maximum connection lifetime. When a node is shut down or restarted, client connections can be reset after 30 minutes, causing a disruption to applications.

Cockroach Labs recommends starting with a 5 minute maximum connection lifetime and increasing the connection lifetime if there is an impact on tail latency, normally seen when there are large numbers of connections to a cluster. Setting the connection lifetime below 5 minutes is possible, but there is little benefit, and comes at a cost of increased CPU usage for clients and servers. The maximum connection lifetime changes the rate of new connections per second (i.e., average new connections per second = total connections / maximum connection age).

Configure your connection pooling with connection jitter to prevent connection storms.

Set the maximum number of open connections

The maximum number of open connections should be set to a value that is greater than the maximum number of concurrent queries that your application worker expects to make. Doing so minimizes the need to establish new connections.

Set the maximum number of idle connections

Set the maximum number of idle connections to the same value as the maximum number of open connections.

Set the maximum lifetime of idle connections

Set the maximum lifetime of idle connections to the same value as the maximum lifetime of connections.

Optimize your connection pool

Follow these additional recommendations optimize the performance of your connection pool software.

Avoid spikes in new connections

If possible configure your connection pool to avoid periodic spikes in new connections, also known as "thundering herds" or "connection storms." Most connection pools offer the ability to stagger their connection age through the use of a connection lifetime "jitter," where the wait time between new connections is randomized. Using a connection jitter of 10% of the maximum connection lifetime smooths out the rate of new connections to a cluster. For example, for a maximum connection lifetime of 1800 seconds (30 minutes), set the connection jitter to 500 seconds (3 minutes).

Validate connections in a pool

After a connection pool initializes connections to CockroachDB clusters, those connections can occasionally break. This could be due to changes in the cluster topography, or rolling upgrades and restarts, or network disruptions.

Validating connections is typically handled automatically by the connection pool. For example, in HikariCP the connection is validated whenever you request a connection from the pool, and the keepaliveTime property allows you to configure an interval to periodically check if the connections in the pool are valid. Whatever connection pool you use, make sure connection validation is enabled when running your application.

Size connection pools

If your connection pool is properly configured, the total number of connections to your cluster will typically be at least 100 times larger than the number of new connections per second to your cluster.

Idle connections in CockroachDB do not consume many resources compared to PostgreSQL. Unlike PostgreSQL, which has a hard limit of 5000 connections, CockroachDB can safely support tens of thousands of connections.

The SQL Connection Attempts graph shows how many new connections are being created each second. The SQL Open Sessions graph shows the total number of SQL client connections across the cluster. To determine if your connection pool is correctly configured use the metrics from these graphs in the following formula:

SQL Connection Attempts < SQL Open Sessions/100

That is, divide the number of open SQL sessions by 100. This result should be less than the number of new SQL connections per second.

Configure the minimum number of connections to equal to the maximum number of connections, creating a fixed pool size.

Configure the maximum connection lifetime to 30 minutes.

Multi-region Serverless clusters have the same recommendations as single-region clusters.

Use the following formula to size the connection pool:

connections = (number of cores * 4)

If you have a large number of services connecting to the same cluster, make sure the number of concurrent active connections across all the services does not exceed this recommendation. If each service has its own connection pool, then you should make sure the sum of all the pool sizes is close to our maximum connections recommendation.

For multi-region clusters, create a connection pool per region, and size the maximum connection pool for each region in your cluster using the same formula as a single-region cluster.

For example, if you have 3 regions in your cluster, and each region has 12 vCPUs, create a connection pool for each region, with each connection pool having a maximum pool size of 48 (12 [processor cores] * 4).

To control the maximum number of non-superuser (root user or other admin role) connections a gateway node can have open at one time, use the server.max_connections_per_gateway cluster setting. If a new non-superuser connection would exceed this limit, the error message "sorry, too many clients already" is returned, along with error code 53300. This may be useful in addition to your connection pool settings.

Cockroach Labs strongly recommends that your application instance run in the same region as your CockroachDB cluster.

Monitor SQL connections

To validate that your connection pool is correctly configured, monitor the SQL connection metrics and graphs on your cluster.

Monitor new connections

The sql.new_conns metric and SQL Connection Rate graph expose the number of new SQL connections per second.

The SQL Connection Attempts graph shows the number of new SQL connections per second.

A misconfigured connection pool will result in most database operations requiring a new connection to be established, which will increase query latency.

Monitor active connections

The sql.conns metric and Open SQL Sessions graph show the number of open connections on your cluster or node.

The sql.statements.active metric and Active SQL Statements graph show the number of active connections on your cluster or node. A connection is "active" when it is actively executing a query.

Using the following formula:

connections = (number of cores * 4)

if the number of active connections exceeds 4 times the number of cores in your cluster your application is likely not achieving maximum throughput, and you should reduce the maximum number of connections in your connection pool, or scale the cluster by adding more nodes or cores per node.

Reducing the number of active connections may increase overall throughput, possibly at the expense of increased tail latency for your queries.

The SQL Open Sessions graph shows the number of open connections on your cluster.

If your connection pool is properly configured, the total number of open connections to your cluster should be at least 100 times larger than the number of new connections per second to your cluster.

Serverless functions

Note:

"Serverless" here refers to stateless, programmatic functions deployed in a cloud environment that provides an execution framework to provision resources dynamically, such as Amazon Lambda functions. It does not refer to Serverless clusters. Serverless functions can be used with CockroachDB Serverless, Dedicated, and Self-Hosted clusters.

If your application uses serverless functions to connect to CockroachDB, use a connection pool if you plan to invoke functions frequently. To ensure that the connection pool is reused across invocations of the same function instance, initialize the connection pool variable outside the scope of the serverless function definition. Set the maximum connection pool size to 1, unless your function is multi-threaded and establishes multiple concurrent requests to your database within a single function instance.

Example

In this example, a Java application similar to the basic JDBC example uses the PostgreSQL JDBC driver and HikariCP as the connection pool layer to connect to a CockroachDB cluster. The database is being run on 10 cores across the cluster.

Using the connection pool formula above:

connections = (10 [processor cores] * 4)

The connection pool size should be 40.

The maximum lifetime of a connection is set to 300000 milliseconds, or 5 minutes.

icon/buttons/copy
HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://localhost:26257/bank");
config.setUsername("maxroach");
config.setPassword("password");
config.addDataSourceProperty("ssl", "true");
config.addDataSourceProperty("sslmode", "require");
config.addDataSourceProperty("reWriteBatchedInserts", "true");
config.setAutoCommit(false);
config.setMaximumPoolSize(40);
config.setMaxLifetime(300000);
config.setKeepaliveTime(150000);

HikariDataSource ds = new HikariDataSource(config);

Connection conn = ds.getConnection();

In this example, a Go application similar to the basic pgx example uses the pgxpool library to create a connection pool on a CockroachDB cluster. The database is being run on 10 cores across the cluster.

Using the connection pool formula above:

connections = (10 [processor cores] * 4)

The connection pool size should be 40, and is set using the pool_max_conns query parameter.

The pool_max_conn_lifetime query parameter sets the maximum age of a connection to 300 seconds, or 5 minutes.

The pool_max_conn_lifetime_jitter query parameter sets the connection jitter to 30 seconds, or 10% of the maximum connection age.

// Set connection pool configuration, with maximum connection pool size.
config, err := pgxpool.ParseConfig("postgres://max:roach@127.0.0.1:26257/bank?sslmode=require&pool_max_conns=40&pool_max_conn_lifetime=300s&pool_max_conn_lifetime_jitter=30s")
    if err != nil {
        log.Fatal("error configuring the database: ", err)
    }

// Create a connection pool to the "bank" database.
dbpool, err := pgxpool.ConnectConfig(context.Background(), config)
if err != nil {
    log.Fatal("error connecting to the database: ", err)
}
defer dbpool.Close()

For a full list of connection pool configuration parameters for pgxpool, see the pgxpool documentation.


Yes No
On this page

Yes No