Deploying a CockroachDB Multi-Node Cluster

By Admin · Mar 15, 2026 · Updated Apr 23, 2026

CockroachDB is a distributed SQL database that provides horizontal scaling, strong consistency, and automatic failover, and it speaks the PostgreSQL wire protocol. This guide covers deploying a multi-node CockroachDB cluster for production workloads.

Why CockroachDB?

  • Distributed SQL — ACID transactions across multiple nodes and regions
  • PostgreSQL compatible — use standard PostgreSQL drivers and tools
  • Automatic sharding — data distributes across nodes without manual configuration
  • Survivability — tolerates node, rack, or region failures without data loss
  • Online schema changes — ALTER TABLE without locks or downtime

Prerequisites

  • 3+ VPS instances (minimum for production quorum)
  • 4+ CPU cores and 8+ GB RAM per node
  • SSD storage (NVMe preferred)
  • Network connectivity between all nodes (ports 26257 and 8080)
  • Synchronized clocks via NTP or chrony, critical for CockroachDB (see the quick check after this list)

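Clock synchronization is worth verifying before you start any node, since nodes whose clocks drift beyond the maximum offset shut themselves down. A quick check, assuming chrony (use ntpq -p instead if you run classic ntpd):

# Confirm the clock is synchronized and the offset is small
chronyc tracking        # "System time" offset should be a few milliseconds at most
timedatectl status      # should report "System clock synchronized: yes"
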
Installation

# Download CockroachDB on each node
curl https://binaries.cockroachdb.com/cockroach-v24.2.0.linux-amd64.tgz | tar xz
sudo cp cockroach-v24.2.0.linux-amd64/cockroach /usr/local/bin/
sudo mkdir -p /usr/local/lib/cockroach
sudo cp cockroach-v24.2.0.linux-amd64/lib/* /usr/local/lib/cockroach/
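
A quick sanity check on each node that the binary is on PATH and reports the expected version:

# Verify the installation
cockroach version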

Generate Certificates

# Create CA certificate
cockroach cert create-ca --certs-dir=certs --ca-key=ca.key

# Create node certificates. Each run writes certs/node.crt and certs/node.key,
# so move the pair to its node (or into a per-node directory) before
# generating the next one
cockroach cert create-node node1 node1.example.com localhost 127.0.0.1 --certs-dir=certs --ca-key=ca.key
cockroach cert create-node node2 node2.example.com localhost 127.0.0.1 --certs-dir=certs --ca-key=ca.key
cockroach cert create-node node3 node3.example.com localhost 127.0.0.1 --certs-dir=certs --ca-key=ca.key

# Create client certificate for root user
cockroach cert create-client root --certs-dir=certs --ca-key=ca.key

# Copy the CA cert and that node's certificate pair to each node (repeat for every node)
scp certs/ca.crt certs/node.crt certs/node.key node2:/etc/cockroach/certs/
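
CockroachDB refuses to use key files that other users can read, so lock the permissions down on each node. A minimal sketch, assuming the certs live in /etc/cockroach/certs and a dedicated cockroach system user (both hypothetical names to adapt):

# Restrict access to the keys on each node (after copying the certs over)
sudo useradd --system --shell /usr/sbin/nologin cockroach   # hypothetical service user
sudo chown -R cockroach:cockroach /etc/cockroach/certs
sudo chmod 700 /etc/cockroach/certs
sudo chmod 600 /etc/cockroach/certs/*.key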

Start the Cluster

# Node 1 (first node to start)
cockroach start \
    --certs-dir=/etc/cockroach/certs \
    --advertise-addr=node1:26257 \
    --join=node1:26257,node2:26257,node3:26257 \
    --store=path=/data/cockroach,attrs=ssd \
    --locality=region=us-east,zone=us-east-1 \
    --cache=.25 \
    --max-sql-memory=.25 \
    --background

# Node 2
cockroach start \
    --certs-dir=/etc/cockroach/certs \
    --advertise-addr=node2:26257 \
    --join=node1:26257,node2:26257,node3:26257 \
    --store=path=/data/cockroach,attrs=ssd \
    --locality=region=us-east,zone=us-east-2 \
    --cache=.25 \
    --max-sql-memory=.25 \
    --background

# Node 3
cockroach start \
    --certs-dir=/etc/cockroach/certs \
    --advertise-addr=node3:26257 \
    --join=node1:26257,node2:26257,node3:26257 \
    --store=path=/data/cockroach,attrs=ssd \
    --locality=region=us-west,zone=us-west-1 \
    --cache=.25 \
    --max-sql-memory=.25 \
    --background

# Initialize the cluster (one-time, from any node)
cockroach init --certs-dir=/etc/cockroach/certs --host=node1:26257
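
In production you typically run each node under a process manager instead of --background. A minimal systemd unit sketch for node1 (the cockroach user, paths, and limits are assumptions to adapt; the flags mirror the start command above):

# /etc/systemd/system/cockroach.service
[Unit]
Description=CockroachDB node
Requires=network.target

[Service]
Type=notify
WorkingDirectory=/data/cockroach
ExecStart=/usr/local/bin/cockroach start \
    --certs-dir=/etc/cockroach/certs \
    --advertise-addr=node1:26257 \
    --join=node1:26257,node2:26257,node3:26257 \
    --store=path=/data/cockroach,attrs=ssd \
    --locality=region=us-east,zone=us-east-1 \
    --cache=.25 --max-sql-memory=.25
TimeoutStopSec=300
Restart=always
RestartSec=10
User=cockroach
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

# Enable and start it
sudo systemctl daemon-reload
sudo systemctl enable --now cockroach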

Verify Cluster Health

# Check node status
cockroach node status --certs-dir=/etc/cockroach/certs --host=node1:26257

# Access the DB Console (web UI)
# Open https://node1:8080 in your browser

# Connect with SQL client
cockroach sql --certs-dir=/etc/cockroach/certs --host=node1:26257

> SHOW REGIONS FROM CLUSTER;
> SELECT * FROM crdb_internal.gossip_nodes;

Create Database and Users

-- Connect to the cluster
cockroach sql --certs-dir=certs --host=node1:26257

-- Create database
CREATE DATABASE myapp;

-- Create application user
CREATE USER app_user WITH PASSWORD 'AppPassword123!';
GRANT ALL ON DATABASE myapp TO app_user;

-- Create read-only user (CockroachDB has no database-level SELECT privilege,
-- so grant SELECT on the schema's tables instead)
CREATE USER readonly_user WITH PASSWORD 'ReadOnlyPass!';
USE myapp;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;
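
To confirm the users and grants took effect:

-- Verify users and privileges
SHOW USERS;
SHOW GRANTS ON DATABASE myapp;
SHOW GRANTS FOR readonly_user;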

Schema Design for Distribution

-- Use UUID primary keys for even distribution
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email STRING UNIQUE NOT NULL,
    name STRING NOT NULL,
    created_at TIMESTAMPTZ DEFAULT now()
);

-- Multi-region row-level homing (requires a multi-region database:
-- set a PRIMARY REGION and ADD REGIONs on the database first)
ALTER TABLE users SET LOCALITY REGIONAL BY ROW;

-- Hash-sharded primary key to avoid hotspotting on sequential (timestamp-ordered) writes
CREATE TABLE events (
    id UUID DEFAULT gen_random_uuid(),
    ts TIMESTAMPTZ NOT NULL DEFAULT now(),
    data JSONB,
    PRIMARY KEY (ts, id) USING HASH
);
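
Once the tables hold data, you can confirm their ranges are actually spread across nodes (output columns vary slightly between versions):

-- Inspect range distribution and leaseholders for a table
SHOW RANGES FROM TABLE events;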

Connecting Applications

# Use standard PostgreSQL connection strings
# Include all nodes for failover
postgresql://app_user:password@node1:26257,node2:26257,node3:26257/myapp?sslmode=verify-full&sslrootcert=ca.crt

# PHP PDO
$pdo = new PDO(
    'pgsql:host=node1;port=26257;dbname=myapp;sslmode=verify-full;sslrootcert=/path/to/ca.crt',
    'app_user', 'password'
);
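
A quick way to confirm the application credentials and TLS settings work end to end is to reuse the same URL with the bundled SQL shell (adjust the ca.crt path to wherever you copied it on the client host):

# Test the application connection string from an app host
cockroach sql --url "postgresql://app_user:AppPassword123!@node1:26257/myapp?sslmode=verify-full&sslrootcert=/path/to/ca.crt" --execute "SELECT version();"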

Monitoring

# Built-in DB Console at https://node:8080
# Shows: SQL activity, replication, storage, network, hot ranges

# Prometheus metrics endpoint
# https://node:8080/_status/vars

# Key metrics:
# - cr.node.sql.query.count
# - cr.node.liveness.livenodes
# - cr.store.capacity.used
# - cr.node.sql.service.latency-p99
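
Each node also answers plain HTTPS health and metrics requests, which is what load balancer probes typically use; the ?ready=1 form only returns 200 when the node is live and accepting SQL connections:

# Node health check (suitable for HAProxy/Nginx probes)
curl --cacert /etc/cockroach/certs/ca.crt https://node1:8080/health?ready=1

# Raw Prometheus metrics
curl --cacert /etc/cockroach/certs/ca.crt https://node1:8080/_status/vars | head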

Backup and Restore

-- Full backup to cloud storage
BACKUP DATABASE myapp INTO 's3://my-bucket/backups?AUTH=implicit'
    WITH revision_history;

-- Scheduled automatic backups
CREATE SCHEDULE daily_backup
    FOR BACKUP DATABASE myapp INTO 's3://my-bucket/backups?AUTH=implicit'
    RECURRING '@daily'
    WITH SCHEDULE OPTIONS first_run = 'now';

-- Restore
RESTORE DATABASE myapp FROM LATEST IN 's3://my-bucket/backups?AUTH=implicit';
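
To confirm the schedule exists and see what the backup collection contains:

-- Verify scheduled backups and inspect the collection
SHOW SCHEDULES;
SHOW BACKUPS IN 's3://my-bucket/backups?AUTH=implicit';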

Production Best Practices

  • Use 3, 5, or 7 nodes — odd replica counts avoid split votes; 3 tolerates one node failure, 5 tolerates two
  • Ensure NTP is configured and clock skew stays well under the default 500 ms --max-offset (nodes that drift past it shut themselves down)
  • Use UUID primary keys instead of sequential integers for even data distribution
  • Set --locality flags to enable zone/region-aware data placement
  • Use a load balancer (HAProxy, Nginx) in front of CockroachDB for connection distribution (see the HAProxy sketch after this list)
  • Monitor the DB Console for hot ranges and rebalancing activity
  • Schedule regular backups to cloud storage with revision history
  • Test node failure and recovery procedures before production deployment
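
CockroachDB can generate a baseline HAProxy configuration from a running cluster; the generated haproxy.cfg balances SQL traffic on port 26257 across the nodes and health-checks them over HTTP. Treat it as a starting point to review rather than a tuned config:

# Generate haproxy.cfg from the live cluster topology
cockroach gen haproxy --certs-dir=certs --host=node1:26257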
