Vitess is a database clustering system for horizontal scaling of MySQL, originally built at YouTube to handle massive traffic. It provides sharding, connection pooling, and query routing while maintaining MySQL compatibility. This guide covers deploying Vitess for production MySQL scaling.
Why Vitess?
Vitess solves several MySQL scaling challenges:
- Horizontal sharding — automatically split data across multiple MySQL instances
- Connection pooling — multiplexes thousands of application connections into a small number of MySQL connections
- Query protection — prevents poorly-written queries from overwhelming the database
- Online schema changes — apply DDL without locking tables
- Topology management — handles failover and replication automatically
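The connection-pooling claim above is worth making concrete: many application connections are multiplexed over a small, fixed set of backend MySQL connections. A minimal sketch (all names here are illustrative, not Vitess internals):

```python
import queue

# Toy model of VTGate-style connection multiplexing: many client requests
# are served through a small, fixed pool of backend MySQL connections.
# This is an illustration, not the Vitess implementation.

class BackendPool:
    def __init__(self, size):
        self._conns = queue.Queue()
        for i in range(size):
            self._conns.put(f"mysql-conn-{i}")  # stand-in for a real connection

    def execute(self, sql):
        conn = self._conns.get()       # borrow a connection (blocks if all are busy)
        try:
            return f"{conn} ran: {sql}"
        finally:
            self._conns.put(conn)      # return it for the next client

    def idle(self):
        return self._conns.qsize()

pool = BackendPool(size=4)
# 1,000 "application connections" share just 4 backend connections.
results = [pool.execute(f"SELECT {i}") for i in range(1000)]
print(len(results))   # 1000
print(pool.idle())    # 4 — every backend connection was returned to the pool
```

The real pool also applies per-query timeouts and limits, which is where the query-protection behavior above comes from.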
Vitess Architecture
Key Vitess components:
- VTGate — the query router; applications connect here instead of MySQL directly
- VTTablet — runs alongside each MySQL instance, managing it and serving queries
- Topology Service — stores cluster metadata (uses etcd, ZooKeeper, or Consul)
- VTCtld — cluster management daemon and web interface
- VTOrc — automated failover orchestrator
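The division of labor can be sketched with a toy router: VTGate hashes the shard key to pick the one shard (VTTablet + MySQL pair) that owns a row, and scatters to all shards when no key is available. Shard names and the hash are illustrative stand-ins, not Vitess internals:

```python
import hashlib

# Toy VTGate-style router (illustrative only, not the Vitess implementation):
# a keyed query goes to exactly one shard; an unkeyed query scatters to all.

SHARDS = ["-80", "80-"]  # two shards covering the keyspace-id range

def keyspace_id(shard_key: int) -> int:
    # Stand-in for a Vitess hash vindex: map the key to a 64-bit value.
    digest = hashlib.sha256(str(shard_key).encode()).digest()
    return int.from_bytes(digest[:8], "big")

def route(shard_key=None):
    if shard_key is None:
        return SHARDS                  # scatter query: hits every shard
    ksid = keyspace_id(shard_key)
    # "-80" holds keyspace ids below 0x80..., "80-" holds the rest.
    return [SHARDS[0] if ksid < 0x8000000000000000 else SHARDS[1]]

print(route(shard_key=42))   # exactly one shard
print(route())               # ['-80', '80-'] — scatter
```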
Installation with Docker Compose
# Clone Vitess
git clone https://github.com/vitessio/vitess.git
cd vitess/examples/compose
# Start a local cluster with 2 shards
docker compose up -d
# This creates:
# - 1 VTGate (port 15306 for the MySQL protocol)
# - 2 shards, each with 1 primary + 1 replica
# - etcd for topology
# - VTCtld with web UI (port 15000)
Production Deployment with Kubernetes
Vitess is designed for Kubernetes. Use the Vitess Operator for production deployments:
# Install the Vitess operator
kubectl apply -f https://github.com/planetscale/vitess-operator/releases/latest/download/operator.yaml
# Create a VitessCluster resource
cat <<EOF | kubectl apply -f -
apiVersion: planetscale.com/v2
kind: VitessCluster
metadata:
  name: production
spec:
  images:
    vtgate: vitess/lite:v19.0.0
    vttablet: vitess/lite:v19.0.0
    vtbackup: vitess/lite:v19.0.0
    vtctld: vitess/lite:v19.0.0
    vtorc: vitess/lite:v19.0.0
  cells:
    - name: zone1
      gateway:
        replicas: 2
        resources:
          requests:
            cpu: "2"
            memory: "4Gi"
  keyspaces:
    - name: commerce
      turndownPolicy: Immediate
      partitionings:
        - equal:
            parts: 2
            shardTemplate:
              databaseInitScriptSecret:
                name: commerce-schema
                key: init_db.sql
              tabletPools:
                - cell: zone1
                  type: replica
                  replicas: 3
                  mysqld:
                    resources:
                      requests:
                        cpu: "4"
                        memory: "8Gi"
                  dataVolumeClaimTemplate:
                    accessModes: ["ReadWriteOnce"]
                    resources:
                      requests:
                        storage: 100Gi
EOF
Creating a Keyspace and Schema
# A keyspace is the Vitess equivalent of a database
# Connect to VTGate
mysql -h vtgate-host -P 15306 -u user
# Create schema through vtctldclient
vtctldclient ApplySchema --sql="
CREATE TABLE customers (
  id BIGINT NOT NULL AUTO_INCREMENT,
  email VARCHAR(255) NOT NULL,
  name VARCHAR(255),
  PRIMARY KEY (id)
) ENGINE=InnoDB;
CREATE TABLE orders (
  id BIGINT NOT NULL AUTO_INCREMENT,
  customer_id BIGINT NOT NULL,
  total DECIMAL(10,2),
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (id)
) ENGINE=InnoDB;
" commerce
Sharding Strategy with VSchema
Vitess uses a VSchema to define how data is sharded:
// VSchema for the commerce keyspace
{
  "sharded": true,
  "vindexes": {
    "hash": {
      "type": "hash"
    },
    "customer_lookup": {
      "type": "consistent_lookup",
      "params": {
        "table": "customer_lookup",
        "from": "email",
        "to": "keyspace_id"
      },
      "owner": "customers"
    }
  },
  "tables": {
    "customers": {
      "column_vindexes": [
        {
          "column": "id",
          "name": "hash"
        },
        {
          "column": "email",
          "name": "customer_lookup"
        }
      ]
    },
    "orders": {
      "column_vindexes": [
        {
          "column": "customer_id",
          "name": "hash"
        }
      ]
    }
  }
}
# Apply VSchema
vtctldclient ApplyVSchema --vschema-file=commerce_vschema.json commerce
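The two vindexes play different roles: the hash vindex decides which shard owns a row, while the lookup vindex lets queries on email find that shard without scattering. A miniature model (illustrative names and hash, not Vitess internals):

```python
import hashlib

# Miniature model of the VSchema above (illustrative, not Vitess code):
# - "hash": customers.id -> keyspace id (decides which shard owns the row)
# - "customer_lookup": email -> keyspace id via a lookup table, so that
#   WHERE email = ? can target one shard instead of scattering to all.

def hash_vindex(customer_id: int) -> int:
    digest = hashlib.sha256(str(customer_id).encode()).digest()
    return int.from_bytes(digest[:8], "big")

lookup_table = {}  # stand-in for the customer_lookup table

def insert_customer(customer_id: int, email: str) -> int:
    ksid = hash_vindex(customer_id)
    # Because the vindex is "owned", Vitess maintains this row itself.
    lookup_table[email] = ksid
    return ksid

def shard_for(ksid: int) -> str:
    return "-80" if ksid < 0x8000000000000000 else "80-"

ksid = insert_customer(1, "alice@example.com")
# A query on email now routes to the same single shard as a query on id:
print(shard_for(lookup_table["alice@example.com"]) == shard_for(ksid))  # True
```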
Performing a Reshard
# Split 2 shards into 4
vtctldclient Reshard --workflow commerce2x4 --target-keyspace commerce create --source-shards='-80,80-' --target-shards='-40,40-80,80-c0,c0-'
# Monitor progress
vtctldclient Reshard --workflow commerce2x4 --target-keyspace commerce show
# Switch reads then writes
vtctldclient Reshard --workflow commerce2x4 --target-keyspace commerce switchtraffic --tablet-types=rdonly,replica
vtctldclient Reshard --workflow commerce2x4 --target-keyspace commerce switchtraffic --tablet-types=primary
# Complete the reshard
vtctldclient Reshard --workflow commerce2x4 --target-keyspace commerce complete
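The shard names in these commands are hex prefixes of the keyspace-id range. A quick sketch verifies that the four target ranges exactly re-partition the two source ranges, which is why each source shard's rows flow only into its own two children:

```python
# Shard names are hex prefixes of the keyspace-id range (half-open intervals).
# Check that the 4 target shards exactly re-partition the 2 source shards.

FULL = (0x00, 0x100)  # range of the first keyspace-id byte

def parse(shard: str):
    lo, hi = shard.split("-")
    return (int(lo, 16) if lo else FULL[0], int(hi, 16) if hi else FULL[1])

sources = ["-80", "80-"]
targets = ["-40", "40-80", "80-c0", "c0-"]

def covers_keyspace(shards):
    ranges = sorted(parse(s) for s in shards)
    # Contiguous, and covering the whole keyspace-id space?
    ok = ranges[0][0] == FULL[0] and ranges[-1][1] == FULL[1]
    for (_, hi), (lo, _) in zip(ranges, ranges[1:]):
        ok = ok and hi == lo
    return ok

print(covers_keyspace(sources), covers_keyspace(targets))  # True True
# Each target range nests inside exactly one source range:
for t in targets:
    lo, hi = parse(t)
    parent = next(s for s in sources if parse(s)[0] <= lo and hi <= parse(s)[1])
    print(t, "->", parent)
```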
Online Schema Changes
# Vitess runs schema changes as Online DDL; the "vitess" strategy migrates
# via VReplication (gh-ost and pt-osc were older, now-deprecated strategies)
vtctldclient ApplySchema --sql="ALTER TABLE orders ADD COLUMN status VARCHAR(50) DEFAULT 'pending'" --ddl-strategy="vitess" commerce
# Check migration status
vtctldclient OnlineDDL show commerce all
Connecting Applications
# Applications connect to VTGate — it looks like a normal MySQL server
# PHP
$pdo = new PDO('mysql:host=vtgate-host;port=15306;dbname=commerce', 'app_user', 'password');
# Node.js (requires the mysql2 package)
const mysql = require('mysql2');
const connection = mysql.createConnection({
  host: 'vtgate-host',
  port: 15306,
  database: 'commerce',
  user: 'app_user',
  password: 'password'
});
Monitoring Vitess
# VTGate exposes Prometheus metrics at /metrics (and expvar JSON at /debug/vars)
# Key metrics to monitor:
# - vtgate_queries_processed_total — query throughput
# - vtgate_error_counts — error rates
# - vttablet_query_counts — per-shard query distribution
# - vttablet_replication_lag_seconds — replica lag
# VTCtld web UI provides cluster overview
# Access at http://vtctld-host:15000
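The /debug/vars endpoint serves a flat expvar-style JSON object of counters, so turning counters into rates is a matter of diffing two scrapes. A minimal offline sketch with fabricated sample payloads (counter names here are placeholders; use the metric names listed above for your Vitess version):

```python
import json

# Sketch: compute per-second rates from two scrapes of an expvar-style
# counter endpoint. The payloads below are fabricated for illustration.

scrape_1 = json.loads('{"QueriesProcessed": 1000, "Errors": 3}')
scrape_2 = json.loads('{"QueriesProcessed": 1600, "Errors": 4}')

interval_seconds = 60  # time between the two scrapes
for name in scrape_1:
    rate = (scrape_2[name] - scrape_1[name]) / interval_seconds
    print(f"{name}: {rate:.2f}/s")
```

In production you would let Prometheus scrape /metrics and compute these rates with `rate()` instead.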
Production Best Practices
- Start with 2 shards and plan for growth — Vitess makes adding shards straightforward
- Choose shard keys that distribute data evenly and align with your query patterns
- Deploy multiple VTGate instances behind a load balancer for high availability
- Use VTOrc for automated primary failover within each shard
- Test resharding in staging before production — it is a complex operation even when automated
- Monitor tablet health and replication lag across all shards
- Use Online DDL for all schema changes to avoid table locks