Thanos extends Prometheus with unlimited long-term storage, global query view across multiple Prometheus instances, and high availability through data deduplication. It uses object storage (S3, GCS, Azure Blob) for cost-effective multi-year metrics retention. This guide covers deploying Thanos for production Prometheus infrastructure.
Thanos Architecture
- Sidecar — runs alongside Prometheus, uploads data to object storage
- Store Gateway — serves historical data from object storage
- Querier — unified PromQL query interface across all data sources
- Compactor — downsamples and compacts old data for efficiency
- Ruler — evaluates recording and alerting rules across the global view
Sidecar Deployment
# Run alongside each Prometheus instance
services:
  prometheus:
    image: prom/prometheus:v2.50.0
    volumes:
      - prometheus_data:/prometheus
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--storage.tsdb.min-block-duration=2h"
      - "--storage.tsdb.max-block-duration=2h" # Required for Thanos
  thanos-sidecar:
    image: thanosio/thanos:v0.34.1
    command:
      - "sidecar"
      - "--tsdb.path=/prometheus"
      - "--prometheus.url=http://prometheus:9090"
      - "--objstore.config-file=/etc/thanos/objstore.yml"
      - "--grpc-address=0.0.0.0:10901"
    volumes:
      - prometheus_data:/prometheus
      - ./objstore.yml:/etc/thanos/objstore.yml
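
The sidecar identifies each Prometheus instance by its external labels, and Thanos expects them to be set and unique per instance. A minimal sketch of the mounted prometheus.yml could look like the following; the `cluster` label and its value, the scrape interval, and the self-scrape job are illustrative assumptions, not part of the setup above.

```yaml
# prometheus.yml (sketch; label values and scrape job are illustrative)
global:
  scrape_interval: 15s          # assumed value
  external_labels:
    cluster: prod-eu1           # hypothetical label identifying this deployment
    replica: prometheus-1       # must differ between HA replicas
scrape_configs:
  - job_name: prometheus        # example job: Prometheus scraping itself
    static_configs:
      - targets: ["localhost:9090"]
```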
Object Storage Configuration
# objstore.yml (S3-compatible)
type: S3
config:
  bucket: thanos-metrics
  endpoint: s3.amazonaws.com
  region: us-east-1
  access_key: YOUR_ACCESS_KEY
  secret_key: YOUR_SECRET_KEY

# MinIO (self-hosted S3)
type: S3
config:
  bucket: thanos-metrics
  endpoint: minio:9000
  access_key: minio_access_key
  secret_key: minio_secret_key
  insecure: true

# GCS
type: GCS
config:
  bucket: thanos-metrics
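
Azure Blob, the third backend named in the introduction, follows the same pattern. The sketch below assumes account-key authentication; the account name, key, and container name are placeholders.

```yaml
# objstore.yml for Azure Blob (sketch; all values are placeholders)
type: AZURE
config:
  storage_account: YOUR_STORAGE_ACCOUNT
  storage_account_key: YOUR_STORAGE_ACCOUNT_KEY
  container: thanos-metrics
```

Whichever backend is used, each objstore.yml should contain exactly one `type`/`config` pair; the variants shown here are alternatives, not sections of a single file.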
Store Gateway
thanos-store:
  image: thanosio/thanos:v0.34.1
  command:
    - "store"
    - "--data-dir=/data/store"
    - "--objstore.config-file=/etc/thanos/objstore.yml"
    - "--grpc-address=0.0.0.0:10901"
  volumes:
    - store_data:/data/store
    - ./objstore.yml:/etc/thanos/objstore.yml
Querier (Global View)
thanos-query:
  image: thanosio/thanos:v0.34.1
  command:
    - "query"
    - "--store=thanos-sidecar-1:10901" # Live data from Prometheus 1
    - "--store=thanos-sidecar-2:10901" # Live data from Prometheus 2
    - "--store=thanos-store:10901" # Historical data from object storage
    - "--query.auto-downsampling"
  ports:
    - "10902:10902" # Query UI and API
Compactor
thanos-compact:
  image: thanosio/thanos:v0.34.1
  command:
    - "compact"
    - "--data-dir=/data/compact"
    - "--objstore.config-file=/etc/thanos/objstore.yml"
    - "--retention.resolution-raw=30d" # Keep raw data for 30 days
    - "--retention.resolution-5m=180d" # Keep 5m downsampled for 6 months
    - "--retention.resolution-1h=365d" # Keep 1h downsampled for 1 year
    - "--wait"
  volumes:
    - compact_data:/data/compact
    - ./objstore.yml:/etc/thanos/objstore.yml
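
The service definitions in this guide reference three named volumes (`prometheus_data`, `store_data`, `compact_data`). Assuming all services live in one Compose file, those volumes also need a top-level declaration, roughly like this:

```yaml
# Top-level section of the Compose file (sketch)
volumes:
  prometheus_data: {}   # shared by prometheus and thanos-sidecar
  store_data: {}        # Store Gateway data directory
  compact_data: {}      # Compactor working directory
```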
Grafana Configuration
# Add Thanos Querier as a Prometheus data source in Grafana
# URL: http://thanos-query:10902
# This provides access to ALL metrics from ALL Prometheus instances
# and historical data from object storage
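
If Grafana is provisioned from files rather than configured through the UI, the same data source can be declared in a provisioning file. This is a sketch: the file path and the data source name are assumptions, and the URL is the Querier address used above.

```yaml
# grafana/provisioning/datasources/thanos.yml (hypothetical path)
apiVersion: 1
datasources:
  - name: Thanos                    # display name, chosen for illustration
    type: prometheus                # the Querier exposes the Prometheus HTTP API
    access: proxy
    url: http://thanos-query:10902
    isDefault: true
```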
High Availability
# Run two Prometheus instances scraping the same targets
# Thanos Querier deduplicates data automatically
# prometheus-1.yml
global:
  external_labels:
    replica: prometheus-1
# prometheus-2.yml
global:
  external_labels:
    replica: prometheus-2
# Thanos Querier deduplicates based on the replica label
thanos query --store=sidecar-1:10901 --store=sidecar-2:10901 --query.replica-label=replica
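
In the Compose style used for the other components, the same deduplication setup could be expressed as a service definition. This sketch reuses only flags that already appear in this guide; the hostnames `sidecar-1`, `sidecar-2`, and `thanos-store` are assumed to match your service names.

```yaml
thanos-query:
  image: thanosio/thanos:v0.34.1
  command:
    - "query"
    - "--store=sidecar-1:10901"          # sidecar of prometheus-1
    - "--store=sidecar-2:10901"          # sidecar of prometheus-2
    - "--store=thanos-store:10901"       # historical data from object storage
    - "--query.replica-label=replica"    # label that distinguishes HA replicas
    - "--query.auto-downsampling"
  ports:
    - "10902:10902"
```

With the replica label set, series that differ only in `replica` are merged into one at query time.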
Best Practices
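- Set Prometheus --storage.tsdb.min-block-duration and --storage.tsdb.max-block-duration to 2h for Thanos compatibility
- Use the Compactor to downsample old data: 5-minute and 1-hour resolutions make long-range queries much faster. Downsampled blocks are stored in addition to the raw blocks, so storage savings come from shorter raw retention, not from downsampling itself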
- Run the Compactor as a single instance per bucket (it is not safe to run concurrently against the same bucket and is not designed for HA)
- Use S3-compatible object storage for cost-effective long-term retention
- Configure retention based on resolution: raw (30d), 5m (6mo), 1h (1y+)
- Use --query.auto-downsampling in the Querier to select the resolution automatically from the query step, which grows with the queried time range