Thanos: Long-Term Prometheus Storage and HA

By Admin · Mar 15, 2026 · Updated Apr 23, 2026

Thanos extends Prometheus with long-term storage limited only by the capacity of the backing object store, a global query view across multiple Prometheus instances, and high availability through query-time deduplication of replica data. It keeps metric blocks in object storage (S3, GCS, Azure Blob), which makes multi-year retention cost-effective. This guide covers deploying Thanos for a production Prometheus infrastructure.

Thanos Architecture

  • Sidecar: runs alongside Prometheus, uploads completed TSDB blocks to object storage, and serves recent data to the Querier
  • Store Gateway: serves historical data from object storage
  • Querier: unified PromQL query interface across all data sources
  • Compactor: compacts and downsamples blocks in object storage and applies retention
  • Ruler: evaluates recording and alerting rules across the global view

Sidecar Deployment

# Run alongside each Prometheus instance
services:
  prometheus:
    image: prom/prometheus:v2.50.0
    volumes:
      - prometheus_data:/prometheus
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--storage.tsdb.min-block-duration=2h"
      - "--storage.tsdb.max-block-duration=2h"  # min = max disables local compaction, which the sidecar needs before uploading blocks

  thanos-sidecar:
    image: thanosio/thanos:v0.34.1
    command:
      - "sidecar"
      - "--tsdb.path=/prometheus"
      - "--prometheus.url=http://prometheus:9090"
      - "--objstore.config-file=/etc/thanos/objstore.yml"
      - "--grpc-address=0.0.0.0:10901"
    volumes:
      - prometheus_data:/prometheus
      - ./objstore.yml:/etc/thanos/objstore.yml
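
The sidecar expects the Prometheus instance it is attached to to define external labels, which Thanos uses to tell data from different instances apart. A minimal prometheus.yml sketch for the compose file above could look like the following; the cluster label, its value, the scrape interval, and the self-scrape job are illustrative assumptions rather than part of the original setup.

```yaml
# prometheus.yml (sketch; label names and values are assumptions)
global:
  scrape_interval: 30s          # assumed interval, adjust to your environment
  external_labels:
    cluster: demo               # hypothetical label identifying this environment
    replica: prometheus-1       # must differ between HA replicas

scrape_configs:
  - job_name: prometheus        # self-scrape, only so the file is complete
    static_configs:
      - targets: ["localhost:9090"]
```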

Object Storage Configuration

# objstore.yml: use exactly ONE of the variants below (each is a complete file)

# AWS S3 or another S3-compatible service
type: S3
config:
  bucket: thanos-metrics
  endpoint: s3.amazonaws.com
  region: us-east-1
  access_key: YOUR_ACCESS_KEY
  secret_key: YOUR_SECRET_KEY

# Alternative: MinIO (self-hosted, S3-compatible)
type: S3
config:
  bucket: thanos-metrics
  endpoint: minio:9000
  access_key: minio_access_key
  secret_key: minio_secret_key
  insecure: true   # plain HTTP; only use on a trusted internal network

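# Alternative: Google Cloud Storage (credentials come from GOOGLE_APPLICATION_CREDENTIALS unless service_account is set)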
type: GCS
config:
  bucket: thanos-metrics
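
The introduction also mentions Azure Blob, which has no variant above. A sketch of what that objstore.yml might look like is shown here; the field names reflect the Thanos storage documentation as best understood and should be checked against the version you deploy, and the account, key, and container values are placeholders.

```yaml
# Alternative: Azure Blob Storage (sketch; verify field names for your Thanos version)
type: AZURE
config:
  storage_account: YOUR_STORAGE_ACCOUNT        # placeholder
  storage_account_key: YOUR_STORAGE_ACCOUNT_KEY  # placeholder
  container: thanos-metrics                    # assumed container name, mirrors the bucket name above
```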

Store Gateway

  thanos-store:
    image: thanosio/thanos:v0.34.1
    command:
      - "store"
      - "--data-dir=/data/store"
      - "--objstore.config-file=/etc/thanos/objstore.yml"
      - "--grpc-address=0.0.0.0:10901"
    volumes:
      - store_data:/data/store
      - ./objstore.yml:/etc/thanos/objstore.yml

Querier (Global View)

  thanos-query:
    image: thanosio/thanos:v0.34.1
    command:
      - "query"
      - "--endpoint=thanos-sidecar-1:10901"   # Live data from Prometheus 1
      - "--endpoint=thanos-sidecar-2:10901"   # Live data from Prometheus 2
      - "--endpoint=thanos-store:10901"       # Historical data from object storage
      - "--query.auto-downsampling"
    ports:
      - "10902:10902"    # Query UI and API

Compactor

  thanos-compact:
    image: thanosio/thanos:v0.34.1
    command:
      - "compact"
      - "--data-dir=/data/compact"
      - "--objstore.config-file=/etc/thanos/objstore.yml"
      - "--retention.resolution-raw=30d"     # Keep raw data for 30 days
      - "--retention.resolution-5m=180d"     # Keep 5m downsampled for 6 months
      - "--retention.resolution-1h=365d"     # Keep 1h downsampled for 1 year
      - "--wait"
    volumes:
      - compact_data:/data/compact
      - ./objstore.yml:/etc/thanos/objstore.yml
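
The service snippets above mount the named volumes prometheus_data, store_data, and compact_data. Docker Compose expects named volumes to be declared at the top level of the file, so a combined docker-compose.yml would likely need a block along these lines (a sketch, assuming all services live in one compose file and use the default volume driver):

```yaml
# Top level of the same docker-compose.yml, as a sibling of `services:`
volumes:
  prometheus_data:   # shared by prometheus and thanos-sidecar
  store_data:        # local cache for the Store Gateway
  compact_data:      # scratch space for the Compactor
```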

Grafana Configuration

# Add Thanos Querier as a Prometheus data source in Grafana
# URL: http://thanos-query:10902
# This provides access to ALL metrics from ALL Prometheus instances
# and historical data from object storage
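
If Grafana data sources are managed through provisioning files rather than the UI, the same setting can be expressed as below. This is a sketch: the data source name and the choice to make it the default are assumptions, and the file would typically be placed in Grafana's provisioning/datasources directory.

```yaml
# Grafana data source provisioning file (sketch; name and isDefault are assumptions)
apiVersion: 1

datasources:
  - name: Thanos
    type: prometheus               # Thanos Querier speaks the Prometheus HTTP API
    access: proxy                  # Grafana backend issues the queries
    url: http://thanos-query:10902
    isDefault: true
```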

High Availability

# Run two Prometheus instances scraping the same targets
# Thanos Querier deduplicates data automatically

# prometheus-1.yml
global:
  external_labels:
    replica: prometheus-1

# prometheus-2.yml
global:
  external_labels:
    replica: prometheus-2

# Thanos Querier merges series that differ only in the replica label
thanos query --endpoint=thanos-sidecar-1:10901 --endpoint=thanos-sidecar-2:10901 --query.replica-label=replica
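
In the Compose layout used elsewhere in this guide, the same deduplication setting would be added to the Querier service. The following is a sketch that assumes the sidecars are reachable as thanos-sidecar-1 and thanos-sidecar-2 and that both Prometheus instances set the replica external label shown above.

```yaml
  thanos-query:
    image: thanosio/thanos:v0.34.1
    command:
      - "query"
      - "--endpoint=thanos-sidecar-1:10901"   # newer name for the --store flag
      - "--endpoint=thanos-sidecar-2:10901"
      - "--endpoint=thanos-store:10901"
      - "--query.replica-label=replica"       # label ignored when merging HA replicas
      - "--query.auto-downsampling"
    ports:
      - "10902:10902"
```

With this in place, a series scraped by both replicas is returned once, without the replica label, instead of twice.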

Best Practices

  • Set Prometheus min-block-duration and max-block-duration to 2h for Thanos compatibility
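  • Use the Compactor to downsample old data: 5-minute and 1-hour resolutions make long-range queries faster. Downsampled blocks are stored in addition to the raw blocks, so storage savings come from expiring raw data earlier through the retention flags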
  • Run the Compactor as a single instance per bucket (it does not coordinate with other compactors, so concurrent instances on the same bucket can produce overlapping blocks)
  • Use object storage (S3-compatible, GCS, or Azure Blob) for cost-effective long-term retention
  • Configure retention based on resolution: raw (30d), 5m (6mo), 1h (1y+)
  • Use --query.auto-downsampling in Querier for automatic resolution selection based on time range
