Docs / Monitoring & Logging / Building Custom Grafana Dashboards

Building Custom Grafana Dashboards

By Admin · Mar 15, 2026 · Updated Apr 23, 2026 · 379 views · 3 min read

Grafana dashboards transform raw metrics into actionable visual insights. This guide covers creating custom dashboards from scratch, including panel types, query patterns, variables, and layout techniques for building monitoring dashboards for VPS infrastructure, applications, and business metrics.

Dashboard Structure

# A good dashboard includes:
# 1. Overview row — key metrics at a glance (stat panels, gauges)
# 2. Resource utilization — CPU, memory, disk, network (time series)
# 3. Application metrics — request rate, error rate, latency (time series)
# 4. Logs — recent errors or events (log panel)
# 5. Alerts — current alert state (alert list panel)

Essential Panel Types

Stat Panel (Single Value)

# CPU Usage percentage
Query: 100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
Thresholds: 0=green, 70=yellow, 90=red
Unit: percent (0-100)

Time Series (Graphs)

# Memory usage over time
Query: node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes
Legend: {{instance}} - Used Memory
Unit: bytes (IEC)

# Network traffic
Query A: rate(node_network_receive_bytes_total{device="eth0"}[5m]) * 8
Query B: rate(node_network_transmit_bytes_total{device="eth0"}[5m]) * 8
Legend: Inbound / Outbound
Unit: bits/sec

Gauge Panel

# Disk usage percentage
Query: (node_filesystem_size_bytes{mountpoint="/"} - node_filesystem_avail_bytes{mountpoint="/"}) / node_filesystem_size_bytes{mountpoint="/"} * 100
Min: 0, Max: 100
Thresholds: 0=green, 75=yellow, 90=red

Table Panel

# Top 10 containers by CPU
Query: topk(10, sum(rate(container_cpu_usage_seconds_total[5m])) by (name))
Format: Table
Transform: Organize fields, rename columns

Template Variables

# Create dynamic dashboards with variables

# Instance selector
Name: instance
Type: Query
Query: label_values(node_cpu_seconds_total, instance)

# Use in queries:
100 - (avg(rate(node_cpu_seconds_total{instance="$instance", mode="idle"}[5m])) * 100)

# Job selector
Name: job
Type: Query
Query: label_values(up, job)

# Interval variable (for rate calculations)
Name: interval
Type: Interval
Values: 1m, 5m, 15m, 1h

Row Organization

# Organize panels into collapsible rows:

Row 1: "Overview" (collapsed: false)
  - Stat: Uptime
  - Stat: CPU Usage
  - Stat: Memory Usage
  - Stat: Disk Usage
  - Stat: Active Connections

Row 2: "CPU & Memory" (collapsed: true)
  - Time Series: CPU by mode (user, system, iowait)
  - Time Series: Memory breakdown (used, cached, buffered)
  - Time Series: Load average (1m, 5m, 15m)

Row 3: "Disk & Network" (collapsed: true)
  - Time Series: Disk I/O (reads/writes per second)
  - Time Series: Disk space over time
  - Time Series: Network traffic in/out

Row 4: "Application" (collapsed: true)
  - Time Series: HTTP request rate
  - Time Series: Error rate (4xx, 5xx)
  - Heatmap: Response time distribution

Annotations

# Mark deployments, incidents, and changes on all graphs
# Dashboard Settings → Annotations → Add

# Deployment markers from Grafana annotations API:
curl -X POST http://grafana:3000/api/annotations \
    -H "Authorization: Bearer your-api-key" \
    -H "Content-Type: application/json" \
    -d '{"text": "Deployed v2.1.0", "tags": ["deploy"]}'

Dashboard as Code (JSON/Terraform)

# Export dashboard as JSON
# Dashboard → Share → Export → Save to file

# Import via API
curl -X POST http://grafana:3000/api/dashboards/db \
    -H "Authorization: Bearer api-key" \
    -H "Content-Type: application/json" \
    -d @dashboard.json

# Provisioning from files (auto-load on startup)
# /etc/grafana/provisioning/dashboards/default.yaml
apiVersion: 1
providers:
  - name: default
    folder: ""
    type: file
    options:
      path: /var/lib/grafana/dashboards
      foldersFromFilesStructure: true

Best Practices

  • Start with overview metrics at the top — users should see system health at a glance
  • Use consistent color schemes: green=healthy, yellow=warning, red=critical
  • Set meaningful units on all panels (bytes, percent, requests/sec)
  • Use template variables for multi-server dashboards
  • Keep dashboards focused — one per service or system, not everything on one page
  • Version control dashboard JSON for change tracking and rollback
  • Add annotations for deployments to correlate changes with metric shifts

Was this article helpful?