Architecture
Your Server → Node Exporter → Prometheus → Grafana
              (metrics)       (storage)    (dashboards)
Install Node Exporter
Node Exporter exposes system metrics (CPU, memory, disk, network):
# Download
wget https://github.com/prometheus/node_exporter/releases/download/v1.7.0/node_exporter-1.7.0.linux-amd64.tar.gz
tar xzf node_exporter-*.tar.gz
sudo mv node_exporter-*/node_exporter /usr/local/bin/
# Create systemd service
sudo tee /etc/systemd/system/node_exporter.service << 'EOF'
[Unit]
Description=Node Exporter
After=network.target
[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
# Create an unprivileged system user, then load and start the unit
sudo useradd -rs /bin/false node_exporter
sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter
Metrics are now available at http://localhost:9100/metrics.
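A quick way to confirm the exporter is serving data is to look for a known metric in its output. The sketch below assumes `curl` is installed; `has_metric` is a hypothetical helper name used for illustration, not part of Node Exporter.

```shell
# has_metric NAME: read Prometheus text-format metrics on stdin and
# succeed if at least one sample named NAME is present.
# (has_metric is a hypothetical helper, not part of Node Exporter.)
has_metric() {
  grep -Eq "^$1(\{| )"
}

# Usage (assumes curl is installed and the exporter runs locally):
#   curl -s http://localhost:9100/metrics | has_metric node_load1 && echo "node_exporter OK"
```

Matching on the metric name followed by `{` or a space skips the `# HELP` and `# TYPE` comment lines and avoids false hits on longer names such as `node_load15`.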
Install Prometheus
# Download
wget https://github.com/prometheus/prometheus/releases/download/v2.50.0/prometheus-2.50.0.linux-amd64.tar.gz
tar xzf prometheus-*.tar.gz
sudo mv prometheus-*/prometheus /usr/local/bin/
sudo mv prometheus-*/promtool /usr/local/bin/
sudo mkdir -p /etc/prometheus /var/lib/prometheus
Configuration
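# /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
# Load the alerting rules defined under "Alerting Rules"
rule_files:
  - /etc/prometheus/rules/*.yml
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
        labels:
          instance: 'web-server-1'
  # Assumes an NGINX exporter is already listening on port 9113
  - job_name: 'nginx'
    static_configs:
      - targets: ['localhost:9113']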
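The steps so far install the Prometheus binaries and write the configuration, but nothing starts the server yet. One possible way to run it mirrors the Node Exporter unit. This is a sketch under stated assumptions: the dedicated `prometheus` user and the helper name `render_prometheus_unit` are not from the original guide, and the paths are the directories created earlier.

```shell
# render_prometheus_unit: print a systemd unit for Prometheus to stdout.
# Assumptions (not from the original guide): a dedicated "prometheus" user,
# config in /etc/prometheus, and data in /var/lib/prometheus.
render_prometheus_unit() {
  cat << 'EOF'
[Unit]
Description=Prometheus
After=network.target

[Service]
User=prometheus
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Usage:
#   sudo useradd -rs /bin/false prometheus
#   sudo chown -R prometheus /var/lib/prometheus
#   promtool check config /etc/prometheus/prometheus.yml
#   render_prometheus_unit | sudo tee /etc/systemd/system/prometheus.service
#   sudo systemctl daemon-reload && sudo systemctl enable --now prometheus
```

Running `promtool check config` before starting the service catches YAML indentation mistakes early. Once the service is up, the Prometheus web UI listens on port 9090 by default.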
Install Grafana
sudo apt install -y apt-transport-https software-properties-common
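# apt-key is deprecated; store the signing key in a dedicated keyring instead
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list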
sudo apt update && sudo apt install -y grafana
sudo systemctl enable --now grafana-server
Access Grafana at http://your-server:3000 (default credentials: admin/admin; Grafana asks for a new password on first login).
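Grafana needs a Prometheus data source before any panel can query metrics. It can be added in the UI, or through a provisioning file as sketched here. The sketch assumes Prometheus listens on its default address http://localhost:9090 and that Grafana uses the package's default provisioning directory; `render_grafana_datasource` is a hypothetical helper name.

```shell
# render_grafana_datasource: print a Grafana data source provisioning file.
# Assumption: Prometheus is reachable at its default address, localhost:9090.
render_grafana_datasource() {
  cat << 'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF
}

# Usage (path assumes the default layout of the apt package):
#   render_grafana_datasource | sudo tee /etc/grafana/provisioning/datasources/prometheus.yml
#   sudo systemctl restart grafana-server
```

Provisioning keeps the data source in a file that can be version-controlled, which is convenient when the same setup is repeated on several servers.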
Essential Dashboard Panels
| Metric | PromQL |
|---|---|
| CPU usage (%) | 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) |
| Memory usage (%) | (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 |
| Disk usage of / (%) | 100 - (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} * 100) |
| Network in (bits/s) | rate(node_network_receive_bytes_total{device="eth0"}[5m]) * 8 |
| Network out (bits/s) | rate(node_network_transmit_bytes_total{device="eth0"}[5m]) * 8 |
| Load average (1 min) | node_load1 |
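The memory expression is plain arithmetic: the share of total memory that is not available. Node Exporter reads both values from /proc/meminfo, so the same percentage can be computed directly on the host to sanity-check the panel. The sketch below is illustrative only; `mem_used_pct` is a hypothetical helper name.

```shell
# mem_used_pct: read /proc/meminfo-style text on stdin and print
# (1 - MemAvailable / MemTotal) * 100 with one decimal place.
# The ratio is unit-independent, so the kB values can be used as-is.
mem_used_pct() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END { if (total > 0) printf "%.1f\n", (1 - avail / total) * 100 }
  '
}

# Usage:
#   mem_used_pct < /proc/meminfo
```

For example, with MemTotal at 8000000 kB and MemAvailable at 2000000 kB, the function prints 75.0, which is what the Grafana panel should show for the same moment.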
Alerting Rules
# /etc/prometheus/rules/alerts.yml
groups:
  - name: server_alerts
    rules:
      - alert: HighCPU
        # "avg by (instance)" keeps the instance label used in the summary
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
      - alert: DiskSpaceLow
        expr: node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} * 100 < 15
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Less than 15% disk space left on {{ $labels.instance }}"
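Rule files are easy to break with a stray indent, so it helps to validate them with `promtool` (installed alongside Prometheus above) before reloading. A minimal sketch, where `check_rules` is a hypothetical wrapper name:

```shell
# check_rules DIR: run "promtool check rules" on every .yml file in DIR
# and return non-zero at the first file that fails validation.
# (check_rules is a hypothetical wrapper around the real promtool command.)
check_rules() {
  for f in "$1"/*.yml; do
    [ -e "$f" ] || continue   # skip when the glob matches nothing
    promtool check rules "$f" || return 1
  done
}

# Usage (SIGHUP asks a running Prometheus to reload its config and rules):
#   check_rules /etc/prometheus/rules && sudo pkill -HUP -x prometheus
```

Prometheus only loads rule files that are listed under `rule_files` in prometheus.yml, so the directory checked here should match that setting.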
Tip: Import community dashboards from grafana.com/dashboards. Dashboard ID 1860 ("Node Exporter Full") is an excellent starting point.