How to Configure Alerting with Grafana
Grafana's unified alerting system lets you define alert rules against any data source, manage notification routing, and track alert history. Setting up effective alerting on your Breeze server ensures you catch issues before they affect users.
Prerequisites
- Grafana 9.0+ installed on your Breeze instance
- At least one data source configured (Prometheus, Loki, MySQL, etc.)
- A notification channel configured (email, Slack, webhook, etc.)
Understanding Grafana Alerting Components
- Alert Rules — conditions that define when an alert should fire
- Contact Points — where notifications are sent (email, Slack, PagerDuty, etc.)
- Notification Policies — routing rules that determine which contact points receive which alerts
- Silences — temporary muting of specific alerts during maintenance
Step 1: Configure a Contact Point
Navigate to Alerting → Contact points → New contact point:
- Name: ops-team-email
- Type: Email
- Addresses: ops@yourdomain.com
For Slack notifications:
- Name: ops-slack
- Type: Slack
- Webhook URL: your Slack incoming webhook URL
- Channel: #alerts
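If you prefer configuration as code, the same contact points can also be defined in a provisioning file (e.g. under Grafana's provisioning/alerting/ directory) instead of the UI. The sketch below assumes Grafana 9's file-provisioning format for alerting; the uid values and the webhook URL are placeholders you would replace with your own:

```yaml
# contact-points.yaml — a minimal provisioning sketch (placeholder uids/URL)
apiVersion: 1
contactPoints:
  - orgId: 1
    name: ops-team-email
    receivers:
      - uid: ops-email-uid          # placeholder uid
        type: email
        settings:
          addresses: ops@yourdomain.com
  - orgId: 1
    name: ops-slack
    receivers:
      - uid: ops-slack-uid          # placeholder uid
        type: slack
        settings:
          url: https://hooks.slack.com/services/your-webhook-path  # placeholder
          recipient: "#alerts"
```

Grafana loads files like this at startup, which keeps contact points reproducible across Breeze instances.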
Step 2: Create an Alert Rule
Navigate to Alerting → Alert rules → New alert rule:
Example: High CPU Usage Alert
- Rule name: High CPU Usage
- Query: use the Prometheus data source
# Query A: Average CPU usage over 5 minutes
100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
# Condition: Is above 85%
# Evaluate every: 1m
# For: 5m (must be above threshold for 5 minutes to fire)
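The interaction between Evaluate every and For is worth internalizing: the rule goes Pending when the condition first holds and only Firing once it has held across enough consecutive evaluations. A small simulation sketch (a hypothetical helper for illustration, not Grafana code) makes the behavior concrete:

```python
from datetime import timedelta

def alert_state(samples, threshold=85.0, every=timedelta(minutes=1),
                for_duration=timedelta(minutes=5)):
    """Walk evaluation results in order and return the final alert state.

    samples: one CPU-usage value per evaluation interval. The rule goes
    Pending when the condition first holds, and Firing only once it has
    held continuously for `for_duration`.
    """
    state = "Normal"
    above = timedelta(0)
    for value in samples:
        if value > threshold:
            above += every
            state = "Firing" if above >= for_duration else "Pending"
        else:
            above = timedelta(0)  # any dip below threshold resets the clock
            state = "Normal"
    return state

# A 3-minute spike never fires; sustained load does.
print(alert_state([90, 92, 91, 60, 55]))          # Normal
print(alert_state([90, 92, 91, 93, 95, 96]))      # Firing
```

This is why For: 5m filters out brief CPU spikes: a single evaluation above 85% only moves the rule to Pending, and the timer resets as soon as usage drops back under the threshold.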
Example: Disk Space Alert
# Query: Disk usage percentage
(1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100
# Condition: Is above 90%
# For: 10m
Example: Application Error Rate
# Query: HTTP 5xx error rate
# (sum both sides so the 5xx series and the all-status series align for division)
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) * 100
# Condition: Is above 5%
# For: 2m
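To see what the error-rate query above actually computes, here is the same arithmetic done by hand in a short illustrative sketch (the function name and sample counter values are made up): two readings of each counter taken a window apart yield per-second rates, and the ratio of those rates gives the 5xx percentage.

```python
def error_rate_pct(err_before, err_after, total_before, total_after,
                   window_s=300):
    """Mirror the PromQL: per-second rates from two counter readings
    taken window_s seconds apart, then the 5xx share as a percentage."""
    err_rate = (err_after - err_before) / window_s      # 5xx responses/sec
    total_rate = (total_after - total_before) / window_s  # all responses/sec
    return err_rate / total_rate * 100

# 60 new 5xx responses out of 1,200 total over 5 minutes:
print(error_rate_pct(100, 160, 10_000, 11_200))  # 5.0
```

Note that rates, not raw counts, are divided; because both numerator and denominator use the same window, the window length cancels and the result is a pure percentage.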
Step 3: Set Up Notification Policies
Navigate to Alerting → Notification policies:
- Set the Default contact point to your primary notification channel
- Add child policies to route specific alerts to different teams:
Root Policy → ops-team-email (default)
├── severity=critical → ops-pagerduty (immediate)
├── severity=warning → ops-slack (#alerts channel)
└── team=database → dba-team-email
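The routing tree above can likewise be provisioned from a file. This sketch assumes Grafana 9's notification-policy provisioning format and that the ops-pagerduty and dba-team-email contact points already exist:

```yaml
# notification-policies.yaml — provisioning sketch matching the tree above
apiVersion: 1
policies:
  - orgId: 1
    receiver: ops-team-email        # root/default contact point
    routes:
      - receiver: ops-pagerduty
        object_matchers:
          - [severity, =, critical]
      - receiver: ops-slack
        object_matchers:
          - [severity, =, warning]
      - receiver: dba-team-email
        object_matchers:
          - [team, =, database]
```

Routes are evaluated top to bottom; an alert that matches no child route falls through to the root receiver.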
Step 4: Add Labels and Annotations
Enhance alerts with meaningful context:
- Labels are used for routing: severity=critical, team=infrastructure
- Annotations provide context in notifications:
Summary: CPU usage on {{ $labels.instance }} is {{ $values.A }}%
Description: Server {{ $labels.instance }} has had CPU usage above 85% for over 5 minutes.
Investigate running processes and consider scaling resources.
Runbook URL: https://wiki.internal/runbooks/high-cpu
Step 5: Testing and Silencing
Test your alert pipeline:
- Use the Test button on contact points to verify notifications
- Create a test alert rule with a condition that is always true
- Check the Alert rules page for current alert states
Create silences for planned maintenance:
- Navigate to Alerting → Silences → New silence
- Match specific labels (e.g., instance=breeze-maintenance-server)
- Set a duration matching your maintenance window
Best Practices
- Use the For duration to avoid alerting on brief spikes
- Set escalating severity levels: info → warning → critical
- Include runbook URLs in annotations so responders know what to do
- Group related alerts to reduce notification noise
- Review and tune alert thresholds regularly based on your Breeze server baselines
- Create a dedicated Grafana dashboard for alert overview and history