
Grafana Loki: Log Queries and Alerting

By Admin · Mar 15, 2026 · Updated Apr 24, 2026

Grafana Loki is a log aggregation system designed to be cost-effective and easy to operate. Unlike Elasticsearch, Loki indexes only metadata (labels), not the full text of log messages, making it significantly cheaper to run while still supporting powerful query capabilities through LogQL. This guide covers LogQL query patterns and setting up log-based alerting.

Installation

# Docker Compose
services:
  loki:
    image: grafana/loki:2.9.4
    ports:
      - "3100:3100"
    volumes:
      - loki_data:/loki
    command: -config.file=/etc/loki/local-config.yaml

  promtail:
    image: grafana/promtail:2.9.4
    volumes:
      - /var/log:/var/log:ro
      - ./promtail-config.yml:/etc/promtail/config.yml
    command: -config.file=/etc/promtail/config.yml

volumes:
  loki_data:
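
Once the stack is up, you can confirm Loki is accepting traffic with its readiness endpoint (these paths are part of Loki's HTTP API; the port assumes the mapping above):

# Returns "ready" once startup completes
curl http://localhost:3100/ready

# List the labels Loki has ingested so far
curl http://localhost:3100/loki/api/v1/labels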

LogQL Query Basics

# Stream selectors (filter by labels)
{job="nginx"}
{job="myapp", level="error"}
{job=~"api|web"}                     # Regex match
{job="myapp"} |= "timeout"          # Line contains "timeout"
{job="myapp"} != "healthcheck"       # Line does NOT contain
{job="myapp"} |~ "error|fatal"       # Regex line filter
{job="myapp"} !~ "debug|trace"       # Negative regex filter

Parsing and Filtering

# JSON log parsing
{job="myapp"} | json | level="error"
{job="myapp"} | json | status >= 500
{job="myapp"} | json | duration > 1000 | line_format "{{.method}} {{.path}} took {{.duration}}ms"

# Logfmt parsing
{job="myapp"} | logfmt | level="error" | duration > 5s

# Regex extraction
{job="nginx"} | regexp "(?P\d+\.\d+\.\d+\.\d+).*(?P\d{3})" | status="500"

# Pattern matching
{job="nginx"} | pattern " - - [] \"  \"  "
    | status >= 400
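
When a parser fails on a line (for example, plain-text output mixed into a JSON stream), LogQL attaches a __error__ label rather than dropping the line. Filtering on that label keeps malformed entries out of your results:

# Keep only lines that parsed cleanly
{job="myapp"} | json | __error__="" | level="error"

# Or, inspect the lines that failed to parse
{job="myapp"} | json | __error__!=""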

Metric Queries

# Count errors per minute
count_over_time({job="myapp"} |= "error" [1m])

# Error rate
sum(rate({job="myapp"} |= "error" [5m])) by (service)

# Request rate from access logs
sum(rate({job="nginx"} [5m]))

# P99 latency from parsed logs
quantile_over_time(0.99, {job="myapp"} | json | unwrap duration [5m]) by (endpoint)

# Top 10 error messages
topk(10, sum(count_over_time({job="myapp"} | json | level="error" [1h])) by (message))
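
Metric queries can also be combined with arithmetic operators, which lets you alert on an error ratio rather than an absolute count. A sketch against the same myapp stream:

# Fraction of log lines containing "error" over the last 5 minutes
sum(rate({job="myapp"} |= "error" [5m]))
  /
sum(rate({job="myapp"} [5m]))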

Alerting with Loki

# Loki ruler configuration for alerting
# /etc/loki/rules/alerts.yml
groups:
  - name: application-alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate({job="myapp"} |= "error" [5m])) > 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"
          description: "More than 10 errors per second for 5 minutes"

      - alert: SlowRequests
        expr: |
          quantile_over_time(0.99, {job="myapp"} | json | unwrap duration [5m]) > 5000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "P99 latency exceeds 5 seconds"

      - alert: OutOfMemoryErrors
        expr: |
          count_over_time({job="myapp"} |= "OutOfMemoryError" [15m]) > 0
        labels:
          severity: critical
        annotations:
          summary: "OOM error detected in application logs"

Grafana Dashboard Panels

# Log volume over time
sum(count_over_time({job="myapp"}[$__interval])) by (level)
# Visualization: time series, stacked bars

# Error log table
{job="myapp"} | json | level="error"
# Visualization: logs panel

# Request latency heatmap
{job="myapp"} | json | unwrap duration
# Visualization: heatmap
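
To render these panels, Grafana needs Loki registered as a datasource. With Grafana's file-based provisioning, a minimal sketch (the URL assumes the Docker Compose service name from the installation section):

# grafana/provisioning/datasources/loki.yml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100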

Promtail Configuration

# promtail-config.yml
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets: [localhost]
        labels:
          job: syslog
          __path__: /var/log/syslog

  - job_name: nginx
    static_configs:
      - targets: [localhost]
        labels:
          job: nginx
          __path__: /var/log/nginx/access.log

  - job_name: myapp
    static_configs:
      - targets: [localhost]
        labels:
          job: myapp
          __path__: /var/log/myapp/*.log
    pipeline_stages:
      - json:
          expressions:
            level: level
      - labels:
          level:
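
Pipeline stages can also discard noise before it is shipped, which cuts ingestion and storage costs. A sketch of a drop stage appended to the myapp pipeline above (the healthcheck pattern is illustrative):

    pipeline_stages:
      # ... json and labels stages as above ...
      # Drop healthcheck chatter before it reaches Loki
      - drop:
          expression: ".*healthcheck.*"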

Best Practices

  • Design labels carefully: high-cardinality values (user_id, request_id) should not become labels; extract them at query time with log parsing instead
  • Use structured logging (JSON) in applications for easier querying with | json
  • Keep label count low (5-10 labels max per stream) for optimal Loki performance
  • Use rate() and count_over_time() for alerting rather than raw log queries
  • Set retention policies to manage storage costs (see the compactor sketch after this list)
  • Use Promtail pipeline stages to extract labels and structure data at ingestion time
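
For the retention bullet above, deletion is handled by Loki's compactor. A minimal sketch with a 31-day window, assuming filesystem storage and the 2.9.x config schema used elsewhere in this guide:

# Excerpt from the Loki config
compactor:
  working_directory: /loki/compactor
  shared_store: filesystem
  retention_enabled: true

limits_config:
  retention_period: 744h  # 31 days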
