Monitor Disk I/O with iostat and iotop

By Admin · Mar 15, 2026 · Updated Apr 24, 2026

Disk I/O bottlenecks are among the most common causes of server performance problems. iostat shows device-level statistics, iotop attributes I/O to individual processes, and sar (from the same sysstat package) adds historical data, helping you identify which devices and processes are causing I/O contention. This guide covers practical I/O monitoring and troubleshooting techniques.

iostat: Device-Level I/O Statistics

# Install
sudo apt install sysstat    # Ubuntu/Debian
sudo dnf install sysstat    # Rocky Linux

# Basic usage (device stats every 2 seconds, 5 times)
iostat -xz 2 5

# Key columns:
# r/s      — reads per second
# w/s      — writes per second
# rkB/s    — kilobytes read per second
# wkB/s    — kilobytes written per second
# await    — average time (ms) an I/O spends queued plus being serviced — MOST IMPORTANT
# %util    — device utilization percentage

Reading iostat Output

$ iostat -xz 2

Device   r/s    w/s   rkB/s   wkB/s  rrqm/s wrqm/s await r_await w_await %util
sda      50.0  200.0   400.0  3200.0   0.0   150.0   4.0    2.0     4.5   45.0
nvme0n1 150.0  500.0  2400.0  8000.0   0.0    50.0   0.5    0.3     0.6   15.0

# Interpretation:
# sda: HDD with 45% utilization, 4ms average wait — reasonable
# nvme0n1: NVMe with 15% utilization, 0.5ms wait — healthy
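
# Derived figures are often telling: average request size = throughput / IOPS.
# From the sample above:
#   sda reads:  400 rkB/s ÷ 50 r/s   = 8 KB per read
#   sda writes: 3200 wkB/s ÷ 200 w/s = 16 KB per write
# Small random requests stress an HDD far more than large sequential ones.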

# WARNING signs:
# await > 20ms on SSD — possible I/O bottleneck
# await > 100ms on HDD — severe I/O bottleneck
# %util > 80% — device is heavily loaded
# %util = 100% — device is saturated
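
A small watchdog can turn these thresholds into alerts. The sketch below assumes GNU awk (mawk lacks strftime) and the column layout shown above; column positions vary between sysstat versions, so check them against your own header line first.

# Flag any device whose read or write latency crosses a threshold (ms)
iostat -xz 2 | awk -v thresh_ms=20 '
    $1 ~ /^(sd|vd|nvme|dm-)/ {
        # In the layout above: $1=Device, $9=r_await, $10=w_await
        if ($9+0 > thresh_ms || $10+0 > thresh_ms)
            print strftime("%H:%M:%S"), $1, "r_await=" $9, "w_await=" $10
    }'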

iotop: Process-Level I/O Monitoring

# Install
sudo apt install iotop-c    # Ubuntu/Debian (iotop-c is the actively maintained C rewrite)
sudo dnf install iotop      # Rocky Linux

# Show only processes doing I/O
sudo iotop -o

# Batch mode (for scripting/logging)
sudo iotop -b -o -n 5 -d 2    # 5 iterations, 2 second delay

# Accumulate I/O (total since start)
sudo iotop -a
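
For longer observation windows, batch mode can feed a log file. A minimal sketch (the log path is illustrative; -t timestamps each line, -qqq suppresses the repeated headers):

sudo iotop -b -o -t -qqq -n 30 -d 2 >> /var/log/iotop.log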

Understanding iotop Output

$ sudo iotop -o

Total DISK READ: 15.2 MB/s | Total DISK WRITE: 45.8 MB/s
  PID  USER     DISK READ  DISK WRITE  COMMAND
 1234  mysql     12.0 MB/s   35.0 MB/s  mysqld --datadir=/var/lib/mysql
 5678  www-data   3.0 MB/s    8.0 MB/s  php-fpm: pool www
  910  root       0.2 MB/s    2.8 MB/s  rsync --archive /backup

# MySQL is the primary I/O consumer — investigate slow queries
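
To cross-check a suspect like mysqld above, read the kernel's per-process I/O counters, which accumulate from process start:

sudo cat /proc/1234/io    # PID taken from the iotop output above
# read_bytes / write_bytes are actual block-device traffic;
# rchar / wchar also include data served from the page cache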

Advanced iostat Monitoring

# Per-partition statistics
iostat -xp ALL 2    # ALL = every device plus its partitions

# JSON output (for automation)
iostat -xz -o JSON 2 1

# Monitor specific device
iostat -xz -d nvme0n1 2

# Historical stats (if sar/sysstat data collection is enabled)
# Enable data collection
sudo systemctl enable --now sysstat
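
# On Debian/Ubuntu, also switch collection on in the package defaults
sudo sed -i 's/ENABLED="false"/ENABLED="true"/' /etc/default/sysstat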

# View historical I/O stats
sar -d -f /var/log/sysstat/sa15    # Day 15 disk stats (RHEL-family path: /var/log/sa/sa15)
sar -d -s 14:00:00 -e 15:00:00    # Today, 2-3 PM

Common I/O Bottleneck Patterns

Database I/O

# Symptoms: high r/s and w/s on data disk, high await
# Diagnosis: check slow query log
sudo iotop -o    # Confirm mysqld/postgres is the top I/O consumer

# Solutions:
# - Increase InnoDB buffer pool / shared_buffers
# - Add missing indexes (reduce disk reads)
# - Move WAL/binlog to separate disk
# - Upgrade to NVMe storage
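
To tie the database to a specific device in iostat, map its data directory to the filesystem backing it (paths match the mysqld example above; adjust the device name to yours):

df /var/lib/mysql      # shows which device holds the data directory
iostat -xz -d sda 2    # then watch just that device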

Log Rotation I/O Spike

# Symptoms: periodic high writes, often at midnight
# Check with: sar -d -s 00:00:00 -e 00:30:00
# Solution: use logrotate with copytruncate or compress
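
A sketch of a gentler rotation policy (the path is illustrative); delaycompress defers compression to the next cycle, which spreads out the write burst:

/var/log/myapp/*.log {
    daily
    rotate 14
    compress
    delaycompress    # compress on the following run, not immediately
    copytruncate     # truncate in place instead of moving the open file
}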

Swap Thrashing

# Symptoms: sustained si/so activity in vmstat, high I/O on the swap device
swapon --show    # identify which partition or file backs swap
vmstat 2         # si/so columns = pages swapped in/out per second
# Solution: add more RAM or reduce memory usage

Prometheus Monitoring

# node_exporter provides disk I/O metrics
# Key metrics:
rate(node_disk_read_bytes_total[5m])        # Read throughput
rate(node_disk_written_bytes_total[5m])     # Write throughput
rate(node_disk_io_time_seconds_total[5m])   # Fraction of time busy (≈ %util, 0-1)
rate(node_disk_read_time_seconds_total[5m]) / rate(node_disk_reads_completed_total[5m])  # Avg read latency
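
These expressions drop straight into alert rules. A sketch of one (Prometheus rule-file YAML; the 0.8 threshold and durations are starting points to tune):

- alert: DiskSaturated
  expr: rate(node_disk_io_time_seconds_total[5m]) > 0.8
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "{{ $labels.device }} on {{ $labels.instance }} is busy >80% of the time"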

Best Practices

  • Monitor await (I/O latency) as the primary indicator — high await means I/O bottleneck regardless of throughput
  • Use iotop to identify which processes are causing high I/O
  • Enable sysstat data collection for historical I/O analysis
  • Separate database data, logs, and WAL onto different disks when possible
  • Use NVMe storage for I/O-intensive workloads — the latency difference is 10-100x vs HDD
  • Alert in Prometheus when rate(node_disk_io_time_seconds_total[5m]) exceeds 0.8 (device busy more than 80% of the time)
  • Check for swap I/O as a sign of memory pressure
