Disk I/O bottlenecks are among the most common causes of server performance problems. iostat gives a device-level view of disk activity, iotop a process-level one, and sar (from the same sysstat package) keeps historical data, so together they let you identify which processes and devices are causing I/O contention. This guide covers practical I/O monitoring and troubleshooting techniques.
iostat: Device-Level I/O Statistics
# Install
sudo apt install sysstat # Ubuntu/Debian
sudo dnf install sysstat # Rocky Linux
# Basic usage (device stats every 2 seconds, 5 times)
iostat -xz 2 5
# Key columns:
# r/s — reads per second
# w/s — writes per second
# rkB/s — kilobytes read per second
# wkB/s — kilobytes written per second
# await — average I/O wait time (ms) — MOST IMPORTANT
# %util — device utilization percentage
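One gotcha worth knowing: the very first report iostat prints is an average since boot, not current activity. For one-shot checks, the -y flag (available in reasonably recent sysstat releases) skips it:
# Take a single 1-second sample, skipping the since-boot summary
iostat -xzy 1 1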
Reading iostat Output
$ iostat -xz 2
Device    r/s     w/s    rkB/s    wkB/s  rrqm/s  wrqm/s  await  r_await  w_await  %util
sda      50.0   200.0    400.0   3200.0     0.0   150.0    4.0      2.0      4.5   45.0
nvme0   150.0   500.0   2400.0   8000.0     0.0    50.0    0.5      0.3      0.6   15.0
# Interpretation:
# sda: HDD with 45% utilization, 4ms average wait — reasonable
# nvme0: NVMe with 15% utilization, 0.5ms wait — healthy
# WARNING signs:
# await > 20ms on SSD — possible I/O bottleneck
# await > 100ms on HDD — severe I/O bottleneck
# %util > 80% — device is heavily loaded
# %util = 100% — device is saturated (caveat: NVMe/SSDs serve many requests in
#                parallel, so %util overstates saturation there — trust await instead)
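A minimal one-shot check built on these thresholds, assuming the extended output layout shown above; the 20 ms / 80 % values are illustrative and should be tuned per device class:
# Print a WARN line for any device exceeding the await or %util thresholds
AWAIT_MS=20
UTIL_PCT=80
iostat -dxzy 1 1 | awk -v a="$AWAIT_MS" -v u="$UTIL_PCT" '
  /^Device/ { for (i = 1; i <= NF; i++) { if ($i ~ /await$/) name[i] = $i; if ($i == "%util") uc = i } }
  /^(sd|nvme|vd|dm-)/ {
    for (i in name) if ($i + 0 > a) printf "WARN %s %s=%s ms\n", $1, name[i], $i
    if (uc && $uc + 0 > u) printf "WARN %s %%util=%s%%\n", $1, $uc
  }
'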
iotop: Process-Level I/O Monitoring
# Install
sudo apt install iotop-c # Ubuntu (iotop-c is the modern version)
sudo dnf install iotop # Rocky Linux
# Show only processes doing I/O
sudo iotop -o
# Batch mode (for scripting/logging)
sudo iotop -b -o -n 5 -d 2 # 5 iterations, 2 second delay
# Accumulate I/O (total since start)
sudo iotop -a
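For ad-hoc logging, batch mode combines with -t to timestamp each line; the output file name below is just an example:
# Log active I/O processes every 10 seconds for an hour, with timestamps
sudo iotop -b -o -t -d 10 -n 360 > iotop-$(date +%F).log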
Understanding iotop Output
$ sudo iotop -o
Total DISK READ: 15.2 MB/s | Total DISK WRITE: 45.8 MB/s
  PID   USER       DISK READ   DISK WRITE  COMMAND
 1234   mysql      12.0 MB/s   35.0 MB/s   mysqld --datadir=/var/lib/mysql
 5678   www-data    3.0 MB/s    8.0 MB/s   php-fpm: pool www
  910   root        0.2 MB/s    2.8 MB/s   rsync --archive /backup
# MySQL is the primary I/O consumer — investigate slow queries
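pidstat (also part of sysstat) is a handy cross-check when iotop is unavailable; the mysqld process name below comes from the example output above:
# Per-process disk read/write rates every 2 seconds, 5 samples
pidstat -d 2 5
# Focus on the suspected process
pidstat -d -p $(pgrep -o mysqld) 2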
Advanced iostat Monitoring
# Per-partition statistics
iostat -x -p ALL 2
# JSON output (for automation)
iostat -xz -o JSON 2 1
# Monitor specific device
iostat -xz -d nvme0n1 2
# Historical stats (if sar/sysstat data collection is enabled)
# Enable data collection
sudo systemctl enable --now sysstat
# View historical I/O stats
sar -d -f /var/log/sysstat/sa15 # Day 15 disk stats (Debian/Ubuntu path; Rocky uses /var/log/sa/sa15)
sar -d -s 14:00:00 -e 15:00:00 # Today, 2-3 PM
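If you automate around the JSON output, jq can extract just the fields you need. The object path and field names below are from sysstat 12.x and may differ on your version, so inspect the raw JSON first:
# Dump the per-device objects to see which fields your sysstat version emits
iostat -dxzy -o JSON 1 1 | jq '.sysstat.hosts[0].statistics[-1].disk'
# Example: device name and utilization
iostat -dxzy -o JSON 1 1 | jq -r '.sysstat.hosts[0].statistics[-1].disk[] | "\(.disk_device) \(.util)%"'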
Common I/O Bottleneck Patterns
Database I/O
# Symptoms: high r/s and w/s on data disk, high await
# Diagnosis: check slow query log
sudo iotop -o # Confirm mysqld/postgres is the top I/O consumer
# Solutions:
# - Increase InnoDB buffer pool / shared_buffers
# - Add missing indexes (reduce disk reads)
# - Move WAL/binlog to separate disk
# - Upgrade to NVMe storage
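To follow the slow-query lead, MySQL's slow query log can be switched on at runtime; slow_query_log, long_query_time, and mysqldumpslow are standard MySQL tooling, while the 1-second threshold and the datadir path are only examples:
# Enable the slow query log without restarting MySQL
mysql -e "SET GLOBAL slow_query_log = ON; SET GLOBAL long_query_time = 1;"
# Confirm where it is written, then summarize the worst offenders by total time
mysql -e "SHOW VARIABLES LIKE 'slow_query_log_file';"
sudo mysqldumpslow -s t /var/lib/mysql/*-slow.log | head -n 40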
Log Rotation I/O Spike
# Symptoms: periodic high writes, often at midnight
# Check with: sar -d -s 00:00:00 -e 00:30:00
# Solution: stagger rotation times across services, or use logrotate's delaycompress so compression is deferred to the next cycle
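A sketch of a logrotate drop-in that spreads the compression cost; the path, retention, and file name are illustrative, and delaycompress is a standard logrotate directive:
# delaycompress defers gzip to the next rotation cycle, so the rotation window writes less at once
cat <<'EOF' | sudo tee /etc/logrotate.d/myapp
/var/log/myapp/*.log {
    daily
    rotate 14
    compress
    delaycompress
}
EOF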
Swap Thrashing
# Symptoms: sustained I/O on the device or file backing swap
swapon --show # identify what backs swap (partition, LV, or file)
iostat -xz 2 # watch that device for constant read/write activity
# Solution: add more RAM or reduce memory usage
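vmstat confirms whether the disk activity really is swap traffic, and /proc shows which processes own the swapped-out memory:
# si/so (pages swapped in/out per second) persistently above zero means active swapping
vmstat 2 5
# Rank processes by swapped-out memory (VmSwap)
grep VmSwap /proc/[0-9]*/status 2>/dev/null | sort -t: -k3 -rn | head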
Prometheus Monitoring
# node_exporter provides disk I/O metrics
# Key metrics:
rate(node_disk_read_bytes_total[5m]) # Read throughput
rate(node_disk_written_bytes_total[5m]) # Write throughput
rate(node_disk_io_time_seconds_total[5m]) # I/O utilization
rate(node_disk_read_time_seconds_total[5m]) / rate(node_disk_reads_completed_total[5m]) # Avg read latency
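These translate directly into alerting expressions; the thresholds mirror the iostat guidance above and are starting points rather than universal values:
# Device busy more than 80% of the time over 5 minutes
rate(node_disk_io_time_seconds_total[5m]) > 0.8
# Average read latency above 20 ms
rate(node_disk_read_time_seconds_total[5m]) / rate(node_disk_reads_completed_total[5m]) > 0.02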
Best Practices
- Monitor await (I/O latency) as the primary indicator — high await means an I/O bottleneck regardless of throughput
- Use iotop to identify which processes are causing high I/O
- Enable sysstat data collection for historical I/O analysis
- Separate database data, logs, and WAL onto different disks when possible
- Use NVMe storage for I/O-intensive workloads — the latency difference is 10-100x vs HDD
- Alert on node_disk_io_time_seconds_total utilization > 80% in Prometheus
- Check for swap I/O as a sign of memory pressure