Understanding Load Average
uptime
# 10:30:00 up 45 days, load average: 4.50, 3.20, 2.10The three numbers represent 1-minute, 5-minute, and 15-minute averages. A load of 1.0 means one CPU core is fully utilized. On a 4-core server, a load of 4.0 means all cores are busy.
Diagnosis
# Is it CPU, I/O, or both?
top
# Look at: %us (user CPU), %sy (system CPU), %wa (I/O wait)
# Detailed CPU breakdown
mpstat -P ALL 1 5
# I/O statistics
iostat -x 1 5High CPU Load
# Find CPU-hungry processes
ps aux --sort=-%cpu | head -10
# Check for runaway processes
top -b -n 1 | awk 'NR>7 && $9>50 {print $1, $9, $11}'High I/O Wait
# Find processes doing heavy I/O
iotop -o
# Check disk queue depth
iostat -x 1 | awk '$1=="sda" || $1=="nvme0n1p1\" {print "await:", $10, "queue:", $9}'Quick Fixes
- Runaway process — kill it:
kill -TERM PID - Too many PHP workers — reduce
pm.max_children - Database queries — check slow query log, add indexes
- Log writing storm — verbose logging can cause I/O pressure