Understanding server metrics is a fundamental skill for any VPS owner. Whether you're troubleshooting a slow application or planning a server upgrade, being able to read CPU, memory, disk, and network metrics will help you make informed decisions. This guide explains each metric in plain language with practical commands.
CPU Metrics
CPU usage tells you how much processing power your applications are consuming. It's measured as a percentage of available CPU time.
Reading CPU Usage with top and htop
# Basic CPU overview
top -bn1 | head -5
# Output explained (recent top versions print these labels without the % sign, e.g. us, sy):
# %us (user) — Time spent on your applications
# %sy (system) — Time spent on kernel/OS tasks
# %ni (nice) — Time spent on low-priority processes
# %id (idle) — Available CPU time (higher = better)
# %wa (wait) — Time waiting for disk I/O (high = disk bottleneck)
# %st (steal) — Time "stolen" by hypervisor (high = noisy neighbor)
# Install htop for a better visual experience
sudo apt install htop # Ubuntu/Debian
sudo dnf install htop # AlmaLinux/Rocky (htop comes from EPEL: sudo dnf install epel-release first)
htop
What the Numbers Mean
- 0-30% CPU usage — Healthy, plenty of headroom
- 30-70% CPU usage — Normal under load, monitor for trends
- 70-90% CPU usage — Getting heavy, consider optimization or scaling
- 90-100% sustained — Critical, applications will slow down
- High %wa (>20%) — Your disk is the bottleneck, not CPU
- High %st (>10%) — Hypervisor contention, contact your provider
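To see where these percentages come from, here is a minimal sketch that derives them from two samples of the aggregate cpu line in /proc/stat. The helper name cpu_breakdown and the one-second sampling interval are illustrative assumptions; note that "busy" here counts everything except idle and I/O wait, so it includes steal time.

```shell
#!/bin/bash
# cpu_breakdown PREV CUR
# PREV and CUR are two "cpu ..." lines from /proc/stat taken some time apart.
# Prints busy, iowait and steal as percentages of the elapsed CPU time.
# (cpu_breakdown is a hypothetical helper name used for this sketch.)
cpu_breakdown() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        # Fields after the "cpu" label:
        # user nice system idle iowait irq softirq steal
        NR == 1 { for (i = 2; i <= 9; i++) prev[i] = $i }
        NR == 2 {
            total = 0
            for (i = 2; i <= 9; i++) { d[i] = $i - prev[i]; total += d[i] }
            if (total <= 0) { print "busy=0.0 iowait=0.0 steal=0.0"; exit }
            idle = d[5]; iowait = d[6]; steal = d[9]
            printf "busy=%.1f iowait=%.1f steal=%.1f\n",
                100 * (total - idle - iowait) / total,
                100 * iowait / total,
                100 * steal / total
        }'
}

# Usage on a live system: sample, wait one second, sample again.
if [ -r /proc/stat ]; then
    prev=$(grep '^cpu ' /proc/stat)
    sleep 1
    cur=$(grep '^cpu ' /proc/stat)
    cpu_breakdown "$prev" "$cur"
fi
```

Compare the printed values with the thresholds above: a high iowait figure points at the disk, and a high steal figure points at the hypervisor.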
Memory Metrics
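Linux memory management confuses many beginners because Linux deliberately uses otherwise idle RAM for disk caching, so the "free" figure often looks alarmingly low. This is normal and beneficial: the kernel gives cached memory back as soon as applications need it.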
Reading Memory with free
free -h
# Output:
# total used free shared buff/cache available
# Mem: 7.8Gi 2.1Gi 1.2Gi 128Mi 4.5Gi 5.3Gi
# Swap: 2.0Gi 0B 2.0Gi
# Key insight: Look at "available", NOT "free"
# available is the kernel's estimate of free + reclaimable cache
# In this example, about 5.3 GiB is actually available for applications
When to Worry About Memory
- "available" > 20% of total — Healthy
- "available" 10-20% of total — Monitor closely
- "available" < 10% of total — Consider adding RAM or optimizing
- Swap usage increasing — RAM is insufficient, performance will degrade
- OOM killer triggered — Check dmesg for "Out of memory" messages
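The thresholds above can be checked automatically. The following is a rough sketch that reads MemTotal and MemAvailable from /proc/meminfo and maps the result onto those bands; the function names mem_available_pct and classify_memory are made up for this example.

```shell
# mem_available_pct: reads /proc/meminfo-style text on stdin and prints
# MemAvailable as a whole-number percentage of MemTotal (rounded down).
# (Hypothetical helper name for this sketch.)
mem_available_pct() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) { printf "%d\n", 100 * avail / total }
            else           { exit 1 }
        }'
}

# classify_memory PCT: maps the percentage onto the thresholds listed above.
classify_memory() {
    if   [ "$1" -gt 20 ]; then echo "healthy"
    elif [ "$1" -ge 10 ]; then echo "monitor closely"
    else                       echo "consider adding RAM or optimizing"
    fi
}

# Usage on a live system
if [ -r /proc/meminfo ]; then
    pct=$(mem_available_pct < /proc/meminfo)
    echo "available: ${pct}% of total ($(classify_memory "$pct"))"
fi
```

To check whether the OOM killer has already fired, searching the kernel log with sudo dmesg -T | grep -i "out of memory" is usually enough.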
Disk Metrics
Disk Space
# Check disk space usage
df -h
# Find what's using space
du -sh /var/log /var/lib /home /tmp /opt 2>/dev/null | sort -rh
# Find large files (>100MB)
find / -type f -size +100M -exec ls -lh {} \; 2>/dev/null
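If you would rather be told when a filesystem is filling up than read the whole df table, a small filter helps. This sketch parses the POSIX-format output of df -P; the function name disks_over_threshold and the 80% limit in the usage line are arbitrary choices for illustration, and mount points containing spaces are not handled.

```shell
# disks_over_threshold LIMIT: reads `df -P` output on stdin and prints the
# mount point and usage of every filesystem at or above LIMIT percent.
# (Hypothetical helper name for this sketch.)
disks_over_threshold() {
    awk -v limit="$1" '
        NR == 1 { next }                 # skip the header row
        {
            use = $5                     # Capacity column, e.g. "91%"
            sub(/%$/, "", use)           # strip the trailing percent sign
            if (use + 0 >= limit + 0) print $6, use "%"
        }'
}

# Usage on a live system:
#   df -P | disks_over_threshold 80
```

No output means every filesystem is below the limit.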
Disk I/O Performance
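# Real-time disk I/O monitoring (iostat is part of the sysstat package)
# The first report shows averages since boot; the next ones cover each 1-second interval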
iostat -x 1 5
# Key columns:
# r/s, w/s — Reads/writes per second
# rkB/s, wkB/s — KB read/written per second
# await — Average time per I/O request (ms)
# %util — How busy the disk is (100% = saturated)
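# Newer sysstat versions print r_await and w_await instead of a single await column
# On SSD/NVMe devices that serve requests in parallel, 100% util does not always mean saturation
# If await (or r_await/w_await) stays above 20ms, your disk is under pressure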
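Because the iostat column layout differs between sysstat versions, picking columns by position is fragile. One possible approach, sketched below, is to look columns up by their header name. The function name iostat_summary is hypothetical, and the sketch assumes the device table starts with a header row whose first word begins with "Device" and ends with the %util column.

```shell
# iostat_summary: reads `iostat -x` output on stdin and prints, per device,
# the latency and utilization columns. Columns are located by header name,
# so both the single "await" and the split "r_await"/"w_await" layouts work.
# (Hypothetical helper name for this sketch.)
iostat_summary() {
    awk '
        $1 == "avg-cpu:" { split("", col); next }   # CPU section: forget columns
        $1 ~ /^Device/ {                            # header row: map name -> index
            split("", col)
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        NF == 0 { next }                            # blank lines between reports
        ("%util" in col) && NF >= col["%util"] {
            line = $1
            if ("await"   in col) line = line " await="   $(col["await"])
            if ("r_await" in col) line = line " r_await=" $(col["r_await"])
            if ("w_await" in col) line = line " w_await=" $(col["w_await"])
            print line " util=" $(col["%util"])
        }'
}

# Usage on a live system:
#   iostat -x 1 5 | iostat_summary
```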
Network Metrics
# Real-time network bandwidth
sudo apt install iftop # then: sudo iftop
sudo apt install nload # then: nload
# Network statistics (vnstat needs its own package and a running vnstat daemon; ss ships with iproute2)
vnstat # Daily/monthly traffic summary
vnstat -l # Live monitoring
ss -tuln # List listening ports
ss -s # Connection summary statistics
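When ss -s is too coarse, counting TCP sockets per state gives a quick picture of what the server is doing. The sketch below assumes the output format of ss -tan, where the first column is the socket state; the function name count_tcp_states is made up for this example.

```shell
# count_tcp_states: reads `ss -tan` output on stdin and prints the number of
# TCP sockets in each state, most common first.
# (Hypothetical helper name for this sketch.)
count_tcp_states() {
    awk 'NR > 1 { count[$1]++ }                       # skip header, tally by state
         END { for (s in count) print count[s], s }' |
        sort -k1,1nr -k2,2                            # by count (desc), then name
}

# Usage on a live system:
#   ss -tan | count_tcp_states
```

A large and growing number of sockets in states such as TIME-WAIT or CLOSE-WAIT is often worth investigating before blaming bandwidth.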
Load Average Explained
# Check load average
uptime
# Output: load average: 1.25, 0.89, 0.67
# These are 1-minute, 5-minute, and 15-minute averages
# On Linux, load average counts processes that are running or waiting for CPU,
# plus processes blocked in uninterruptible sleep (usually waiting on disk I/O)
# On a 4-vCPU server:
# Load 0-4: Normal (at most one process per CPU)
# Load 4-8: Heavy (processes are queuing)
# Load 8+: Overloaded (significant queuing)
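The same bands apply to any core count if you divide the load by the number of CPUs. Here is a small sketch of that arithmetic; the function name load_status is hypothetical, and the cut-offs of 1x and 2x the core count simply mirror the 4-vCPU example above.

```shell
# load_status LOAD CORES: classifies a load average relative to the CPU count
# (up to 1x cores = normal, up to 2x cores = heavy, beyond = overloaded).
# (Hypothetical helper name for this sketch.)
load_status() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        ratio = load / cores
        if (ratio <= 1)      { status = "normal" }
        else if (ratio <= 2) { status = "heavy" }
        else                 { status = "overloaded" }
        printf "%.2f per core: %s\n", ratio, status
    }'
}

# Usage on a live system: 1-minute load average and online CPU count
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 _ < /proc/loadavg
    load_status "$load1" "$(nproc)"
fi
```

For the sample output above, a load of 1.25 on 4 vCPUs works out to 0.31 per core, well inside the normal band.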
Quick Health Check Script
#!/bin/bash
# Save as /usr/local/bin/server-health
echo "=== Server Health Check ==="
echo ""
echo "--- Uptime & Load ---"
uptime
echo ""
echo "--- Memory ---"
free -h
echo ""
echo "--- Disk Space ---"
df -h /
echo ""
echo "--- Top 5 CPU Consumers ---"
ps aux --sort=-%cpu | head -6
echo ""
echo "--- Top 5 Memory Consumers ---"
ps aux --sort=-%mem | head -6
echo ""
echo "--- Network Connections ---"
ss -s
Save this script, make it executable with chmod +x /usr/local/bin/server-health, and run it whenever you need a quick overview of your server's health. Over time, you'll develop an intuition for what's normal and what needs attention.