Why Uptime Monitoring?
You need to know when your services go down — before your customers tell you.
Simple HTTP Health Checks
Endpoint Design
```javascript
// /health endpoint in your app
app.get('/health', async (req, res) => {
  try {
    // Check database connection
    await db.query('SELECT 1');
    // Check Redis connection
    await redis.ping();
    res.json({
      status: 'healthy',
      uptime: process.uptime(),
      timestamp: new Date().toISOString()
    });
  } catch (err) {
    res.status(503).json({
      status: 'unhealthy',
      error: err.message
    });
  }
});
```
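One gap in a handler like this is that a hung dependency can make the health check itself hang. A minimal sketch of one way to guard against that is shown here, assuming each dependency check is wrapped in a function that returns a promise. The helper names `withTimeout` and `runChecks` and the 2-second default are illustrative choices, not part of any library.

```javascript
// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Clear the timer either way so it cannot keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Run all named checks in parallel and report each result separately.
// `checks` maps a name to a function returning a promise (hypothetical shape).
async function runChecks(checks, timeoutMs = 2000) {
  const names = Object.keys(checks);
  const settled = await Promise.allSettled(
    names.map((name) =>
      // Promise.resolve().then(...) turns a synchronous throw into a rejection.
      withTimeout(Promise.resolve().then(() => checks[name]()), timeoutMs, name)
    )
  );
  const results = {};
  settled.forEach((outcome, i) => {
    results[names[i]] =
      outcome.status === 'fulfilled'
        ? 'ok'
        : String(outcome.reason?.message ?? outcome.reason);
  });
  return { healthy: settled.every((o) => o.status === 'fulfilled'), results };
}
```

In the `/health` handler, a call such as `runChecks({ database: () => db.query('SELECT 1'), redis: () => redis.ping() })` could stand in for the two sequential `await` lines, with `healthy` deciding between a 200 and a 503 response.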
What to Check
| Check | What It Validates |
|---|---|
| HTTP 200 response | Web server is running |
| Database query | Database connection works |
| Redis ping | Cache layer is available |
| Disk space | Storage isn't full |
| Memory usage | Not running out of RAM |
| Response time | Performance is acceptable |
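The disk, memory, and response-time rows can be checked from inside a Node.js process. The following is a rough sketch under stated assumptions: the limits in `LIMITS` are placeholder values, `findProblems` and `collectReadings` are hypothetical helper names, and `fs.statfs` requires Node.js 18.15 or newer. Response time is expected to be measured by the caller and passed in.

```javascript
const os = require('node:os');
const fs = require('node:fs/promises');

// Placeholder limits; tune them for your own workload.
const LIMITS = { maxMemoryUsed: 0.9, maxDiskUsed: 0.9, maxResponseMs: 500 };

// Compare readings against limits and return the names of the failed checks.
function findProblems(readings, limits = LIMITS) {
  const problems = [];
  if (readings.memoryUsed > limits.maxMemoryUsed) problems.push('memory');
  if (readings.diskUsed > limits.maxDiskUsed) problems.push('disk');
  if (readings.responseMs > limits.maxResponseMs) problems.push('responseTime');
  return problems;
}

// Collect live memory and disk usage as fractions between 0 and 1.
async function collectReadings(path = '/') {
  const memoryUsed = 1 - os.freemem() / os.totalmem();
  const stats = await fs.statfs(path); // Node.js 18.15+
  // bavail = blocks available to unprivileged users, blocks = total blocks.
  const diskUsed = 1 - stats.bavail / stats.blocks;
  return { memoryUsed, diskUsed };
}
```

Keeping the comparison in a pure function separates the thresholds from the system calls, which makes the limits easy to test and to adjust.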
Self-Hosted: Uptime Kuma
```yaml
# docker-compose.yml
services:
  uptime-kuma:
    image: louislam/uptime-kuma:latest
    ports:
      - "3001:3001"
    volumes:
      - uptime-data:/app/data
    restart: unless-stopped
volumes:
  uptime-data:
```
Features:
- HTTP, TCP, Ping, DNS monitoring
- Beautiful dashboard
- Notifications via email, Slack, Discord, Telegram
- Status pages for customers
- Certificate expiry monitoring
Alerting Best Practices
Notification Channels
- Critical (server down): SMS + phone call + Slack
- Warning (high CPU): Slack + email
- Info (deployment): Slack only
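The severity-to-channel mapping above could be kept in code as a small lookup table. This is only a sketch: the channel identifiers and the `channelsFor` helper are made-up names, and actually sending the notifications is left to whatever tooling you use.

```javascript
// Severity levels mapped to notification channels (identifiers are placeholders).
const ROUTES = new Map([
  ['critical', ['sms', 'phone', 'slack']],
  ['warning', ['slack', 'email']],
  ['info', ['slack']],
]);

// Look up the channels for a severity; unknown severities are treated as a bug.
function channelsFor(severity) {
  const channels = ROUTES.get(severity);
  if (!channels) {
    throw new Error(`Unknown severity: ${severity}`);
  }
  return channels;
}
```

Throwing on an unknown severity is a deliberate choice here, so that a typo in a severity name fails loudly instead of silently dropping an alert.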
Alert Fatigue Prevention
| Rule | Why |
|---|---|
| Only alert on actionable events | Non-actionable alerts get ignored |
| Set appropriate thresholds | 80% CPU for 1 second isn't a problem |
| Use time-based conditions | Alert if high CPU for 5+ minutes |
| Batch similar alerts | Don't send 100 emails for the same issue |
| Have an escalation path | If not acknowledged in 15 min, escalate |
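Three of these rules translate fairly directly into code. The sketch below assumes timestamps are passed in as milliseconds. `SustainedThreshold`, `Deduplicator`, and `needsEscalation` are hypothetical names, and the deduplicator is a simple stand-in for batching: it suppresses repeats rather than grouping them into a digest.

```javascript
// Fires only when a metric stays above `threshold` for at least `holdMs`.
class SustainedThreshold {
  constructor(threshold, holdMs) {
    this.threshold = threshold;
    this.holdMs = holdMs;
    this.breachStart = null; // time of the first reading above the threshold
  }

  // Feed one reading; returns true once the breach has lasted long enough.
  update(value, nowMs) {
    if (value <= this.threshold) {
      this.breachStart = null; // any normal reading resets the clock
      return false;
    }
    if (this.breachStart === null) this.breachStart = nowMs;
    return nowMs - this.breachStart >= this.holdMs;
  }
}

// Suppresses repeat notifications for the same alert key within `windowMs`.
class Deduplicator {
  constructor(windowMs) {
    this.windowMs = windowMs;
    this.lastSent = new Map();
  }

  shouldSend(key, nowMs) {
    const last = this.lastSent.get(key);
    if (last !== undefined && nowMs - last < this.windowMs) return false;
    this.lastSent.set(key, nowMs);
    return true;
  }
}

// True when an alert has gone unacknowledged past the escalation deadline.
function needsEscalation(alert, nowMs, deadlineMs = 15 * 60 * 1000) {
  return !alert.acknowledged && nowMs - alert.firedAt >= deadlineMs;
}
```

For the CPU example in the table, `new SustainedThreshold(80, 5 * 60 * 1000)` would ignore a one-second spike and fire only after five continuous minutes above 80%.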
Monitoring Checklist
| Service | Check | Frequency |
|---|---|---|
| Website | HTTP 200 | Every 1 min |
| API | Health endpoint | Every 1 min |
| Database | Connection + query | Every 5 min |
| SSL cert | Expiry date | Daily |
| Disk space | Usage percentage | Every 5 min |
| Backups | Last successful run | Daily |
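If you script these checks yourself rather than configuring them in a monitoring tool, the checklist can be written down as data. The following is an illustrative sketch: the check names and the `dueChecks` helper are assumptions, and the intervals are copied from the table above.

```javascript
const MINUTE = 60 * 1000;
const DAY = 24 * 60 * MINUTE;

// One entry per row of the checklist, with its interval in milliseconds.
const SCHEDULE = [
  { name: 'website', everyMs: 1 * MINUTE },
  { name: 'api', everyMs: 1 * MINUTE },
  { name: 'database', everyMs: 5 * MINUTE },
  { name: 'ssl-cert', everyMs: DAY },
  { name: 'disk-space', everyMs: 5 * MINUTE },
  { name: 'backups', everyMs: DAY },
];

// Return the names of checks whose interval has elapsed since their last run.
// `lastRun` maps a check name to the timestamp (ms) of its most recent run.
function dueChecks(lastRun, nowMs, schedule = SCHEDULE) {
  return schedule
    .filter(({ name, everyMs }) => {
      const last = lastRun[name];
      return last === undefined || nowMs - last >= everyMs;
    })
    .map(({ name }) => name);
}
```

A check that has never run is always considered due, so a freshly started scheduler runs everything once before settling into the intervals.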
Tip: Monitor from outside your infrastructure. If your monitoring server is on the same network as your web server, a network outage takes down both, and you won't get alerted.
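A probe meant to run from a separate machine or network can be very small. Here is a minimal sketch, assuming Node.js 18 or newer for the global `fetch` and `AbortSignal.timeout`. The `probe` function and its `fetchImpl` option are hypothetical names, with the option included only so the function can be exercised without a network.

```javascript
// Probe a health URL and report whether it is up; never throws.
async function probe(url, { timeoutMs = 5000, fetchImpl = globalThis.fetch } = {}) {
  const started = Date.now();
  try {
    // Abort the request if the server does not answer within timeoutMs.
    const res = await fetchImpl(url, { signal: AbortSignal.timeout(timeoutMs) });
    return { up: res.ok, status: res.status, ms: Date.now() - started };
  } catch (err) {
    // DNS failure, refused connection, and timeout all count as "down".
    return { up: false, status: null, error: err.message, ms: Date.now() - started };
  }
}
```

Treating every network error as "down" keeps the caller simple: whatever sends the alert only needs to look at `up`, while `status` and `error` are kept for the alert message.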