How to Create a Disaster Recovery Plan for Your Server
A disaster recovery (DR) plan ensures your Breeze server infrastructure can be restored quickly after a catastrophic failure. Whether the cause is hardware failure, data corruption, a security breach, or accidental deletion, a documented and tested plan is essential for minimizing downtime and data loss.
Key Metrics to Define
- RTO (Recovery Time Objective) — the maximum acceptable downtime. How quickly must you be back online?
- RPO (Recovery Point Objective) — the maximum acceptable data loss. How much data can you afford to lose?
Example: If your RPO is 1 hour and your RTO is 4 hours, you need backups at least every hour and must be able to restore within 4 hours.
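One way to make the RPO measurable is to check the age of the newest backup against it. The sketch below assumes GNU `stat` (as shipped with Ubuntu); the function name `backup_within_rpo` is illustrative and not part of any Breeze tooling.

```shell
# Hypothetical helper: succeed if the backup file is no older than the RPO.
# Usage: backup_within_rpo <backup_file> <rpo_seconds>
backup_within_rpo() {
    local file="$1" rpo_seconds="$2"
    local now mtime age
    now=$(date +%s)
    # GNU stat: %Y prints the modification time as epoch seconds
    mtime=$(stat -c %Y "$file") || return 2
    age=$(( now - mtime ))
    [ "$age" -le "$rpo_seconds" ]
}
```

With the 1-hour RPO from the example, `backup_within_rpo /backup/db-latest.sql.gz 3600 || echo "RPO violated"` could run from cron and feed an alert.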
Step 1: Inventory Your Infrastructure
Document everything running on your Breeze server:
# System Information
Operating System: Ubuntu 22.04 LTS
CPU: 4 cores
RAM: 8 GB
Storage: 100 GB NVMe SSD
# Services
- Nginx (web server)
- MySQL 8.0 (database)
- Redis 7 (cache)
- Application (Node.js / Python / PHP)
- Cron jobs: backup.sh (daily), cleanup.sh (weekly)
# Critical Data Locations
- /var/www/myapp (application code)
- /var/lib/mysql (database files)
- /etc/nginx (web server config)
- /etc/letsencrypt (SSL certificates)
- /home/deploy/.ssh (SSH keys)
- /etc/crontab (scheduled tasks)
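Part of this inventory can be generated rather than typed by hand, which makes it easier to keep current. The following is a minimal sketch that assumes a Linux host with `/proc/meminfo` and, for the service list, a running systemd; the function name `collect_inventory` is hypothetical.

```shell
# Hypothetical helper: print a basic inventory in the layout used above.
collect_inventory() {
    echo "# System Information"
    # /etc/os-release is present on Ubuntu and most modern distributions
    echo "Operating System: $( . /etc/os-release 2>/dev/null && echo "$PRETTY_NAME" )"
    echo "CPU: $(nproc) cores"
    echo "RAM: $(awk '/^MemTotal:/ {printf "%.0f GB", $2/1024/1024}' /proc/meminfo)"
    # Second line of df output describes the root filesystem; column 2 is its size
    echo "Storage: $(df -h / | awk 'NR == 2 {print $2}')"
    echo "# Services"
    if command -v systemctl >/dev/null 2>&1; then
        # Tolerate hosts (such as containers) where systemd is not running
        systemctl list-units --type=service --state=running --no-legend --plain 2>/dev/null \
            | awk '{print "- " $1}' || true
    fi
}
```

Running `collect_inventory > inventory.txt` on a schedule and storing the result alongside the backups keeps the document in step with the server.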
Step 2: Implement the 3-2-1 Backup Strategy
- 3 copies of your data
- 2 different storage types (local disk + object storage)
- 1 offsite copy (different data center or geographic region)
# Local backup
borg create /backup/local::daily-$(date +%Y%m%d) /var/www /etc /home/deploy/.ssh
# Sync to offsite location
rsync -avz /backup/local/ backup-user@offsite-server:/backup/remote/
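# Upload critical databases to object storage
# Compute the file name once so both commands agree even if the date changes mid-run
DUMP=/tmp/db-$(date +%Y%m%d).sql.gz
# --single-transaction takes a consistent InnoDB snapshot without locking tables
mysqldump --all-databases --single-transaction | gzip > "$DUMP"
s3cmd put "$DUMP" s3://backups/mysql/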
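A copy only counts toward 3-2-1 if it can be read back, so it may be worth verifying each dump before uploading it. The sketch below checks that the file is non-empty and a valid gzip stream, then records a SHA-256 checksum next to it; the function name `verify_dump` and the `.sha256` suffix are assumptions for illustration.

```shell
# Hypothetical helper: sanity-check a compressed dump and record its checksum.
verify_dump() {
    local dump="$1"
    # -s: file exists and is larger than zero bytes
    [ -s "$dump" ] || { echo "missing or empty: $dump" >&2; return 1; }
    # gzip -t tests the integrity of the compressed stream without extracting it
    gzip -t "$dump" 2>/dev/null || { echo "corrupt gzip stream: $dump" >&2; return 1; }
    # Store the checksum so the offsite copy can later be checked with sha256sum -c
    sha256sum "$dump" > "$dump.sha256"
}
```

A call such as `verify_dump /tmp/db-20240101.sql.gz && s3cmd put ...` would then upload only dumps that passed the check. This confirms the file is intact, not that the SQL inside restores cleanly; only a restore test shows that.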
Step 3: Document the Recovery Procedure
Create a step-by-step runbook that anyone on your team can follow:
## Recovery Runbook
### Scenario: Complete Server Failure
1. Provision a new Breeze instance (same plan as original)
2. SSH into the new server and install base packages:
apt update && apt install -y nginx mysql-server redis-server nodejs
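3. Restore web server configuration. Run from /, since borg extract writes into the current directory, and replace ARCHIVE with a name shown by `borg list /backup/repo` (Borg has no built-in "latest" alias). Restoring all of etc/ would overwrite the new server's own network and disk settings, so extract only what you need:
cd / && borg extract /backup/repo::ARCHIVE etc/nginx/
4. Restore application code:
cd / && borg extract /backup/repo::ARCHIVE var/www/
5. Restore database:
gunzip < /backup/db-latest.sql.gz | mysql
6. Restore SSL certificates:
cd / && borg extract /backup/repo::ARCHIVE etc/letsencrypt/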
7. Update DNS records to point to new server IP
8. Restart all services:
systemctl restart nginx mysql redis-server
9. Verify application is functioning:
curl -I https://yourdomain.com
10. Monitor logs for errors:
tail -f /var/log/nginx/error.log
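The verification step can be scripted so that it gives a clear pass or fail instead of relying on someone reading headers. The following is a rough sketch: the function names are hypothetical, and treating any 2xx or 3xx status as healthy is an assumption you may want to tighten for your application.

```shell
# Hypothetical helper: treat 2xx and 3xx HTTP status codes as healthy.
is_healthy_status() {
    case "$1" in
        2??|3??) return 0 ;;
        *)       return 1 ;;
    esac
}

# Fetch only the status code for a URL and report the result.
check_url() {
    local url="$1" code
    # -w '%{http_code}' prints the status; curl prints 000 when the request fails
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url") || code=000
    if is_healthy_status "$code"; then
        echo "OK   $code $url"
    else
        echo "FAIL $code $url" >&2
        return 1
    fi
}
```

After a restore, `check_url https://yourdomain.com` returns a non-zero exit status on failure, which makes it usable in a script or a monitoring probe.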
Step 4: Automate Server Provisioning
Script the server setup so a rebuild is repeatable. A plain shell script is shown here; infrastructure-as-code tools can serve the same purpose:
#!/bin/bash
# server-setup.sh - Automated Breeze provisioning
set -euo pipefail
# Install packages
apt update && apt install -y \
nginx mysql-server redis-server \
certbot python3-certbot-nginx \
borgbackup rsync s3cmd
# Configure firewall
ufw allow 22/tcp
ufw allow 80/tcp
ufw allow 443/tcp
ufw --force enable
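# Set up MySQL (IF NOT EXISTS keeps the script re-runnable under set -e)
mysql -e "CREATE DATABASE IF NOT EXISTS myapp;"
# 'secure-password' is a placeholder; supply the real credential from a secrets store
mysql -e "CREATE USER IF NOT EXISTS 'app'@'localhost' IDENTIFIED BY 'secure-password';"
mysql -e "GRANT ALL ON myapp.* TO 'app'@'localhost';"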
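# Restore from backup
export BORG_REPO="ssh://backup@offsite/~/borg-repo"
# Placeholder: load the real passphrase from a secrets store, not from this script
export BORG_PASSPHRASE="your-passphrase"
# Borg has no "latest" alias, so look up the name of the most recent archive
ARCHIVE=$(borg list --short --last 1 ::)
cd / && borg extract "::$ARCHIVE" var/www/ etc/nginx/ etc/letsencrypt/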
# Restore database
LATEST_DB=$(s3cmd ls s3://backups/mysql/ | tail -1 | awk '{print $4}')
s3cmd get "$LATEST_DB" /tmp/db.sql.gz
# An --all-databases dump carries its own CREATE DATABASE and USE statements
gunzip < /tmp/db.sql.gz | mysql
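# Enable services at boot and restart them so the restored configuration is loaded
systemctl enable nginx mysql redis-server
systemctl restart nginx mysql redis-server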
echo "Recovery complete!"
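Because the script runs under `set -euo pipefail`, a missing tool would abort it partway through, leaving the server half-restored. A pre-flight check moves that failure to the start, before anything has changed. This is a minimal sketch; `require_cmds` is a hypothetical helper name.

```shell
# Hypothetical pre-flight helper: fail early if any required command is missing.
require_cmds() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing required command: $cmd" >&2
            missing=1
        fi
    done
    # Non-zero when at least one command was not found
    return "$missing"
}
```

Placed after the package installation step, a line such as `require_cmds borg s3cmd mysql gunzip || exit 1` would stop the run with a list of everything that is missing rather than only the first problem.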
Step 5: Test Your Plan Regularly
A disaster recovery plan is only useful if it works. Test it on a regular schedule:
- Tabletop exercise — walk through the plan on paper with your team
- Backup restore test — restore a backup to a test Breeze instance monthly
- Full DR test — simulate complete failure and recover from scratch quarterly
- Time the recovery — measure if you meet your RTO and RPO targets
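Timing a drill is easier to repeat if the comparison against the RTO is scripted. The sketch below takes start and end times as epoch seconds and reports whether the run met the target; the function name and the default of 4 hours (taken from the earlier example) are assumptions.

```shell
# Assumption: the 4-hour RTO from the example earlier in this guide.
RTO_SECONDS=$(( 4 * 3600 ))

# Hypothetical helper: print the elapsed time and succeed only if it is within the RTO.
# Usage: report_recovery_time <start_epoch> <end_epoch> [rto_seconds]
report_recovery_time() {
    local start="$1" end="$2" rto="${3:-$RTO_SECONDS}"
    local elapsed=$(( end - start ))
    printf 'Recovery took %dh %dm %ds (RTO %dh %dm)\n' \
        $(( elapsed / 3600 )) $(( elapsed % 3600 / 60 )) $(( elapsed % 60 )) \
        $(( rto / 3600 )) $(( rto % 3600 / 60 ))
    [ "$elapsed" -le "$rto" ]
}
```

In a drill, record `start=$(date +%s)` when the failure is declared and call `report_recovery_time "$start" "$(date +%s)"` once the application check passes.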
DR Plan Checklist
- All critical data identified and backed up according to the 3-2-1 rule
- Recovery runbook documented and accessible to all team members
- Backup encryption keys stored in a separate secure location
- DNS TTL set low enough to allow quick IP changes during failover
- Monitoring in place to detect failures immediately
- Contact list for team members and service providers updated
- Plan reviewed and tested within the last 90 days
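The DNS TTL item can be checked from the command line. The sketch below parses the TTL column from `dig +noall +answer` output and compares it with a ceiling; the 300-second default and the function names are assumptions, and `dig` comes from the dnsutils package on Ubuntu. A caching resolver reports the remaining TTL, so query an authoritative nameserver to see the configured value.

```shell
# Extract the TTL (second column) from the first record of `dig +noall +answer` output.
ttl_from_answer() {
    awk 'NR == 1 { print $2 }'
}

# Succeed when the TTL is at or below the chosen ceiling in seconds (default 300, an assumption).
ttl_is_low_enough() {
    local ttl="$1" max="${2:-300}"
    [ "$ttl" -le "$max" ]
}

# Example (needs dig and network access):
#   ttl=$(dig +noall +answer yourdomain.com A | ttl_from_answer)
#   ttl_is_low_enough "$ttl" 300 || echo "TTL ${ttl}s is too high for fast failover"
```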
A well-prepared disaster recovery plan transforms a potential catastrophe into a manageable incident on your Breeze infrastructure.