A bare metal recovery plan documents the complete process for rebuilding a server from nothing — new hardware, empty disks, and only your backups. This is the ultimate test of your backup strategy. This guide covers creating, documenting, and testing a bare metal recovery plan for VPS and dedicated server infrastructure.
What to Back Up
# Complete server inventory checklist:
# 1. Operating system configuration
# - /etc/ (system config, users, groups, services)
# - Package list: dpkg --get-selections > packages.list
# - Kernel parameters: sysctl -a > sysctl.conf.bak
# - Firewall rules: iptables-save > firewall.rules
# 2. Application data
# - /var/www/ (web applications)
# - /opt/ (installed applications)
# - /home/ (user data)
# - Application-specific configs
# 3. Databases
# - MySQL: mysqldump --all-databases
# - PostgreSQL: pg_dumpall
# - Redis: BGSAVE + copy dump.rdb
# 4. SSL certificates
# - /etc/letsencrypt/ (Let's Encrypt certs)
# - Any custom certificates
# 5. Cron jobs
# - /etc/crontab, /etc/cron.d/*, /var/spool/cron/
# 6. Service configurations
# - Nginx/Apache configs
# - PHP-FPM pool configs
# - Systemd service files
# - Docker volumes and compose files
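Before wiring these paths into a backup job, it can help to confirm which of them actually exist on a given host. The following is a minimal sketch under stated assumptions: `check_backup_paths` is a hypothetical helper name, and the path list is simply copied from the checklist, so trim it to match your server.

```shell
#!/bin/bash
# Sketch: report which checklist paths exist on this host.
# check_backup_paths is a hypothetical helper name, not part of any tool.

# Prints "OK <path>" or "MISSING <path>" for each argument and
# returns 1 if at least one path is missing.
check_backup_paths() {
    local missing=0 path
    for path in "$@"; do
        if [ -e "$path" ]; then
            echo "OK      $path"
        else
            echo "MISSING $path"
            missing=1
        fi
    done
    return "$missing"
}

# Example run with paths taken from the checklist above.
# "|| true" keeps the example from aborting when a path is absent.
check_backup_paths /etc /var/www /opt /home /etc/letsencrypt \
    /etc/crontab /etc/cron.d /var/spool/cron || true
```

A MISSING line is not necessarily a problem (a server without Let's Encrypt has no /etc/letsencrypt), but each one is worth a conscious decision before it is dropped from the backup set.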
Automated Inventory Script
#!/bin/bash
# /usr/local/bin/server-inventory.sh
BACKUP_DIR="/backup/inventory/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"
# Package list
dpkg --get-selections > "$BACKUP_DIR/packages.list"
pip3 list --format=freeze > "$BACKUP_DIR/pip-packages.list" 2>/dev/null
npm list -g --json > "$BACKUP_DIR/npm-global.json" 2>/dev/null
# System config
cp -a /etc/fstab "$BACKUP_DIR/"
cp -a /etc/hosts "$BACKUP_DIR/"
cp -a /etc/hostname "$BACKUP_DIR/"
sysctl -a > "$BACKUP_DIR/sysctl.conf" 2>/dev/null
iptables-save > "$BACKUP_DIR/iptables.rules" 2>/dev/null
ss -tlnp > "$BACKUP_DIR/listening-ports.txt"
# Service list
systemctl list-unit-files --state=enabled > "$BACKUP_DIR/enabled-services.txt"
# Cron jobs
for user in $(cut -d: -f1 /etc/passwd); do
    # Keep a file only for users that actually have a crontab
    crontab -u "$user" -l > "$BACKUP_DIR/cron-$user.txt" 2>/dev/null \
        || rm -f "$BACKUP_DIR/cron-$user.txt"
done
cp -a /etc/cron.d/ "$BACKUP_DIR/cron.d/"
# Docker
docker ps --format "table {{.Names}}\t{{.Image}}\t{{.Ports}}" > "$BACKUP_DIR/docker-containers.txt" 2>/dev/null
# Network
ip addr > "$BACKUP_DIR/network.txt"
ip route > "$BACKUP_DIR/routes.txt"
echo "Inventory saved to $BACKUP_DIR"
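An inventory is only useful if it is complete, so a quick sanity check after each run may be worthwhile. This sketch assumes the file names written by the script above; `verify_inventory` is a hypothetical helper, and the list of required files is a judgment call rather than a fixed rule.

```shell
#!/bin/bash
# Sketch: confirm an inventory directory contains the files a rebuild needs.
# verify_inventory is a hypothetical helper; the file names match the
# outputs written by server-inventory.sh.

# Usage: verify_inventory <inventory directory>
# Returns 1 and lists the offenders on stderr if any file is missing or empty.
verify_inventory() {
    local dir=$1 failed=0 name
    for name in packages.list fstab hosts hostname \
                enabled-services.txt network.txt routes.txt; do
        # -s is true only for files that exist and are non-empty
        if [ ! -s "$dir/$name" ]; then
            echo "missing or empty: $dir/$name" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

It could be called at the end of the inventory script, for example `verify_inventory "$BACKUP_DIR" || echo "Inventory incomplete" >&2`, so that an empty package list is noticed on the day it happens and not during a recovery.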
Recovery Runbook
# BARE METAL RECOVERY RUNBOOK
# Estimated time: 2-4 hours
# Step 1: Provision new server (15 min)
# - Order VPS or provision bare metal
# - Install base OS (Ubuntu 24.04 LTS / AlmaLinux 9)
# - Set hostname, timezone, locale
# Step 2: System packages (15 min)
apt update && apt upgrade -y
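# packages.list comes from dpkg --get-selections ("name<TAB>state"), so pass only the names
awk '$2 == "install" {print $1}' packages.list | xargs apt-get install -y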
# Or use configuration management (Ansible)
# Step 3: Restore system configuration (20 min)
# Restore /etc/ files from backup
restic restore latest --target / --include /etc/nginx --include /etc/php
# Restore users and groups
# Restore SSH keys and authorized_keys
# Step 4: Restore application data (30-60 min)
restic restore latest --target / --include /var/www --include /opt/apps
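# Step 5: Restore databases (15-30 min)
# The database servers must be running before you load the dumps
mysql < /backup/dumps/mysql-latest.sql
# pg_restore reads custom-format archives (pg_dump -Fc); load a plain-SQL
# pg_dumpall file with: psql -f <dump file> postgres
pg_restore -d mydb /backup/dumps/postgres-latest.dump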
# Step 6: Restore SSL certificates (5 min)
restic restore latest --target / --include /etc/letsencrypt
# Or re-issue with certbot
# Step 7: Start services (10 min)
systemctl enable --now nginx php8.3-fpm mysql postgresql redis
# Verify each service
# Step 8: Verify (30 min)
# - Test each website/application
# - Check database connectivity
# - Verify SSL certificates
# - Test email functionality
# - Check cron jobs
# - Verify monitoring
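The Step 8 checks are easier to repeat if they are scripted. The harness below is a sketch, not a definitive implementation: `check`, `summary`, and the counters are hypothetical names, and the commented example commands (including the `example.com` URL) are placeholders to replace with your own services and hostnames.

```shell
#!/bin/bash
# Sketch of a Step 8 verification harness. check(), summary() and the
# counters are hypothetical names; the commented commands are examples.

PASSED=0
FAILED=0

# Usage: check "description" command [args...]
# Runs the command quietly and records whether it succeeded.
check() {
    local desc=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $desc"
        PASSED=$((PASSED + 1))
    else
        echo "FAIL  $desc"
        FAILED=$((FAILED + 1))
    fi
}

# Prints the totals and returns non-zero if any check failed.
summary() {
    echo "$PASSED passed, $FAILED failed"
    [ "$FAILED" -eq 0 ]
}

# Example checks (uncomment and adjust):
# check "nginx is running"      systemctl is-active --quiet nginx
# check "nginx config is valid" nginx -t
# check "MySQL accepts queries" mysql -e "SELECT 1"
# check "site answers on HTTPS" curl -fsS https://example.com/
# summary
```

Because `summary` returns a non-zero status when anything failed, the script can end a recovery test with a clear pass or fail signal.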
Testing the Plan
# Quarterly recovery test procedure:
# 1. Provision a test VPS (same specs as production)
# 2. Follow the runbook step by step
# 3. Time each step
# 4. Note any missing steps or failed procedures
# 5. Update the runbook
# 6. Destroy the test VPS
# Automated test with Vagrant
Vagrantfile:
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/noble64"
  config.vm.provision "shell", path: "recovery-test.sh"
end
vagrant up
# Script should complete without errors
vagrant destroy -f
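The contents of recovery-test.sh are not shown here. One way it might record the "time each step" measurements is a small wrapper around every runbook command. This is a sketch under assumptions: `run_step`, `TIMING_LOG`, and the log path are hypothetical, and the example steps are commented out so that nothing runs by accident.

```shell
#!/bin/bash
# Sketch of a timing wrapper for recovery-test.sh. run_step, TIMING_LOG and
# the example step commands are hypothetical; use the real runbook commands.

TIMING_LOG="${TIMING_LOG:-/tmp/recovery-timing.log}"

# Usage: run_step "step name" command [args...]
# Runs the command, appends "<name><TAB><seconds>s<TAB><status>" to the log,
# and returns the command's exit status.
run_step() {
    local name=$1 start end status result
    shift
    start=$(date +%s)
    if "$@"; then
        status=0
        result="ok"
    else
        status=$?
        result="failed($status)"
    fi
    end=$(date +%s)
    printf '%s\t%ss\t%s\n' "$name" "$((end - start))" "$result" >> "$TIMING_LOG"
    return "$status"
}

# Example steps (commented out so the sketch has no side effects):
# run_step "system packages" apt-get update
# run_step "restore nginx config" restic restore latest --target / --include /etc/nginx
```

After a test run, the log gives measured durations to compare against the estimates in the runbook, which makes it easier to spot the steps that need attention.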
Recovery Time Objectives
| Component | RTO Target | RPO Target |
|---|---|---|
| Web applications | 1 hour | 1 hour (hourly backups) |
| Databases | 30 minutes | 15 minutes (WAL shipping) |
| | 2 hours | Daily |
| Full server | 4 hours | 1 hour |
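As a rough consistency check, the per-step estimates in the recovery runbook can be added up and compared with the 4-hour full-server RTO. The sketch below copies the minute ranges from the runbook step headings; the variable names are illustrative, and real figures should come from your own timed tests.

```shell
#!/bin/bash
# Sketch: add up the runbook's per-step estimates (minutes) and compare the
# worst case with the 4-hour full-server RTO from the table.
# Each line of input is "min max", one line per runbook step (1 to 8).

RTO_MINUTES=240

total_min=0
total_max=0
while read -r lo hi; do
    total_min=$((total_min + lo))
    total_max=$((total_max + hi))
done <<'EOF'
15 15
15 15
20 20
30 60
15 30
5 5
10 10
30 30
EOF

echo "Estimated recovery: $total_min-$total_max minutes"
echo "Headroom against RTO: $((RTO_MINUTES - total_max)) minutes"
```

With the runbook's numbers this gives 140 to 185 minutes, which leaves 55 minutes of headroom in the worst case. That margin has to absorb anything the estimates leave out, such as waiting for hardware or downloading backups.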
Summary
A bare metal recovery plan is only as good as its last test. Document every step required to rebuild your server from scratch, automate what you can with configuration management tools, and test quarterly by actually performing the recovery on a fresh server. The goal is to bring actual recovery time within your recovery time objective (RTO), hours instead of days, and to ensure that no critical data or configuration is missing from your backups.