Test Backup Integrity with Automated Restore Verification

By Admin · Mar 15, 2026 · Updated Jun 25, 2026 · 520 views · 3 min read

Why Test Backup Integrity?

A backup that cannot be restored is not a backup. Studies show that up to 30% of restores fail due to corruption, configuration drift, or incomplete data. Automated restore testing ensures your backups are recoverable before you need them in an emergency.

The Verification Pipeline

Restore — Restore the latest backup to a test environment
Validate — Check data integrity, row counts, and checksums
Report — Log results and alert on failures
Cleanup — Tear down the test environment

MySQL Backup Verification

#!/bin/bash
# /usr/local/bin/verify-mysql-backup.sh
set -euo pipefail

BACKUP_DIR="/var/backups/mysql"
TEST_DB="backup_verify_test"
LOG="/var/log/backup-verify.log"

log() { echo "[$(date)] $1" >> "$LOG"; }

LATEST=$(ls -t "$BACKUP_DIR"/*.sql.gz 2>/dev/null | head -1)
if [ -z "$LATEST" ]; then
    log "ERROR: No backup files found"
    exit 1
fi

BACKUP_AGE=$(( ($(date +%s) - $(stat -c %Y "$LATEST")) / 3600 ))
log "Testing: $LATEST (${BACKUP_AGE}h old)"

mysql -e "DROP DATABASE IF EXISTS $TEST_DB; CREATE DATABASE $TEST_DB;"

RESTORE_START=$(date +%s)
if zcat "$LATEST" | mysql "$TEST_DB" 2>>"$LOG"; then
    RESTORE_TIME=$(( $(date +%s) - RESTORE_START ))
    log "Restore OK in ${RESTORE_TIME}s"
else
    log "ERROR: Restore failed"
    mysql -e "DROP DATABASE IF EXISTS $TEST_DB;"
    exit 1
fi

ERRORS=0
TABLE_COUNT=$(mysql -N -e "SELECT COUNT(*) FROM information_schema.tables WHERE table_schema='$TEST_DB';")
[ "$TABLE_COUNT" -lt 5 ] && ERRORS=$((ERRORS + 1))

for TABLE in users orders products; do
    COUNT=$(mysql -N -e "SELECT COUNT(*) FROM \`$TEST_DB\`.\`$TABLE\`;" 2>/dev/null || echo "0")
    if [ "$COUNT" -eq 0 ]; then
        log "ERROR: Table $TABLE is empty"
        ERRORS=$((ERRORS + 1))
    else
        log "OK: $TABLE has $COUNT rows"
    fi
done

mysql -e "DROP DATABASE IF EXISTS $TEST_DB;"

if [ "$ERRORS" -eq 0 ]; then
    log "SUCCESS: $TABLE_COUNT tables verified in ${RESTORE_TIME}s"
    curl -fsS "https://hc-ping.com/YOUR-UUID" >/dev/null 2>&1
else
    log "FAILED: $ERRORS errors"
    exit 1
fi

File Backup Verification

#!/bin/bash
RESTIC_REPO="rclone:b2-backup:my-server-backup"
export RESTIC_PASSWORD_FILE="/root/.restic-password"
VERIFY_DIR="/tmp/backup-verify-$$"

# Check repository integrity
restic -r "$RESTIC_REPO" check --read-data-subset=5%

# Test restore of critical files
mkdir -p "$VERIFY_DIR"
restic -r "$RESTIC_REPO" restore latest --target "$VERIFY_DIR" \
    --include "/etc/nginx"

for f in /etc/nginx/nginx.conf; do
    if [ -f "$VERIFY_DIR$f" ] && [ -s "$VERIFY_DIR$f" ]; then
        echo "OK: $f restored"
    else
        echo "ERROR: $f missing"
    fi
done

rm -rf "$VERIFY_DIR"

Scheduling

# Daily at 6 AM (after 2 AM backup)
0 6 * * * /usr/local/bin/verify-mysql-backup.sh
30 6 * * * /usr/local/bin/verify-file-backup.sh

Prometheus Metrics

cat > /var/lib/prometheus/node-exporter/backup_verify.prom << EOF
# HELP backup_verify_success Last verification result
# TYPE backup_verify_success gauge
backup_verify_success{type="mysql"} 1
# HELP backup_verify_duration_seconds Time to verify
# TYPE backup_verify_duration_seconds gauge
backup_verify_duration_seconds{type="mysql"} ${RESTORE_TIME}
EOF

Best Practices

Run verification daily for critical databases, weekly for less critical data
Always test on a separate database, never overwrite production
Verify both integrity and restore performance
Alert on failures AND missed verification runs
Keep verification logs for audit trails
Do full manual restores periodically to test complete procedures
Track restore time trends to spot growing backup sizes