Keeping servers patched is critical for security but manually updating multiple servers is time-consuming and error-prone. Automated patching ensures consistent security updates across your entire fleet while minimizing downtime through rolling updates and automatic rollback. This guide covers building a robust patching automation system.
Patching Strategy
- Security patches: Apply within 24-72 hours of release
- Minor updates: Weekly or bi-weekly maintenance window
- Major upgrades: Quarterly, with testing in staging first
- Kernel updates: Monthly, require reboot scheduling
Ansible-Based Patching Automation
# playbooks/patch-servers.yml
---
- name: Automated Server Patching
hosts: all
become: yes
serial: "30%" # Patch 30% of servers at a time
max_fail_percentage: 10
vars:
reboot_required: false
patch_timeout: 600
exclude_packages:
- mysql-server # Don't auto-update databases
- postgresql*
pre_tasks:
- name: Create pre-patch snapshot/backup point
shell: |
echo "Pre-patch checkpoint: $(date)" >> /var/log/patch-history.log
# If using LVM, create snapshot
# lvcreate -L 5G -s -n pre-patch /dev/vg0/root
- name: Check current kernel version
command: uname -r
register: pre_patch_kernel
changed_when: false
tasks:
# Debian/Ubuntu patching
- name: Update apt cache
apt:
update_cache: yes
cache_valid_time: 3600
when: ansible_os_family == "Debian"
- name: Install security updates (Debian/Ubuntu)
apt:
upgrade: safe
autoremove: yes
autoclean: yes
register: apt_result
when: ansible_os_family == "Debian"
# RHEL/CentOS/Rocky patching
- name: Install security updates (RHEL)
dnf:
name: "*"
state: latest
security: yes
exclude: "{{ exclude_packages | join(',') }}"
register: dnf_result
when: ansible_os_family == "RedHat"
- name: Check if reboot is required
stat:
path: /var/run/reboot-required
register: reboot_file
when: ansible_os_family == "Debian"
- name: Set reboot flag
set_fact:
reboot_required: true
when: >
(reboot_file.stat.exists is defined and reboot_file.stat.exists) or
(dnf_result.changed is defined and dnf_result.changed)
- name: Reboot if required
reboot:
reboot_timeout: 300
pre_reboot_delay: 5
post_reboot_delay: 30
msg: "Automated patching reboot"
when: reboot_required | bool
- name: Wait for server to be fully available
wait_for_connection:
delay: 10
timeout: 300
when: reboot_required | bool
post_tasks:
- name: Verify services are running
systemd:
name: "{{ item }}"
state: started
loop: "{{ critical_services | default(['sshd']) }}"
- name: Run health check
uri:
url: "http://localhost:{{ health_check_port | default(80) }}/health"
status_code: 200
when: health_check_port is defined
retries: 5
delay: 10
- name: Log patch completion
lineinfile:
path: /var/log/patch-history.log
line: "Patched successfully at {{ ansible_date_time.iso8601 }}"
create: yes
Scheduling with Cron
#!/bin/bash
# /opt/scripts/run-patching.sh
set -euo pipefail
LOG="/var/log/automated-patching-$(date +%Y%m%d).log"
INVENTORY="/opt/ansible/inventory/production"
PLAYBOOK="/opt/ansible/playbooks/patch-servers.yml"
echo "=== Patching started at $(date) ===" | tee -a "$LOG"
cd /opt/ansible
ansible-playbook "$PLAYBOOK" \
-i "$INVENTORY" \
--extra-vars "patch_date=$(date +%Y%m%d)" \
-v 2>&1 | tee -a "$LOG"
EXIT_CODE=${PIPESTATUS[0]}
if [ $EXIT_CODE -eq 0 ]; then
echo "=== Patching completed successfully at $(date) ===" | tee -a "$LOG"
else
echo "=== Patching FAILED with exit code $EXIT_CODE at $(date) ===" | tee -a "$LOG"
# Send alert
curl -X POST "https://hooks.slack.com/services/YOUR/WEBHOOK/URL" \
-H 'Content-type: application/json' \
-d "{\"text\": \"Server patching FAILED. Check $LOG\"}"
fi
# Crontab entry: run every Sunday at 3 AM
# 0 3 * * 0 /opt/scripts/run-patching.sh
Patch Compliance Reporting
# playbooks/patch-audit.yml
---
- name: Patch Compliance Audit
hosts: all
become: yes
gather_facts: yes
tasks:
- name: Check available updates (Debian)
shell: apt list --upgradable 2>/dev/null | tail -n +2 | wc -l
register: available_updates
changed_when: false
when: ansible_os_family == "Debian"
- name: Check last patch date
shell: stat -c %Y /var/log/apt/history.log 2>/dev/null || echo "0"
register: last_patch_timestamp
changed_when: false
when: ansible_os_family == "Debian"
- name: Check reboot required
stat:
path: /var/run/reboot-required
register: needs_reboot
- name: Generate report
set_fact:
patch_report:
hostname: "{{ inventory_hostname }}"
os: "{{ ansible_distribution }} {{ ansible_distribution_version }}"
kernel: "{{ ansible_kernel }}"
pending_updates: "{{ available_updates.stdout | default('N/A') }}"
needs_reboot: "{{ needs_reboot.stat.exists | default(false) }}"
uptime: "{{ ansible_uptime_seconds | int // 86400 }} days"
- name: Display report
debug:
var: patch_report
# Run: ansible-playbook patch-audit.yml -i inventory/production
Best Practices
- Patch staging first: Always test patches in a non-production environment
- Use rolling updates with
serialto avoid taking down all servers simultaneously - Exclude critical packages like database servers from automatic updates
- Schedule reboots during maintenance windows for kernel updates
- Monitor after patching: Run health checks and verify services are running
- Keep audit logs of all patching activities for compliance
- Set up alerts for failed patching runs via Slack, email, or PagerDuty