A segmentation fault (segfault) occurs when a program tries to access memory it is not permitted to access; the kernel responds by sending it SIGSEGV, which terminates the process by default. On servers, segfaults crash applications and services, causing outages. This guide covers identifying the cause of segfaults, generating useful crash information, and applying fixes.
What Causes Segfaults?
- Null pointer dereference: Accessing memory at address 0x0
- Buffer overflow: Writing beyond allocated memory bounds
- Use-after-free: Accessing memory that's been deallocated
- Stack overflow: Exceeding the stack size limit
- Corrupted shared libraries: Damaged .so files
- Hardware issues: Failing RAM causing memory corruption
Step 1: Gather Crash Information
# Check system logs for segfault messages
dmesg | grep -i "segfault"
journalctl | grep -i "segfault"
# Typical segfault log entry:
# nginx[12345]: segfault at 0 ip 00007f8a1234 sp 00007ffc5678 error 4 in libc.so.6[7f8a0000+1e5000]
# Decode the error code (the kernel prints it in hexadecimal):
# error 4 = user-mode access (bit 2=1), read (bit 1=0), page not present (bit 0=0)
# Bit 0: 0=not-present page, 1=protection fault
# Bit 1: 0=read, 1=write
# Bit 2: 0=kernel-mode, 1=user-mode
# Check for core dumps
ls -la /var/lib/systemd/coredump/
coredumpctl list
coredumpctl info PID
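The bit layout described above can be decoded mechanically. The following is a minimal sketch, assuming an x86 system and looking only at the three low bits listed in this guide (higher bits, if set, are ignored). The function name decode_segfault_error is made up for illustration.

```shell
# Decode the low three bits of an x86 page-fault error code.
# The kernel prints the code in hexadecimal, so "error 14" means 0x14.
decode_segfault_error() {
    code=$((0x$1))
    if [ $((code & 1)) -ne 0 ]; then cause="protection fault"; else cause="page not present"; fi
    if [ $((code & 2)) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $((code & 4)) -ne 0 ]; then mode="user-mode"; else mode="kernel-mode"; fi
    echo "$mode $access, $cause"
}

decode_segfault_error 4   # user-mode read, page not present
decode_segfault_error 6   # user-mode write, page not present
decode_segfault_error 7   # user-mode write, protection fault
```

A user-mode read of a not-present page at address 0, as in the sample log entry, is the typical signature of a null pointer dereference.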
Step 2: Enable Core Dumps
# Check current core dump settings
ulimit -c # Should not be 0
# Enable core dumps (applies to the current shell and the processes it starts)
ulimit -c unlimited
# For systemd services, set in the service file:
[Service]
LimitCORE=infinity
# Configure systemd-coredump
cat /etc/systemd/coredump.conf
# [Coredump]
# Storage=external
# Compress=yes
# ProcessSizeMax=2G
# ExternalSizeMax=2G
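# Or write core files directly by setting the core pattern
# (this replaces the systemd-coredump handler until reboot, so coredumpctl will not see new dumps)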
echo '/tmp/core.%e.%p.%t' | sudo tee /proc/sys/kernel/core_pattern
# Reproduce the crash
sudo systemctl restart problematic-service
# Check for core dump
ls -la /tmp/core.*
coredumpctl list
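To predict which file name a given core pattern will produce, the template can be expanded by hand. The sketch below is a simplified illustration, not the kernel's own logic: it handles only %e (executable name), %p (PID) and %t (UNIX timestamp), and assumes the substituted values contain no characters special to sed. The function name expand_core_pattern is hypothetical.

```shell
# Expand a core_pattern template for a given program name, PID and timestamp.
# Only %e, %p and %t are handled; any other specifier is left untouched.
# A pattern starting with "|" is piped to a handler program instead of a file.
expand_core_pattern() {
    pattern=$1
    exe=$2
    pid=$3
    ts=$4
    case $pattern in
        '|'*)
            printf 'piped to a handler: %s\n' "${pattern#?}"
            return 0
            ;;
    esac
    printf '%s\n' "$pattern" |
        sed -e "s/%e/$exe/g" -e "s/%p/$pid/g" -e "s/%t/$ts/g"
}

expand_core_pattern '/tmp/core.%e.%p.%t' nginx 12345 1710489600
```

With the pattern set above, this yields the same file name that Step 3 passes to gdb. On a live system the active pattern can be read with cat /proc/sys/kernel/core_pattern and passed in as the first argument.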
Step 3: Analyze with GDB
# Install debugging tools
sudo apt install gdb # Debian/Ubuntu
sudo dnf install gdb # RHEL/Rocky
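# Install debug symbols for the crashed binary (package names vary by distribution and release)
sudo apt install libc6-dbg # glibc debug symbols (Debian/Ubuntu)
sudo apt install nginx-dbg # Nginx debug symbols on older releases; newer ones provide nginx-dbgsym from the debug symbol archive
sudo dnf debuginfo-install nginx glibc # RHEL/Rocky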
# Analyze core dump
gdb /usr/sbin/nginx /tmp/core.nginx.12345.1710489600
# In GDB:
(gdb) bt # Show backtrace (most important!)
(gdb) bt full # Show backtrace with local variables
(gdb) info threads # List all threads
(gdb) thread 2 # Switch to thread 2
(gdb) bt # Backtrace for that thread
(gdb) frame 3 # Go to frame 3 in the backtrace
(gdb) info locals # Show local variables
(gdb) print variable # Print a specific variable
# Using coredumpctl directly (opens the matching dump in gdb)
coredumpctl debug PID
coredumpctl gdb PID # Older name for the same command
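When no core dump is available, the kernel log line from Step 1 still carries enough to narrow down the crash site: subtracting the mapping base (the first number in the square brackets) from the instruction pointer gives the offset of the faulting instruction inside that mapping. A minimal sketch of the arithmetic, with fault_offset as a hypothetical helper name:

```shell
# Compute the offset of the faulting instruction inside the mapped region.
# Arguments: instruction pointer and mapping base, both in hex without "0x",
# exactly as printed in the kernel log line.
fault_offset() {
    printf '0x%x\n' $((0x$1 - 0x$2))
}

# Values taken from the sample log entry:
# ip 00007f8a1234 ... in libc.so.6[7f8a0000+1e5000]
fault_offset 00007f8a1234 7f8a0000   # 0x1234
```

With debug symbols installed, an offset like this can be looked up with a tool such as addr2line -f -e followed by the library path and the offset. This assumes the logged mapping starts at the beginning of the file; if the executable segment is mapped separately, its starting offset has to be added first.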
Step 4: Common Fixes
Corrupted Libraries
# Check if shared libraries are valid
ldd /usr/sbin/nginx
# "not found" = missing library
# Verify library integrity
debsums --changed # Debian/Ubuntu
rpm -Va # RHEL/Rocky
# Reinstall damaged packages
sudo apt install --reinstall nginx libc6
sudo dnf reinstall nginx glibc
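Scanning ldd output by eye is error-prone on binaries with many dependencies. A small filter can list only the unresolved libraries. This is a sketch that assumes the usual "name => not found" output form; the function name missing_libs and the sample library names are made up for illustration.

```shell
# Print the names of libraries that ldd reports as "not found".
# Reads ldd output on stdin so it can also be run on captured text.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage on a real binary: ldd /usr/sbin/nginx | missing_libs
# Demonstration with hypothetical ldd output:
printf '\tlibssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007f0000000000)\n\tlibfoo.so.1 => not found\n' | missing_libs
```

An empty result means every dependency resolved, in which case the integrity checks above are the next thing to try.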
Memory Issues
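# Test RAM for errors (memtest86+ runs from the boot menu, not from a running system)
sudo apt install memtest86+ # Then reboot and select Memtest86+ in the GRUB menu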
# Check for memory-related kernel messages
dmesg | grep -i "memory\|edac\|mce\|hardware error"
# Use AddressSanitizer for custom applications
gcc -fsanitize=address -g -o myapp myapp.c
# Optionally configure ASan at run time, then run the instrumented binary
export ASAN_OPTIONS=detect_leaks=1:log_path=/tmp/asan
./myapp
Stack Size Issues
# Check current stack size
ulimit -s # Default: 8192 (8MB)
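# Increase stack size for a service to 16MB
# (value in bytes; keep comments on their own line, unit files do not support trailing comments)
[Service]
LimitSTACK=16777216
# Or for all users through PAM limits (value in KB, so 16384 = 16MB; takes effect at next login)
echo "* soft stack 16384" | sudo tee -a /etc/security/limits.conf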
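The two settings above use different units: LimitSTACK takes bytes, while limits.conf and ulimit -s work in KB. A short sketch of the conversion, with hypothetical helper names, shows that both examples describe the same 16MB limit:

```shell
# Convert between the KB values used by ulimit -s and limits.conf
# and the byte values used by LimitSTACK in a systemd unit.
kib_to_bytes() {
    echo $(($1 * 1024))
}

bytes_to_kib() {
    echo $(($1 / 1024))
}

kib_to_bytes 16384      # 16777216
bytes_to_kib 16777216   # 16384
```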
Application-Specific Fixes
# PHP segfaults: often extension-related
php -m # List loaded modules
# Disable extensions one by one to find the culprit
# Python segfaults: usually in C extensions
python3 -X faulthandler app.py # Dumps the Python traceback on a fatal signal (or set PYTHONFAULTHANDLER=1)
# Node.js segfaults: usually native addons
node --report-on-fatalerror app.js
# Generates a diagnostic report on fatal runtime errors; a crash inside an addon may still need a core dump
Prevention
# Keep software updated — segfaults are often fixed in patches
sudo apt update && sudo apt upgrade
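# Keep ASLR (Address Space Layout Randomization) enabled; 2 is the usual default
# ASLR makes memory bugs harder to exploit rather than preventing the crash itself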
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space
# Monitor for segfaults
# Add to your monitoring:
journalctl -kf | grep --line-buffered "segfault" | while IFS= read -r line; do
echo "SEGFAULT DETECTED: $line" | mail -s "Segfault Alert" admin@example.com
done
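Mailing every matching line can flood an inbox when a service crashes in a loop. One option is to summarize crashes per process before alerting. The sketch below assumes log lines in the "name[pid]: segfault at ..." form shown in Step 1 and process names without spaces. The function name count_segfaults, the host name and the timestamps in the sample input are hypothetical.

```shell
# Count segfault log lines per process name, most frequent first.
# Reads log lines on stdin (for example from dmesg or journalctl -k).
count_segfaults() {
    awk '
    {
        for (i = 2; i <= NF; i++) {
            # The field before "segfault" looks like "nginx[12345]:"
            if ($i == "segfault" && $(i - 1) ~ /\[[0-9]+\]:$/) {
                name = $(i - 1)
                sub(/\[[0-9]+\]:$/, "", name)
                count[name]++
            }
        }
    }
    END { for (n in count) print count[n], n }
    ' | sort -rn
}

# Demonstration with hypothetical log lines:
count_segfaults <<'EOF'
Mar 15 08:00:00 web1 kernel: nginx[12345]: segfault at 0 ip 00007f8a1234 sp 00007ffc5678 error 4 in libc.so.6[7f8a0000+1e5000]
Mar 15 08:00:05 web1 kernel: nginx[12346]: segfault at 0 ip 00007f8a1234 sp 00007ffc5678 error 4 in libc.so.6[7f8a0000+1e5000]
Mar 15 08:01:00 web1 kernel: php-fpm[222]: segfault at 10 ip 00007f8a2222 sp 00007ffc1111 error 6 in libc.so.6[7f8a0000+1e5000]
EOF
```

Run periodically over a fixed time window (for example journalctl -k --since "1 hour ago" | count_segfaults), the output can feed a single summary alert instead of one mail per crash.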
Best Practices
- Always get a backtrace: Run gdb binary corefile, then bt. This tells you exactly where the crash happened
- Install debug symbols for meaningful backtraces instead of memory addresses
- Check for known bugs: Search the backtrace output in the project's bug tracker
- Update first: Many segfaults are fixed in newer versions of the software
- Test RAM: Random, hard-to-reproduce segfaults across different programs often indicate bad memory
- Report bugs: If you find a genuine segfault in open-source software, report it with your backtrace