
Tune the Linux TCP Stack for Low Latency

By Admin · Mar 15, 2026 · Updated Apr 23, 2026

The default Linux TCP stack is tuned for general-purpose networking, not low-latency applications. For real-time services, trading platforms, game servers, and high-performance APIs, careful TCP tuning can reduce round-trip times by 20-50% and improve throughput under load. This guide covers the essential kernel parameters for latency-sensitive network workloads.

Congestion Control Algorithm

The choice of congestion control algorithm has the single biggest impact on TCP performance:

# Check available algorithms
sysctl net.ipv4.tcp_available_congestion_control

# Check current algorithm
sysctl net.ipv4.tcp_congestion_control

# BBR (Bottleneck Bandwidth and RTT) — best for most modern workloads
# Developed by Google, optimizes for throughput AND latency
sudo modprobe tcp_bbr
sudo sysctl net.ipv4.tcp_congestion_control=bbr
sudo sysctl net.core.default_qdisc=fq
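# fq provides the packet pacing that BBR expects; kernels 4.13+ can also pace
# from within TCP itself, but fq is still the recommended qdisc for BBR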

# Make permanent
echo "net.ipv4.tcp_congestion_control = bbr" | sudo tee -a /etc/sysctl.d/99-network.conf
echo "net.core.default_qdisc = fq" | sudo tee -a /etc/sysctl.d/99-network.conf

Advantages of BBR over CUBIC (the default):

  • Does not rely on packet loss to detect congestion — better for lossy networks
  • Achieves higher throughput on long-distance connections
  • Reduces latency by keeping buffer queues short
  • Better performance on satellite and mobile connections
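
After switching, it is worth confirming that BBR and fq are actually in effect (eth0 below stands in for your interface name):

# Live connections report their congestion control algorithm in ss -i output
ss -ti | grep -c bbr

# The interface qdisc should show fq
tc qdisc show dev eth0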

Buffer Sizes

# TCP buffer auto-tuning (min, default, max in bytes)
# For a 1Gbps link with 50ms RTT:
# Bandwidth-Delay Product = 1Gbps × 0.05s = 6.25MB

# Receive buffers
sudo sysctl net.ipv4.tcp_rmem="4096 87380 16777216"
sudo sysctl net.core.rmem_max=16777216
sudo sysctl net.core.rmem_default=262144

# Send buffers
sudo sysctl net.ipv4.tcp_wmem="4096 65536 16777216"
sudo sysctl net.core.wmem_max=16777216
sudo sysctl net.core.wmem_default=262144

# For 10Gbps links, increase max to 67108864 (64MB)
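# Sanity check: compute the BDP in the shell (10Gbps and 50ms RTT here are illustrative values)
echo $(( 10 * 1000 * 1000 * 1000 / 8 * 50 / 1000 ))   # 62500000 bytes, round up to 64MB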

Connection Handling

# Increase connection backlog for high-traffic servers
sudo sysctl net.core.somaxconn=65535
sudo sysctl net.ipv4.tcp_max_syn_backlog=65535
sudo sysctl net.core.netdev_max_backlog=65535

# Faster connection recycling
sudo sysctl net.ipv4.tcp_fin_timeout=10      # Default 60
sudo sysctl net.ipv4.tcp_tw_reuse=1          # Reuse TIME_WAIT sockets for outbound connections

# Enable SYN cookies to prevent SYN flood impact
sudo sysctl net.ipv4.tcp_syncookies=1

# Increase local port range for outbound connections
sudo sysctl net.ipv4.ip_local_port_range="1024 65535"
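
Note that somaxconn only raises the ceiling: the application must also pass a large enough backlog to listen(). To check whether the accept queue is actually overflowing (port 8080 below is a placeholder):

# Accept queue depth (Recv-Q) vs. limit (Send-Q) for a listening socket
ss -ltn sport = :8080

# Drops caused by a full accept queue since boot
nstat -az TcpExtListenOverflows TcpExtListenDrops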

Low-Latency Specific Settings

# Nagle's algorithm cannot be disabled system-wide via sysctl; it is a
# per-socket setting. Latency-sensitive applications should disable it
# themselves with setsockopt(TCP_NODELAY).
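# To check whether an application already disables Nagle, trace its socket option calls (./myserver is a hypothetical binary)
strace -f -e trace=setsockopt ./myserver 2>&1 | grep -i nodelay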

# Reduce the delayed ACK minimum (note: tcp_delack_min is not a mainline sysctl; only some vendor kernels expose it)
sudo sysctl net.ipv4.tcp_delack_min=5
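# On mainline kernels, delayed ACKs can instead be suppressed per route via the quickack attribute (gateway_ip is a placeholder for your default gateway)
sudo ip route change default via gateway_ip dev eth0 quickack 1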

# Enable TCP Fast Open (saves 1 RTT on connection establishment)
sudo sysctl net.ipv4.tcp_fastopen=3  # 1=client, 2=server, 3=both
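# TFO only helps when the application opts in (TCP_FASTOPEN socket option / MSG_FASTOPEN); confirm it is actually used via the kernel counters
nstat -az TcpExtTCPFastOpenActive TcpExtTCPFastOpenPassive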

# Busy polling for ultra-low latency (trades CPU for lower latency)
# Only for dedicated latency-sensitive servers
sudo sysctl net.core.busy_read=50      # microseconds
sudo sysctl net.core.busy_poll=50

Keep-Alive Configuration

# Detect dead connections faster
sudo sysctl net.ipv4.tcp_keepalive_time=60    # Start probes after 60s idle (default 7200)
sudo sysctl net.ipv4.tcp_keepalive_intvl=10   # Probe interval (default 75)
sudo sysctl net.ipv4.tcp_keepalive_probes=6   # Probes before declaring dead (default 9)

# Total detection time: 60 + (10 × 6) = 120 seconds
# Default: 7200 + (75 × 9) = 7875 seconds (over 2 hours!)
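
These sysctls only affect sockets on which the application has enabled SO_KEEPALIVE; live keep-alive timers can be inspected per connection:

# Show timers (keepalive, retransmission, etc.) on established connections
ss -to state established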

Interrupt Coalescing and CPU Affinity

# Reduce interrupt coalescing for lower latency (increases CPU usage)
sudo ethtool -C eth0 rx-usecs 0 tx-usecs 0
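# Verify the new coalescing settings
sudo ethtool -c eth0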

# Pin network interrupts to specific CPUs
# Find IRQ numbers for network interface
grep eth0 /proc/interrupts | awk '{print $1}' | tr -d ':'

# Pin IRQ 32 to CPU 0
echo 1 | sudo tee /proc/irq/32/smp_affinity
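# Equivalent, using the human-readable CPU-list interface
echo 0 | sudo tee /proc/irq/32/smp_affinity_list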

# Or use irqbalance with hints
# /etc/default/irqbalance
# IRQBALANCE_ARGS="--hintpolicy=exact"

Complete Sysctl Configuration

# /etc/sysctl.d/99-network.conf
# Congestion Control
net.ipv4.tcp_congestion_control = bbr
net.core.default_qdisc = fq

# Buffers
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

# Connections
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.core.netdev_max_backlog = 65535
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65535

# Low Latency
net.ipv4.tcp_fastopen = 3
net.ipv4.tcp_syncookies = 1

# Keep-Alive
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6

# Apply all settings
sudo sysctl --system

# Verify
sudo sysctl net.ipv4.tcp_congestion_control

Benchmarking

# Measure latency before and after
# TCP ping
hping3 -S -p 80 -c 100 example.com

# iperf3 for throughput
iperf3 -s  # On server
iperf3 -c server_ip -t 30 -P 4  # On client

# Netperf for request/response latency
netperf -H server_ip -t TCP_RR -l 30 -- -r 1,1

Summary

TCP stack tuning for low latency starts with switching to BBR congestion control, which alone can improve performance by 20%. Buffer sizing should match your bandwidth-delay product, connection handling parameters should be aggressive to handle high-traffic scenarios, and TCP Fast Open saves one round trip for repeat connections. For ultra-low-latency requirements, consider busy polling and interrupt affinity pinning, though these trade CPU cycles for reduced latency. Always benchmark before and after changes with tools like hping3 and iperf3 to quantify the improvements.
