How to Use perf for Linux Performance Analysis

By Admin · Mar 2, 2026 · Updated Apr 23, 2026

perf is the profiler developed in the Linux kernel source tree. It exposes hardware performance counters, CPU sampling, and tracing, which makes it well suited to diagnosing CPU bottlenecks, cache misses, and system-level performance issues on your Breeze server.

Installing perf

# On Ubuntu
sudo apt install -y linux-tools-common linux-tools-$(uname -r)

# On Debian
sudo apt install -y linux-perf

# Verify installation
perf --version

Basic CPU Profiling

Record CPU samples for a running process or command:

# Profile a specific command
sudo perf record -g ./my-application

# Profile an already-running process by PID
# (-d, joins multiple PIDs with commas, which is the list format -p expects)
sudo perf record -g -p $(pgrep -d, nginx) -- sleep 30

# Profile the entire system for 10 seconds
sudo perf record -g -a -- sleep 10

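The -g flag captures call graphs, which are essential for understanding where time is spent. By default perf unwinds stacks using frame pointers; for binaries built without them, --call-graph dwarf usually produces more complete stacks at the cost of larger data files.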
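perf record -p expects a comma-separated list of PIDs, while pgrep prints one PID per line, so a service with several worker processes needs its PIDs joined first. A minimal sketch, assuming a POSIX shell; the helper name join_pids is hypothetical and not part of perf:

```shell
# join_pids: turn newline-separated PIDs on stdin into the
# comma-separated list that perf's -p option accepts.
join_pids() {
    paste -sd, -
}

# Example with literal PIDs (no running process needed):
printf '101\n102\n103\n' | join_pids   # prints 101,102,103
```

With a live service this would become sudo perf record -g -p "$(pgrep nginx | join_pids)" -- sleep 30. Running pgrep -d, nginx produces the same list without a helper.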

Viewing the Report

# Interactive TUI report
sudo perf report

# Text output (entries are sorted by overhead by default)
sudo perf report --stdio

# Show call graph in text mode
sudo perf report --stdio --call-graph=flat

Navigate the TUI with arrow keys. Press Enter to expand functions and see callers/callees.

Real-Time Monitoring with perf top

See which functions consume the most CPU in real time:

# System-wide live CPU profiling
sudo perf top

# Filter to a specific process (-d, joins multiple worker PIDs with commas)
sudo perf top -p $(pgrep -d, php-fpm)

# Show call graph
sudo perf top -g

Counting Hardware Events

Use perf stat to count hardware and software events over a run:

# Default counters for a command (task clock, context switches, page faults, cycles, instructions, branches)
sudo perf stat ./my-application

# Detailed statistics
sudo perf stat -d ./my-application

# Specific events
sudo perf stat -e cache-misses,cache-references,instructions,cycles ./my-application

Sample output:

 Performance counter stats for './my-application':
     1,234,567,890      cycles
       987,654,321      instructions              #    0.80  insn per cycle
         5,432,100      cache-references
           123,456      cache-misses              #    2.27 % of all cache refs
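The derived figures in the sample output are simple ratios: instructions divided by cycles, and cache misses divided by cache references. The sketch below recomputes them from perf stat's machine-readable output (-x,). It assumes the counter value is in the first field and the event name in the third. On some systems event names carry a suffix or PMU prefix, which this sketch does not handle, and the function name perf_ratios is hypothetical.

```shell
# perf_ratios: read `perf stat -x,` CSV lines on stdin and print
# instructions per cycle and the cache miss rate.
# Assumed field layout: value,unit,event-name,...
perf_ratios() {
    awk -F, '
        $3 == "cycles"           { cycles = $1 }
        $3 == "instructions"     { insns  = $1 }
        $3 == "cache-references" { refs   = $1 }
        $3 == "cache-misses"     { misses = $1 }
        END {
            if (cycles > 0) printf "ipc %.2f\n", insns / cycles
            if (refs > 0)   printf "miss_rate %.2f%%\n", 100 * misses / refs
        }'
}

# The counts from the sample output above, written in CSV form:
perf_ratios <<'EOF'
1234567890,,cycles
987654321,,instructions
5432100,,cache-references
123456,,cache-misses
EOF
```

For a real run, one option is to write the counters to a file with sudo perf stat -x, -o stats.csv -e cycles,instructions,cache-references,cache-misses ./my-application and then run perf_ratios < stats.csv.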

Generating Flame Graphs

Flame graphs provide the most intuitive visualization of CPU profiles:

# Clone Brendan Gregg's flame graph tools
git clone https://github.com/brendangregg/FlameGraph.git

# Record profile data
sudo perf record -F 99 -g -a -- sleep 30

# Generate the flame graph
sudo perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl > flamegraph.svg

Open the resulting SVG in a browser. Wide bars indicate functions consuming the most CPU time. Click to zoom into specific call stacks.
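stackcollapse-perf.pl reduces each sampled stack to one line of semicolon-separated frames followed by a sample count, and flamegraph.pl derives bar widths from those counts. The same folded data can be summarized as text. A minimal sketch that totals samples per leaf frame (the function at the top of the stack, so time spent in the function itself); self_time is a hypothetical helper name and the stacks below are made up for illustration:

```shell
# self_time: read folded stacks ("frame1;frame2;leaf count") on stdin
# and print total samples per leaf frame, highest first.
self_time() {
    awk '
        {
            count = $NF                  # last field is the sample count
            sub(/ [0-9]+$/, "")          # drop the count, keep the stack
            n = split($0, frames, ";")   # frames[n] is the leaf frame
            leaf[frames[n]] += count
        }
        END { for (f in leaf) print leaf[f], f }
    ' | sort -k1,1nr -k2
}

# Made-up folded stacks for illustration:
self_time <<'EOF'
main;parse;read_line 40
main;parse;tokenize 25
main;render 10
main;parse;read_line 5
EOF
```

In practice the input would come from sudo perf script | ./FlameGraph/stackcollapse-perf.pl rather than a here-document.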

Tracing System Calls

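# Trace system calls for a process for 10 seconds
sudo perf trace -p $(pgrep -d, myapp) -- sleep 10

# Show only system calls that took longer than 10 ms (--duration is a filter, not a time limit)
sudo perf trace -p $(pgrep -d, myapp) --duration 10 -- sleep 10

# Summarize syscall statistics
sudo perf trace -s -p $(pgrep -d, myapp) -- sleep 10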

Profiling Specific Scenarios

Common use cases for perf on a Breeze server:

  • High CPU from unknown source: run perf top to see the hottest functions right away
  • Slow disk I/O: trace block I/O requests with sudo perf record -e block:block_rq_issue -a -- sleep 10
  • Network latency: trace network events with sudo perf record -e 'net:*' -a -- sleep 10
  • Lock contention: record context switches with call stacks to see where threads block, using sudo perf record -e 'sched:sched_switch' -g -a -- sleep 10 (use -e 'syscalls:sys_enter_futex' instead to sample futex calls directly)
  • Memory allocation pressure: count page faults with sudo perf stat -e page-faults ./my-application

Best Practices

  • Use -g for call graphs: without them you see which functions are hot but not which callers lead to them
  • Install debug symbols: install the matching -dbgsym packages (apt install <package>-dbgsym) to get meaningful function names instead of hex addresses
  • Profile under realistic load: use a load testing tool to generate traffic while profiling
  • Keep recordings short: 10-30 seconds is usually sufficient; long recordings produce unwieldy data files
  • Compare before and after: record a baseline profile, make your optimization, then record again to measure the improvement
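
For the before-and-after comparison, a small amount of arithmetic turns two counter readings into a percentage. The sketch below assumes you have noted the same counter (for example cycles from perf stat) for the baseline and for the optimized run. pct_change is a hypothetical helper, and the numbers are illustrative, not measurements:

```shell
# pct_change: percentage change from a baseline value to a new value.
# Negative output means the new run used less of the counted resource.
pct_change() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f%%\n", 100 * (after - before) / before }'
}

# Illustrative values: 2,000,000 cycles before, 1,500,000 after.
pct_change 2000000 1500000   # prints -25.0%
```

For sampled profiles, recording with perf record -o baseline.data and perf record -o optimized.data, then running perf diff baseline.data optimized.data, compares per-symbol overhead between the two recordings.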
