Optimize Go Application Performance

By Admin · Mar 15, 2026 · Updated Apr 24, 2026

Go's compiled nature and efficient garbage collector make it performant out of the box, but production applications still benefit significantly from profiling and optimization. This guide covers Go's built-in profiling tools, common performance patterns, and optimization techniques that can reduce latency and memory usage in production deployments.

Profiling with pprof

Go includes a powerful profiling toolkit that runs with minimal overhead in production:

HTTP Server Profiling

// Import the pprof HTTP handlers
import _ "net/http/pprof"

// In your main function, start a debug server
go func() {
    log.Println(http.ListenAndServe("localhost:6060", nil))
}()

// Now you can access profiles at:
// http://localhost:6060/debug/pprof/

CPU Profiling

# Collect a 30-second CPU profile (quote the URL so the shell doesn't expand `?`)
go tool pprof "http://localhost:6060/debug/pprof/profile?seconds=30"

# Interactive commands in pprof:
(pprof) top 20          # Top 20 CPU consumers
(pprof) list functionName  # Line-by-line timings for a specific function
(pprof) web             # Open call graph in browser (requires Graphviz)
(pprof) svg > cpu.svg   # Export call graph as SVG

Memory Profiling

# Heap profile (current allocations)
go tool pprof http://localhost:6060/debug/pprof/heap

# Allocs profile (total allocations since start)
go tool pprof http://localhost:6060/debug/pprof/allocs

# In pprof:
(pprof) top 20 -cum    # Sort by cumulative allocations
(pprof) list functionName

# Compare two heap profiles to find leaks
go tool pprof -diff_base=before.prof after.prof

Reducing Allocations

Heap allocations are among the most common performance bottlenecks in Go programs: every allocation adds work for the garbage collector, so lowering the allocation rate often reduces both latency and CPU usage:

sync.Pool for Temporary Objects

var bufPool = sync.Pool{
    New: func() interface{} {
        return new(bytes.Buffer)
    },
}

func handleRequest(w http.ResponseWriter, r *http.Request) {
    buf := bufPool.Get().(*bytes.Buffer)
    buf.Reset()
    defer bufPool.Put(buf)

    // Use buf instead of creating new bytes.Buffer each time
    buf.WriteString("response data")
    w.Write(buf.Bytes())
}
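The pattern above can be exercised as a standalone program; `render` is an illustrative name, not part of any library. Note the `Reset` call: a pooled buffer may still hold data from its previous use.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// render builds a greeting using a pooled buffer (hypothetical example).
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // The pool may hand back a previously used buffer
	defer bufPool.Put(buf)

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String()
}

func main() {
	fmt.Println(render("alice")) // prints "hello, alice"
	fmt.Println(render("bob"))   // likely reuses the same buffer
}
```

Pools shine for objects that are expensive to allocate and used briefly; the GC may empty the pool at any time, so never rely on it for correctness.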

Pre-allocate Slices and Maps

// BAD: grows dynamically, causes multiple allocations
var results []string
for _, item := range items {
    results = append(results, process(item))
}

// GOOD: single allocation
results := make([]string, 0, len(items))
for _, item := range items {
    results = append(results, process(item))
}

// Same for maps
m := make(map[string]int, expectedSize)

Avoid String Concatenation in Loops

// BAD: creates new string each iteration
var s string
for _, item := range items {
    s += item.Name + ","  // O(n²) bytes copied across iterations
}

// GOOD: use strings.Builder
var b strings.Builder
b.Grow(len(items) * 20)  // Pre-allocate estimate
for _, item := range items {
    b.WriteString(item.Name)
    b.WriteByte(',')
}
s := b.String()

Escape Analysis

# See what escapes to the heap
go build -gcflags="-m" ./...

# Verbose output
go build -gcflags="-m -m" ./...

# Common escape causes:
# - Returning pointers to local variables
# - Sending to channels
# - Storing in interface values
# - Closures capturing variables
# - Slices that grow beyond stack size

// Stack allocated (fast)
func sum(a, b int) int {
    return a + b
}

// Heap allocated (slower - pointer escapes)
func newUser(name string) *User {
    u := User{Name: name}  // escapes to heap
    return &u
}

Goroutine Management

// Limit concurrent goroutines with semaphore pattern
func processItems(items []Item) {
    sem := make(chan struct{}, 100) // Max 100 concurrent
    var wg sync.WaitGroup

    for _, item := range items {
        sem <- struct{}{} // Blocks when 100 goroutines are in flight
        wg.Add(1)
        go func(item Item) {
            defer wg.Done()
            defer func() { <-sem }() // Release the slot
            process(item)
        }(item)
    }
    wg.Wait()
}
