
How to Use awk for Text Processing

By Admin · Mar 2, 2026 · Updated Apr 23, 2026

What is awk?

awk is a text-processing language specified by POSIX and installed by default on virtually every Linux distribution. It excels at extracting and transforming columnar data, which makes it well suited to log analysis, simple CSV processing, and system administration on your Breeze.

Basic Syntax

awk 'pattern { action }' filename

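An awk program is a series of pattern { action } rules. If the pattern is omitted, the action runs on every line; if the action is omitted, matching lines are printed. By default, awk splits each line into fields on runs of spaces and tabs. Fields are accessed with $1, $2, and so on; $0 represents the entire line.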
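To try the field variables without a log file at hand, here is a minimal sketch that pipes two made-up lines into awk. The sample data and the variable names (sample, fields, whole) are invented for illustration.

```shell
# Hypothetical sample data: name, age, city separated by spaces.
sample='alice 30 london
bob 25 paris'

# $3 and $1 are the third and first fields of each line.
fields=$(printf '%s\n' "$sample" | awk '{ print $3, $1 }')

# A pattern with an action: print the whole line ($0) when field 2 exceeds 26.
whole=$(printf '%s\n' "$sample" | awk '$2 > 26 { print $0 }')

echo "$fields"
echo "$whole"
```

The first command reorders columns on every line; the second prints only the line whose second field is greater than 26.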

Printing Specific Columns

# Print the first column
awk '{print $1}' /var/log/auth.log

# Print columns 1 and 4
awk '{print $1, $4}' access.log

# Join two fields with a literal string (string concatenation)
awk '{print $1 " -> " $7}' access.log

Setting Field Separators

# Parse /etc/passwd (colon-delimited)
awk -F: '{print $1, $3, $6}' /etc/passwd

# Parse simple CSV files (breaks on quoted fields that contain commas)
awk -F, '{print $2}' data.csv

# Multi-character separator (treated as a regular expression)
awk -F'::' '{print $1}' file.txt
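Because a separator longer than one character is interpreted as a regular expression, a bracket expression can split on several delimiters at once. The sketch below assumes a made-up input line that mixes colons and commas.

```shell
# Hypothetical input that mixes two delimiters.
line='web01:443,nginx'

# The bracket expression [:,] splits on either ":" or ",".
out=$(printf '%s\n' "$line" | awk -F'[:,]' '{ print $1, $2, $3 }')

echo "$out"
```

Quoting the separator in single quotes keeps the shell from interpreting the brackets.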

Pattern Matching

# Lines containing "error" (matching is case-sensitive)
awk '/error/ {print $0}' /var/log/syslog

# Lines where column 9 is 500 (HTTP status)
awk '$9 == 500 {print $0}' access.log

# Lines where column 3 is greater than 1000
awk '$3 > 1000 {print $1, $3}' data.txt

# Negate a pattern
awk '!/DEBUG/ {print}' app.log
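Patterns can also be combined. As a rough sketch, the ~ operator tests a regular expression against a single field rather than the whole line, and && requires both conditions to hold. The three log lines below are invented for illustration.

```shell
# Hypothetical log lines: method, path, status.
log='GET /index.html 200
POST /login 500
GET /api/users 500'

# Field 1 must be exactly GET and field 3 must equal 500.
out=$(printf '%s\n' "$log" | awk '$1 ~ /^GET$/ && $3 == 500 { print $2 }')

echo "$out"
```

Only the last line satisfies both conditions, so only its path is printed.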

Built-in Variables

  • NR — current line (record) number
  • NF — number of fields in the current line
  • FS — field separator (same as -F)
  • OFS — output field separator
  • FILENAME — current file being processed

# Print line numbers
awk '{print NR, $0}' file.txt

# Print the last field of each line
awk '{print $NF}' file.txt

# Skip the header row
awk 'NR > 1 {print $1, $3}' data.csv
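FS and OFS from the list above can be set inside the program as well as on the command line. Here is a small sketch, using made-up colon-delimited records, that sets both in a BEGIN block and prints NR and NF alongside the first field.

```shell
# Hypothetical colon-delimited records.
data='root:x:0
daemon:x:1'

# FS controls how input is split; OFS is what print emits for each comma.
out=$(printf '%s\n' "$data" | awk 'BEGIN { FS = ":"; OFS = "," } { print NR, NF, $1 }')

echo "$out"
```

Each output line shows the record number, the field count, and the first field, joined by the OFS value.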

Aggregation and Computation

# Sum a column
awk '{sum += $5} END {print "Total:", sum}' data.txt

# Average (guarded so an empty file does not cause a division by zero)
awk '{sum += $3; count++} END {if (count) print "Average:", sum/count}' data.txt

# Count lines matching a pattern (count+0 prints 0 instead of a blank line when nothing matches)
awk '/ERROR/ {count++} END {print count+0}' app.log

# Find the maximum value in column 2 (seeded from the first line, so negative values work too)
awk 'NR == 1 || $2 > max {max = $2} END {print "Max:", max}' data.txt
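The aggregations above can share a single pass over the input. The following sketch computes the sum, average, and maximum of one column; the host names and timings are invented for illustration.

```shell
# Hypothetical data: host name and response time in ms.
data='web01 120
web02 80
web03 250'

out=$(printf '%s\n' "$data" | awk '
  { sum += $2; count++ }              # accumulate on every line
  NR == 1 || $2 > max { max = $2 }    # seed max from the first line
  END { if (count) print sum, sum / count, max }
')

echo "$out"
```

Seeding max from the first line, rather than starting at 0, keeps the result correct when every value is negative.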

BEGIN and END Blocks

awk '
BEGIN { FS=","; OFS="\t"; print "Name\tAge\tCity" }
NR > 1 { print $1, $2, $3 }
END { print "Total records:", NR-1 }
' users.csv
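For a version that runs without a users.csv file, here is the same idea applied to an inline CSV. The names and values are invented for illustration; note that OFS only takes effect where print separates its arguments with commas.

```shell
# Hypothetical CSV with a header row.
csv='name,age,city
Ana,31,Lisbon
Raj,45,Pune'

out=$(printf '%s\n' "$csv" | awk '
  BEGIN  { FS = ","; OFS = "\t"; print "Name", "Age", "City" }   # runs before any input
  NR > 1 { print $1, $2, $3 }                                    # skip the header row
  END    { print "Total records: " (NR - 1) }                    # runs after the last line
')

echo "$out"
```

The header and data rows come out tab-separated, followed by a record count that excludes the header.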

Practical Server Administration Examples

# Top 10 client IPs in the access log, by request count
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -n 10

# Usage percentage and mount point for each filesystem
df -h | awk 'NR>1 {print $5, $6}'

# Memory totals from /proc/meminfo, converted from kB to MiB
awk '/^(MemTotal|MemAvailable):/ {printf "%s %.0f MiB\n", $1, $2/1024}' /proc/meminfo

# List users with UID >= 1000 (regular accounts; may also include "nobody")
awk -F: '$3 >= 1000 {print $1, $3}' /etc/passwd
