
Linux for Production Support Ch 15 / 32 Intermediate


Arjun becomes a log wizard

awk, sed, cut, sort, uniq — turn any log file into a data report in seconds



In this chapter

Arjun

Support engineer, data-heavy fintech team

The story

Arjun's manager sent him a 2GB log file and asked: How many unique users had errors today? Which error type happened most? What was the peak error hour?

Two months ago he would have opened it in a text editor, searched manually, and spent 3 hours. Today he opened a terminal and had all three answers in 45 seconds.

The log file format:

2026-03-17 14:23:45 ERROR user_id=10234 PaymentService NullPointerException
2026-03-17 14:23:46 INFO  user_id=10234 PaymentService request completed
2026-03-17 14:24:01 ERROR user_id=98712 AuthService TokenExpiredException

Question 1: How many unique users had errors?

grep "ERROR" app.log | grep -oP "user_id=\K[0-9]+" | sort -u | wc -l

Breaking this down:

grep "ERROR" keeps only error lines
grep -oP prints only the matched text (-o) using Perl-compatible regex (-P, a GNU grep option); \K drops "user_id=" from the match, leaving just the number
sort -u sorts and removes duplicates
wc -l counts remaining lines

Result: 847 unique users. 45 seconds.
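One caveat with grep "ERROR" is that it matches the word anywhere on the line, including inside a message. A minimal sketch of a stricter variant follows, assuming the log layout shown above (level in field 3, a user_id=N token somewhere after it). The helper name count_error_users is hypothetical.

```shell
# Hypothetical helper: count distinct user IDs on lines whose level field ($3) is ERROR.
# Assumes the layout shown earlier: date, time, level, user_id=N, service, exception.
count_error_users() {
  awk '$3 == "ERROR" {
         for (i = 4; i <= NF; i++)            # find the user_id=N token
           if ($i ~ /^user_id=[0-9]+$/) {
             split($i, kv, "=")               # kv[2] is the numeric ID
             seen[kv[2]] = 1
           }
       }
       END { n = 0; for (u in seen) n++; print n }' "$1"
}

# Usage: count_error_users app.log
```

This reads the file once and needs no Perl regex support, so it should also work where grep -P is unavailable.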

Question 2: Which error type happened most?

grep "ERROR" app.log | awk '{print $NF}' | sort | uniq -c | sort -rn | head -10

Breaking this down:

awk '{print $NF}' prints the last field (the exception class name)
sort groups identical values together
uniq -c counts consecutive duplicates
sort -rn sorts by count, highest first
head -10 shows top 10

Result:

1847 NullPointerException
 923 TokenExpiredException
 412 DatabaseTimeoutException
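The sort before uniq -c matters because uniq only merges adjacent identical lines. The small sketch below uses three made-up input lines (not real log data) to compare the number of groups with and without the sort.

```shell
# Three lines, two distinct values, with the duplicates not adjacent.
sample() {
  printf '%s\n' NullPointerException TokenExpiredException NullPointerException
}

# Without sort, uniq sees three separate runs and reports three groups.
groups_unsorted=$(sample | uniq -c | wc -l)

# With sort, identical lines become adjacent and collapse into two groups.
groups_sorted=$(sample | sort | uniq -c | wc -l)

echo "without sort: $((groups_unsorted)) groups"
echo "with sort:    $((groups_sorted)) groups"
```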

Question 3: What was the peak error hour?

grep "ERROR" app.log | cut -d' ' -f2 | cut -d: -f1 | sort | uniq -c | sort -rn

cut -d' ' -f2 gets the time field (14:23:45)
cut -d: -f1 gets just the hour (14)
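If only the single busiest hour is needed, the same idea can be wrapped in a small function. This is a sketch under the assumption that field 2 is HH:MM:SS and field 3 is the level; peak_error_hour is a hypothetical name, and ties between hours are not handled specially.

```shell
# Hypothetical helper: print the hour (00-23) with the most ERROR lines.
# Assumes field 2 is HH:MM:SS and field 3 is the log level.
peak_error_hour() {
  awk '$3 == "ERROR" { print substr($2, 1, 2) }' "$1" |
    sort | uniq -c | sort -rn | head -1 | awk '{ print $2 }'
}

# Usage: peak_error_hour app.log
```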

The awk command in depth. By default, awk splits every line into whitespace-separated fields: $1, $2, and so on. $0 is the whole line, NF is the number of fields, and $NF is the last field.

# Print columns 1 and 3
awk '{print $1, $3}' app.log

# Filter and print
awk '$3 == "ERROR" {print $0}' app.log

# Average response time, assuming it is in column 8 (skips the division if the file is empty)
awk '{sum += $8; count++} END {if (count) print "Average:", sum/count}' response.log

# Count errors per service (the service name is field 5; field 4 is user_id=...)
awk '$3 == "ERROR" {services[$5]++} END {for (s in services) print services[s], s}' app.log | sort -rn
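When it is unclear which number a field has, it can help to let awk label them. The sketch below prints every field of the first input line next to its field number; show_fields is a hypothetical helper name, and it reads standard input when no file is given.

```shell
# Print each whitespace-separated field of the first line with its awk field number.
# show_fields is a hypothetical helper name.
show_fields() {
  awk 'NR == 1 { for (i = 1; i <= NF; i++) printf "$%d = %s\n", i, $i }' "$@"
}

# Usage: show_fields app.log
#    or: head -1 app.log | show_fields
```

For the log format in this chapter, this makes it easy to confirm that the level sits in field 3 and the service name in field 5.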

The sed command edits text as it streams through, most often to substitute or delete lines:

# Replace ERROR with CRITICAL
sed 's/ERROR/CRITICAL/g' app.log

# Delete DEBUG lines
sed '/DEBUG/d' app.log

# In-place edit (modifies the file directly):
sed -i 's/old_hostname/new_hostname/g' config.xml

# Remove blank lines
sed '/^$/d' app.log
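Because sed -i overwrites the file, a cautious variation is to keep a backup and review the difference. The sketch below assumes the old and new strings contain no slashes or regex special characters; replace_host is a hypothetical helper name.

```shell
# Hypothetical helper: replace a hostname in place, keeping the original as FILE.bak.
# -i.bak (suffix attached, no space) is accepted by both GNU and BSD sed.
replace_host() {
  file=$1 old=$2 new=$3
  sed -i.bak "s/$old/$new/g" "$file"
  # Show what changed; diff exits non-zero when the files differ, which is expected here.
  diff "$file.bak" "$file" || true
}

# Usage: replace_host config.xml old_hostname new_hostname
```

If the change looks wrong, moving config.xml.bak back over config.xml restores the original.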

Building a daily report script:

#!/bin/bash
LOG="/opt/app/logs/app.log"
DATE=$(date +%Y-%m-%d)

echo "=== Daily Report: $DATE ==="
echo "Total log lines:"
grep -c "^$DATE" "$LOG"

echo "Error breakdown:"
grep "$DATE" "$LOG" | grep "ERROR" | awk '{print $NF}' | sort | uniq -c | sort -rn | head -10

echo "Unique users with errors:"
grep "$DATE" "$LOG" | grep "ERROR" | grep -oP "user_id=\K[0-9]+" | sort -u | wc -l

Arjun runs this script every morning. The whole report generates in 3 seconds.
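On very large files, each grep pipeline in the script rereads the whole log. One possible variation, sketched below, gathers the headline numbers in a single awk pass. It assumes the same layout as the sample lines (date in field 1, level in field 3); daily_summary and its output format are hypothetical.

```shell
# Hypothetical single-pass variant: one awk run instead of several grep pipelines.
# Usage: daily_summary /opt/app/logs/app.log "$(date +%Y-%m-%d)"
daily_summary() {
  awk -v day="$2" '
    $1 == day {
      lines++
      if ($3 == "ERROR") {
        errors++
        for (i = 4; i <= NF; i++)
          if ($i ~ /^user_id=/) users[$i] = 1   # remember each distinct user
      }
    }
    END {
      n = 0
      for (u in users) n++
      printf "lines=%d errors=%d error_users=%d\n", lines, errors, n
    }' "$1"
}
```

Comparing the date against field 1 also avoids counting lines that merely mention the date somewhere in a message.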

Key takeaways

awk processes columns: $1 is field 1, $NF is last field, the END block runs after all lines

sort | uniq -c | sort -rn is the most useful pipeline for counting anything in logs

grep -oP with Perl regex extracts specific patterns like IDs or values from log lines

sed 's/old/new/g' does text replacement — add -i flag to edit files directly in place

cut -d'delimiter' -f1 splits on a character and picks a column — simpler than awk for simple cases
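One place where cut and awk differ is padded columns. The INFO line in the sample log has two spaces after the level, and cut treats each space as its own delimiter. The short sketch below uses that line to compare the two.

```shell
line='2026-03-17 14:23:46 INFO  user_id=10234 PaymentService request completed'

# cut treats every single space as a delimiter, so the double space creates an empty field 4.
with_cut=$(printf '%s\n' "$line" | cut -d' ' -f4)

# awk collapses runs of whitespace, so field 4 is the user_id token.
with_awk=$(printf '%s\n' "$line" | awk '{ print $4 }')

echo "cut -f4: [$with_cut]"
echo "awk \$4: [$with_awk]"
```

Fields 1 and 2 come before the padding, so the cut commands used for the hour breakdown are unaffected.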

Commands from this chapter

$ grep 'ERROR' app.log | awk '{print $NF}' | sort | uniq -c | sort -rn | head -10

Count errors by type, most frequent first

$ grep 'ERROR' app.log | grep -oP 'user_id=\K[0-9]+' | sort -u | wc -l

Count unique users who had errors

$ awk '$3=="ERROR"{s[$5]++} END{for(k in s) print s[k],k}' app.log | sort -rn

Error count per service

$ sed -i 's/old_hostname/new_hostname/g' config.xml

In-place find and replace in config file

$ grep 'ERROR' app.log | cut -d' ' -f2 | cut -d: -f1 | sort | uniq -c | sort -rn

Error count per hour of day