Vijay handles his first Linux production incident
Full incident response — applying Windows instincts in a Linux world
Month 3. Vijay was on-call alone for the first time. PagerDuty fired: Payment service down. No Windows machine. Just a Linux terminal.
Same incident. Same logic. Different spellings.
STEP 1: ORIENT YOURSELF
On Windows: RDP in, check Computer Name, check who is logged in.
On Linux:
hostname # confirm which server
whoami # confirm who you are logged in as
uptime # how long running, recent reboot?
STEP 2: IS IT A RESOURCE PROBLEM?
On Windows: open Task Manager, check CPU and Memory tabs, check Performance tab.
On Linux:
uptime # load average: a quick read on CPU pressure (on Linux it also counts tasks blocked on disk I/O)
free -h # memory available
df -h # disk space - CRITICAL, often missed on Windows
Vijay checked df -h first. Habit from training.
/dev/sda1 50G 50G 0 100% /
Disk 100% full. This was the problem. On Windows he would have checked Task Manager first and missed it.
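The disk check Vijay ran by eye can also be scripted. The sketch below is a minimal example and was not part of the incident itself: flag_full_filesystems is a hypothetical helper name, and the 90 percent threshold is an assumed value. It reads df -P output (the POSIX format, one line per filesystem) and prints every mount point at or above the threshold.

```shell
# Hypothetical helper (name and default threshold are assumptions).
# Reads `df -P` output on stdin and prints "mountpoint use%" for every
# filesystem at or above the given percentage.
flag_full_filesystems() {
  threshold="${1:-90}"
  awk -v t="$threshold" '
    NR > 1 {                      # skip the header line
      use = $5                    # Capacity column, e.g. "100%"
      sub(/%/, "", use)           # strip the percent sign for comparison
      if (use + 0 >= t) print $6, $5
    }'
}

# Usage: check every mounted filesystem on this host.
df -P | flag_full_filesystems 90
```

The function takes its input on stdin so it can be tried against saved df output as well as a live system. Mount points that contain spaces would be cut off at the first space; that case is ignored here for brevity.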
STEP 3: FIND WHAT FILLED THE DISK
On Windows: open WinDirStat, wait 3 minutes, find visually.
On Linux (30 seconds total):
du -sh /var/log/* 2>/dev/null | sort -rh | head -10
# 47G /var/log/myapp
# 1.2G /var/log/nginx
du -sh /var/log/myapp/* | sort -rh | head -5
# 47G /var/log/myapp/debug.log
A 47 GB debug log. A developer turned on verbose logging and forgot. Same human error as on Windows. Different OS.
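The two du commands above follow one pattern: list the largest entries in a directory, then repeat inside the biggest one. A minimal sketch of that pattern as a reusable function follows. largest_entries is a hypothetical name, and it uses du -sk with sort -rn instead of the -h flags so that it also runs where sort -h (a GNU extension) is unavailable. Sizes are therefore printed in kilobytes.

```shell
# Hypothetical helper: largest_entries DIR [COUNT]
# Prints the COUNT largest entries directly under DIR, biggest first,
# as "size-in-KB<TAB>path".
largest_entries() {
  dir="${1:-/var/log}"
  count="${2:-10}"
  # 2>/dev/null hides "permission denied" noise from unreadable directories
  du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Usage: start at /var/log, then call it again on the biggest directory.
largest_entries /var/log 10
```

Running it a second time on the top result (for example largest_entries /var/log/myapp 5) reproduces the drill-down Vijay did by hand.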
STEP 4: FIX IT SAFELY
On Windows: File Explorer, find the file, check if anything is using it (handle.exe), delete it.
On Linux:
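# Check if a process has the file open (sudo lets lsof see other users' processes):
sudo lsof /var/log/myapp/debug.log
# If output shows a process: truncate instead of delete
# (deleting an open file does not free the space until the process closes it; lsof +L1 lists such deleted-but-open files)
# If no output: safe to delete
# Truncate (empties without deleting, process keeps its file handle):
> /var/log/myapp/debug.log
# (needs write permission on the file; otherwise: sudo truncate -s 0 /var/log/myapp/debug.log)
# Verify:
df -h /
STEP 5: RESTART THE SERVICE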
On Windows: Services.msc right-click Restart.
On Linux:
sudo systemctl restart myapp
sudo systemctl status myapp # verify it restarted
journalctl -u myapp -f # watch logs as it starts
STEP 6: VERIFY THE FIX
On Linux:
df -h # confirm disk is now OK
systemctl status myapp # confirm service is running
journalctl -u myapp -n 50 # any new errors?
curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/health
THE FULL INCIDENT COMPARISON
Phase             Windows                         Linux
Orient            RDP, Computer Name              hostname, whoami, uptime
Check resources   Task Manager                    top, free -h, df -h
Find big files    WinDirStat (3 min wait)         du -sh | sort -rh (30 sec)
Check file usage  handle.exe from Sysinternals    lsof filename
Free space        Delete in Explorer              > filename (truncate)
Restart service   Services.msc right-click        systemctl restart service
Watch logs        Event Viewer refresh            journalctl -u service -f
Test app          Browser or Postman              curl http://localhost/path
Vijay resolved his first solo Linux incident in 14 minutes. After the incident he wrote in the ticket:
The biggest difference is that Linux tells you the truth faster. df -h is instant. du -sh is 30 seconds. grep finds exactly what you need. There are no loading screens.
Three months in, Vijay had developed genuinely new instincts. Not replacing his Windows knowledge. Adding to it.
The incident logic is identical on Windows and Linux — only the tool names change
df -h takes about a second and catches disk-full issues: check it immediately, before the Task Manager equivalents (uptime, free -h)
lsof filename replaces handle.exe from Sysinternals: check if a process has a file open (lsof +L1 lists files that were deleted but are still held open)
The > filename trick empties a log file safely while a process still has it open
journalctl -u service -f replaces watching Event Viewer refresh: it streams new log entries for the service as they are written
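The truncation takeaway can be checked safely on a throwaway file rather than a production log. The sketch below is an illustration under one stated assumption: the writing process opened its log in append mode, which is common for loggers but not guaranteed. It holds a file descriptor open, truncates the file, and shows that the descriptor keeps working afterwards. The form : > file used here is equivalent to > file and behaves the same across shells.

```shell
# Demonstration on a temporary file; nothing here touches real logs.
log=$(mktemp)

exec 3>>"$log"                          # hold the file open in append mode (assumed logger behaviour)
echo "old noisy debug line" >&3

: > "$log"                              # truncate in place; the held descriptor stays valid
size_after_truncate=$(wc -c < "$log" | tr -d ' ')

echo "first line after truncate" >&3    # the held descriptor still writes to the same file
exec 3>&-                               # close the descriptor

content_after=$(cat "$log")
echo "size right after truncate: $size_after_truncate bytes"
echo "content after new write:   $content_after"
rm -f "$log"
```

If the process opened the file without append mode, its next write lands at the old offset, so the file shows a large apparent size again as a sparse file even though the freed blocks stay free. That is one more reason to restart the service after cleaning up, as Vijay did.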