Linux for Production Support Ch 12 / 32 Intermediate
💾

Karan saves the disk at midnight

Disk full, systemctl, cron — the three pillars of production maintenance

⏱ 11 min 6 commands 5 takeaways
Karan
DevOps engineer, banking platform
The story

11:58pm. PagerDuty: All services on app-server-03 down. Karan grabbed his laptop.

SSH in. First command: df -h

Filesystem   Size  Used Avail Use% Mounted on
/dev/sda1     50G   50G     0  100% /

Root partition 100% full. Apps cannot write logs, cannot write temp files, cannot start. Everything stops.
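On a host with several mounts, the same check can be narrowed to only the filesystems that are close to full. A minimal sketch, assuming a 90% threshold; `fs_over` is a made-up helper name, and `df -P` is used because its POSIX output format keeps each filesystem on a single line:

```shell
# fs_over THRESHOLD: read `df -P` output on stdin and print the mount point
# and usage of every filesystem at or above THRESHOLD percent.
# (fs_over is a hypothetical helper name, not a standard tool.)
fs_over() {
    # NR > 1 skips the header; $5 is the "Capacity" column, e.g. "100%"
    awk -v limit="$1" 'NR > 1 && $5 + 0 >= limit { print $6, $5 }'
}

# Usage: list filesystems that are at least 90% full
df -P | fs_over 90
```

Mount points that contain spaces would be cut at the first space; for the usual /, /var, /opt layout that is not a concern.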

Find what ate the disk:

du -sh /* 2>/dev/null | sort -rh | head -10
du -sh /var/log/* 2>/dev/null | sort -rh | head -10
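find / -xdev -type f -size +500M -exec ls -lh {} + 2>/dev/null
# -xdev stays on the root filesystem; -exec ... {} + copes with spaces in names and with zero matches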

Found it: /opt/app/logs/debug.log was 47GB. A developer left debug logging on in production for 3 days.
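The search that located debug.log can be wrapped so that results come back sorted by size. A sketch assuming GNU find (for `-printf`); `big_files` is a hypothetical helper name, and the 500M default mirrors the command above:

```shell
# big_files DIR [MIN_SIZE]: list regular files under DIR larger than MIN_SIZE
# (default 500M), biggest first, one "<bytes> <path>" pair per line.
# -xdev keeps find on DIR's own filesystem, so /proc and other mounts are skipped.
# (big_files is an illustrative name, not an existing tool.)
big_files() {
    find "$1" -xdev -type f -size +"${2:-500M}" -printf '%s %p\n' 2>/dev/null |
        sort -rn | head -n 10
}
```

Running `big_files /` reproduces the hunt from the story; `big_files /opt/app/logs 100M` would narrow it to one directory with a lower cutoff.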

Safe cleanup - always check before deleting:

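lsof /opt/app/logs/debug.log
# Is a process holding this file open?
# Deleting an open file does NOT free disk space until every process holding it closes it or exits
# (lsof +L1 answers a different question: which already-deleted files are still held open)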
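Why deletion does not help can be reproduced on any Linux box in a few lines. A sketch assuming /proc is mounted; the function name and the temp file are illustrative only:

```shell
# deleted_but_open: hold a file open on descriptor 3, delete it, then ask the
# kernel what descriptor 3 still points to. (Illustrative name; Linux /proc assumed.)
deleted_but_open() (
    f=$(mktemp)
    exec 3>>"$f"                 # stand-in for an app keeping its log open
    echo "debug noise" >&3
    rm -f "$f"                   # the name is gone from the directory...
    readlink /proc/self/fd/3     # ...but the inherited descriptor still pins the data
)

deleted_but_open
```

The printed path ends in "(deleted)": the name is gone, yet the blocks stay allocated until the last descriptor is closed.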

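The app had the file open. Solution: truncate, not delete:

> /opt/app/logs/debug.log
# The > redirection empties the file without deleting it (truncate -s 0 does the same)
# The inode and the app's open handle stay valid, so the space is freed without a restart
# Caveat: an app that does not write in append mode keeps writing at its old offset, leaving a sparse file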
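Truncation in place can be tried on a throwaway file: a writer that holds the file open in append mode keeps logging to the same file after it has been emptied. A small sketch; the function name and temp file are illustrative:

```shell
# truncate_demo: a writer holds the log open in append mode while the file is
# emptied, then keeps writing. Only the post-truncate line remains.
truncate_demo() (
    log=$(mktemp)
    exec 3>>"$log"               # stand-in for the app's open log handle
    echo "three days of debug output" >&3
    : > "$log"                   # same effect as "> file"; ":" is a no-op command
    echo "after" >&3             # the writer carries on without a restart
    exec 3>&-
    cat "$log"
    rm -f "$log"
)

truncate_demo
```

The only line left in the file is the one written after the truncate, which is what makes this approach usable while the service is still running.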

Disk: 100% to 8% instantly. All services back up by 12:07am, 9 minutes after the alert.

Service management:

systemctl status myapp           # check current state
systemctl start myapp            # start it
systemctl restart myapp          # stop then start
systemctl stop myapp             # stop gracefully
systemctl enable myapp           # start on boot
# Service will not start? Read why:
journalctl -u myapp -n 50
journalctl -u myapp --since "5 minutes ago"
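The status-then-logs routine can be folded into one helper so that the journal is pulled only when the unit is not running. A sketch assuming a systemd host; `triage_service` is a made-up name, and myapp is the placeholder unit from the commands above:

```shell
# triage_service UNIT: report whether UNIT is active; if it is not, print the
# last 50 journal lines so the failure reason is on screen.
# (triage_service is a hypothetical helper, not part of systemd.)
triage_service() {
    unit=$1
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is active"
    else
        echo "$unit is NOT active, recent journal entries:"
        journalctl -u "$unit" -n 50 --no-pager
        return 1
    fi
}
```

Called as `triage_service myapp`, it returns non-zero when the unit is down, so it can also gate a follow-up `systemctl restart myapp` in a script.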

Cron - automated maintenance:

crontab -e    # open your schedule

Cron syntax - 5 fields then command:

minute  hour  day-of-month  month  day-of-week  command

# Clean logs older than 7 days, every night at 2am:
0 2 * * * find /opt/app/logs -name "*.log" -mtime +7 -delete
# Disk check every 15 minutes:
*/15 * * * * df -h | awk '$5+0>=90{print}' >> /var/log/disk_alerts.log

crontab -l    # list all cron jobs
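Because the nightly job deletes files unattended, it is worth previewing what the find expression matches before trusting it to cron. A sketch with a hypothetical helper, `purge_old_logs`, that only prints matches by default and deletes when explicitly asked; it assumes GNU find for `-delete`:

```shell
# purge_old_logs DIR DAYS [--delete]: list *.log files under DIR last modified
# more than DAYS days ago; remove them only when --delete is given.
# (purge_old_logs is an illustrative name, not an existing tool.)
purge_old_logs() {
    if [ "${3:-}" = "--delete" ]; then
        find "$1" -type f -name '*.log' -mtime +"$2" -print -delete
    else
        find "$1" -type f -name '*.log' -mtime +"$2" -print
    fi
}
```

Running `purge_old_logs /opt/app/logs 7` by hand shows exactly which files the 2am entry would remove; the crontab line itself can stay as the one-line find.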

Log rotation - the proper long-term solution:

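/opt/app/logs/*.log {
    daily
    rotate 7
    compress
    missingok
    # copy the log, then truncate it in place, for apps that keep the file open
    copytruncate
}

With this rule in place (typically a file under /etc/logrotate.d/), logrotate rotates the logs once a day, compresses the rotated copies, and keeps the last 7. Every production app should have this configured.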
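Before relying on a new rule, logrotate can be asked to explain what it would do without touching any files. A sketch assuming `logrotate` is installed and the rule was saved as /etc/logrotate.d/myapp (a hypothetical path); `-d` is its debug, dry-run flag:

```shell
# logrotate_dry_run CONF: show what logrotate would do with CONF, changing nothing.
# (logrotate_dry_run is an illustrative wrapper name.)
logrotate_dry_run() {
    if [ ! -r "$1" ]; then
        echo "cannot read config: $1" >&2
        return 1
    fi
    logrotate -d "$1"
}
```

For example, `logrotate_dry_run /etc/logrotate.d/myapp`. Once the output looks right, `logrotate -f` with the same file forces an immediate rotation instead of waiting for the daily run.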

Key takeaways

df -h should be one of the first commands in any incident, right after hostname and uptime

du -sh /* | sort -rh finds what is eating disk space in seconds

The > operator truncates a file to zero without deleting it, which is safe when a process has the file open

Service debug sequence: systemctl status, then start or restart, then journalctl -u to read why it failed

Cron syntax: minute hour day month weekday — use crontab.guru to build expressions

Commands from this chapter
$ df -h
Disk space on all partitions — check for 100%
$ du -sh /var/log/* | sort -rh | head -10
Find the biggest files and directories under /var/log
$ find / -xdev -type f -size +500M -exec ls -lh {} + 2>/dev/null
Find files over 500MB
$ > /path/to/big.log
Truncate log to zero while process runs
$ journalctl -u servicename -n 50
Last 50 lines of a service log
$ crontab -l
List all scheduled cron jobs