Karan saves the disk at midnight
Disk full, systemctl, cron — the three pillars of production maintenance
11:58pm. PagerDuty: All services on app-server-03 down. Karan grabbed his laptop.
SSH in. First command: df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        50G   50G     0 100% /
Root partition 100% full. Apps cannot write logs, cannot write temp files, cannot start. Everything stops.
Find what ate the disk:
du -sh /* 2>/dev/null | sort -rh | head -10
du -sh /var/log/* 2>/dev/null | sort -rh | head -10
find / -type f -size +500M -exec ls -lh {} + 2>/dev/null
Found it: /opt/app/logs/debug.log was 47GB. A developer left debug logging on in production for 3 days.
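The hunt above can be wrapped in a small helper for the next incident. This is a minimal sketch, assuming bash plus GNU find and du; the function name `largest_entries` is hypothetical and not part of Karan's story.

```shell
#!/usr/bin/env bash
# largest_entries DIR [N]: print the N largest entries directly under DIR,
# biggest first, as "size-in-KiB<TAB>path". Hypothetical helper name.
largest_entries() {
    local dir="${1:-/}" count="${2:-10}"
    # find (rather than a /* glob) also picks up dotfiles.
    # du -x stays on one filesystem, -s summarises each entry,
    # -k reports KiB so a plain numeric sort is enough.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -xsk {} + 2>/dev/null \
        | sort -rn \
        | head -n "$count"
}
```

Usage would look like `largest_entries / 10`, then `largest_entries /opt/app/logs 5` to drill into whichever directory tops the list.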
Safe cleanup - always check before deleting:
lsof /opt/app/logs/debug.log
# Is a process holding this file open?
# (lsof +L1 lists files already deleted but still held open)
# Deleting an open file does NOT free disk space until the process closes it or exits
App had the file open. Solution: truncate, not delete:
> /opt/app/logs/debug.log
# The > operator empties the file without deleting it
# Works safely even when a process has it open
Disk: 100% to 8% instantly. All services back up by 12:07am, 9 minutes after the alert.
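To see why truncating works where deleting does not, here is a small demo on a throwaway file. It is a sketch, assuming bash; the temp file stands in for debug.log, and file descriptor 3 plays the part of the app holding the log open in append mode.

```shell
#!/usr/bin/env bash
# Demo on a throwaway file: truncating keeps the inode and the writer's handle.
demo_log=$(mktemp)                     # hypothetical stand-in for debug.log

exec 3>>"$demo_log"                    # hold the file open in append mode, like a logging app
echo "lots of debug output" >&3

inode_before=$(ls -i "$demo_log" | awk '{print $1}')
: > "$demo_log"                        # truncate in place (":" is the shell no-op)
inode_after=$(ls -i "$demo_log" | awk '{print $1}')
size_after=$(wc -c < "$demo_log")      # bytes left after truncation

echo "first line after truncate" >&3   # the open handle still works
size_final=$(wc -c < "$demo_log")

exec 3>&-                              # close the handle
echo "inode $inode_before -> $inode_after, size after truncate: $size_after, after new write: $size_final"
rm -f "$demo_log"
```

The inode stays the same, so the writer never notices. One caveat: a process that opened the log without append mode keeps its old write offset, so its next write leaves a sparse file whose apparent size jumps back up even though the disk blocks stay freed. On systems with GNU coreutils, `truncate -s 0 FILE` does the same job as the redirect.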
Service management:
systemctl status myapp # check current state
systemctl start myapp # start it
systemctl restart myapp # stop then start
systemctl stop myapp # stop gracefully
systemctl enable myapp # start on boot
# Service will not start? Read why:
journalctl -u myapp -n 50
journalctl -u myapp --since "5 minutes ago"
Cron - automated maintenance:
crontab -e # open your schedule
Cron syntax - 5 fields then command:
minute hour day-of-month month day-of-week command
# Clean logs older than 7 days, every night at 2am:
0 2 * * * find /opt/app/logs -name "*.log" -mtime +7 -delete
# Disk check every 15 minutes:
*/15 * * * * df -h | awk '$5+0>=90{print}' >> /var/log/disk_alerts.log
crontab -l # list your cron jobs
Log rotation - the proper long-term solution:
/opt/app/logs/*.log {
daily
rotate 7
compress
missingok
}
Logrotate runs daily, compresses old logs, keeps only 7 days. Every production app should have this configured.
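One gap worth flagging: by default logrotate renames the log, and an app that keeps the file open goes on writing to the renamed file, the same trap as deleting it. The `copytruncate` directive copies the log and then truncates the original in place. Below is a sketch that writes such a config, assuming bash; the function name `write_logrotate_conf` is hypothetical, and `notifempty` and `copytruncate` are additions to the config shown above.

```shell
#!/usr/bin/env bash
# write_logrotate_conf FILE: write a logrotate stanza for /opt/app/logs to FILE.
# Function name is hypothetical; the directives are standard logrotate ones.
write_logrotate_conf() {
    local target="$1"
    # Quoted 'EOF' keeps the shell from expanding anything inside the stanza.
    cat > "$target" <<'EOF'
/opt/app/logs/*.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}
EOF
}
```

A plausible way to use it: `write_logrotate_conf /etc/logrotate.d/myapp`, then `logrotate -d /etc/logrotate.d/myapp` for a dry run that prints what would happen without touching any logs. `copytruncate` can lose the few lines written between the copy and the truncate, which is usually an acceptable trade for debug logs.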
df -h should be the third command in any incident after hostname and uptime
du -sh /* | sort -rh finds what is eating disk space in seconds
The > operator truncates a file to zero without deleting it, which is safe when a process has the file open
systemctl status, then start, then restart, then journalctl: that is the service debug sequence
Cron syntax: minute hour day month weekday — use crontab.guru to build expressions
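The 15-minute disk check from the cron section also reads well as a small script that cron calls. This is a sketch, assuming bash and POSIX `df -P` output; the function name `full_filesystems` and the script path in the note below are hypothetical.

```shell
#!/usr/bin/env bash
# full_filesystems [LIMIT]: read `df -P` output on stdin and print
# "mountpoint use%" for every filesystem at or above LIMIT percent (default 90).
# Hypothetical helper name. Mount points containing spaces are not handled.
full_filesystems() {
    local limit="${1:-90}"
    # NR > 1 skips the header row; $5 is the capacity column (e.g. "100%"),
    # and adding 0 makes awk read just the leading number.
    awk -v limit="$limit" 'NR > 1 && $5 + 0 >= limit { print $6 " " $5 }'
}
```

Saved as a script that ends with `df -P | full_filesystems 90`, the crontab entry could be `*/15 * * * * /usr/local/bin/check_disk.sh >> /var/log/disk_alerts.log`. Keeping the logic in a script also sidesteps a cron quirk: an unescaped `%` in a crontab command is treated as a newline.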