Linux for Production Support Ch 23 / 32 Intermediate 🪟 Windows → Linux

Vijay monitors servers without Datadog

sar, Netdata, Prometheus, Uptime Kuma — free monitoring for Linux servers

In this chapter
Vijay
Windows engineer, week 7 on Linux
The story

Vijay's team used Datadog and New Relic on Windows: expensive SaaS dashboards that showed everything. His new Linux team monitored everything with free, self-hosted tools. In his first week he couldn't see anything. By his second week he had a complete monitoring setup that cost nothing.

THE LINUX MONITORING PHILOSOPHY

On Windows: install an agent, open a GUI, look at graphs.

On Linux: you can look at raw numbers from the command line, or run lightweight dashboards. Both options are always available.

Built-in monitoring — no installation needed:

# The 60-second performance overview:
uptime                          # load average
free -h                         # memory
df -h                           # disk
ss -s                           # network connections
ps aux --sort=-%cpu | head -5   # top CPU processes
# All in one line — paste this at the start of every incident:
echo "=CPU=" && uptime && echo "=MEM=" && free -h && echo "=DISK=" && df -h && echo "=TOP=" && ps aux --sort=-%cpu | head -5
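During an incident it can help to keep the snapshot rather than just read it. The sketch below wraps the same commands in a function that prints the overview and saves a timestamped copy. The names `snapshot` and `have` and the default `/tmp` directory are assumptions for illustration, not part of the chapter; any tool that is not installed is skipped.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the 60-second overview and save a copy.
# Function names and the default directory are illustrative assumptions.

have() { command -v "$1" >/dev/null 2>&1; }   # true if the command exists

snapshot() {
    local dir="${1:-/tmp}"
    local out
    out="${dir}/snapshot-$(uname -n)-$(date +%Y%m%d-%H%M%S).txt"
    {
        echo "=CPU=";  if have uptime; then uptime; fi
        echo "=MEM=";  if have free;   then free -h; fi
        echo "=DISK="; df -h
        echo "=NET=";  if have ss;     then ss -s; fi
        echo "=TOP=";  if have ps;     then ps aux --sort=-%cpu | head -5; fi
    } 2>&1 | tee "$out"                      # show on screen and write the file
    echo "saved to ${out}" >&2
}

# Usage:
# snapshot /var/tmp
```

Saving the file means the numbers from the start of the incident are still there when you write it up afterwards.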

SAR — HISTORICAL PERFORMANCE DATA

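sar (part of the sysstat package) records system performance at regular intervals, every 10 minutes by default, and keeps a rolling history. How many days are kept depends on the distro's HISTORY setting, so check it before assuming a full 30 days are available. Think of it as Windows Performance Monitor with automatic recording.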

sar -u 1 5                      # CPU usage every 1 second, 5 readings
sar -r 1 5                      # memory usage every 1 second
sar -d 1 5                      # disk I/O every 1 second
sar -n DEV 1 5                  # network interface stats
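# Historical data (yesterday's file; there is one file per day of the month):
sar -u -f /var/log/sysstat/sa$(date -d yesterday +%d)
# Shows CPU usage for every 10-minute interval yesterday
# (path is the Debian/Ubuntu default; RHEL-family systems use /var/log/sa/)
# Install if not present:
sudo apt install sysstat
sudo systemctl enable --now sysstat
# If no history appears on Debian/Ubuntu, check that ENABLED="true" is set in /etc/default/sysstat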
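Once history exists, scanning a whole day by eye gets tedious. The sketch below filters `sar -u` output down to the intervals where CPU idle dropped below a threshold. It assumes the usual `sar -u` layout, where data rows carry `all` in the CPU column and `%idle` is the last column. `busy_intervals` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `sar -u` text on stdin and print only the rows
# where %idle (assumed to be the last column) is below the given threshold.
busy_intervals() {
    local max_idle="${1:-20}"
    awk -v max_idle="$max_idle" '
        $1 == "Average:"                   { next }   # skip the summary row
        / all / && $NF + 0 < max_idle + 0  { print }  # busy interval
    '
}

# Usage (path as used in this chapter; LC_ALL=C keeps decimal points as "."):
# LC_ALL=C sar -u -f /var/log/sysstat/sa$(date -d yesterday +%d) | busy_intervals 20
```

Reading `%idle` from the last column rather than a fixed position keeps the filter working whether sar prints the time as one field (24-hour) or two (with AM/PM).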

NETDATA — REAL-TIME DASHBOARD IN ONE COMMAND

Netdata is a real-time monitoring dashboard. Install it and get 2000+ metrics with beautiful graphs, accessible from your browser.

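# -L follows redirects, -f fails on HTTP errors instead of saving an error page
curl -fsSL https://my-netdata.io/kickstart.sh -o /tmp/netdata-kickstart.sh
# Skim the script before running it as root
sudo bash /tmp/netdata-kickstart.sh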
# Opens at http://YOUR_SERVER_IP:19999
# Shows: CPU, memory, disk, network, processes, containers — everything
# Free, open source, self-hosted

PROMETHEUS + GRAFANA — PRODUCTION MONITORING STACK

For multi-server monitoring, Prometheus scrapes metrics and Grafana visualises them. This is what most production teams use.

# Quick start with Docker Compose:
mkdir monitoring && cd monitoring
cat > docker-compose.yml << 'EOF'
version: '3'
services:
  prometheus:
    image: prom/prometheus
    ports: ["9090:9090"]
    volumes: ["./prometheus.yml:/etc/prometheus/prometheus.yml"]
  grafana:
    image: grafana/grafana
    ports: ["3000:3000"]
    environment: [GF_SECURITY_ADMIN_PASSWORD=admin]  # change this before exposing port 3000
EOF
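cat > prometheus.yml << 'EOF'
scrape_configs:
  - job_name: 'linux'
    static_configs:
      # One entry per monitored server running node_exporter (port 9100).
      # Use an address the Prometheus container can reach, not localhost.
      - targets: ['YOUR_SERVER_IP:9100']
EOF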
docker compose up -d
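# Install node_exporter on servers you want to monitor.
# Release files include the version in their name, so set it first
# (1.8.2 is only an example; check the releases page for the current one):
VERSION=1.8.2
wget "https://github.com/prometheus/node_exporter/releases/download/v${VERSION}/node_exporter-${VERSION}.linux-amd64.tar.gz"
tar xvf "node_exporter-${VERSION}.linux-amd64.tar.gz"
sudo mv "node_exporter-${VERSION}.linux-amd64/node_exporter" /usr/local/bin/
node_exporter &    # quick test only; run it as a systemd service for real use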
# Open Grafana at http://localhost:3000
# Add Prometheus as data source (http://prometheus:9090)
# Import dashboard ID 1860 — instant Linux metrics dashboard
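Because this stack is meant for several servers, the `targets` list usually grows beyond one entry. The sketch below prints a `prometheus.yml` for any number of hosts, assuming each one runs node_exporter on its default port 9100. The function name `make_prometheus_config` and the example hostnames are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a prometheus.yml that scrapes node_exporter
# (default port 9100) on every host passed as an argument.
make_prometheus_config() {
    local host
    echo "scrape_configs:"
    echo "  - job_name: 'linux'"
    echo "    static_configs:"
    echo "      - targets:"
    for host in "$@"; do
        echo "          - '${host}:9100'"
    done
}

# Usage (hostnames are placeholders):
# make_prometheus_config web01 web02 db01 > prometheus.yml
# docker compose restart prometheus
```

After the restart, the Status > Targets page in the Prometheus UI (port 9090) shows whether each host is being scraped.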

UPTIME KUMA — SIMPLE UPTIME MONITORING WITH ALERTS

For monitoring whether services are up and alerting via Telegram, WhatsApp, or email:

docker run -d --restart=always -p 3001:3001 \
    -v uptime-kuma:/app/data \
    --name uptime-kuma louislam/uptime-kuma:1
# Open http://YOUR_SERVER_IP:3001
# Add monitors for your HTTP endpoints, TCP ports, ping checks
# Configure alerts: Telegram bot works great for Indian teams

CUSTOM METRICS SCRIPT — MONITOR WHAT MATTERS TO YOU

#!/bin/bash
# /usr/local/bin/health-check: run every 5 minutes via cron
TELEGRAM_TOKEN="your_bot_token"
CHAT_ID="your_chat_id"
HOST=$(hostname)
alert() {
    # --data-urlencode keeps characters such as % and & intact in the message
    curl -s -X POST "https://api.telegram.org/bot${TELEGRAM_TOKEN}/sendMessage" \
        --data-urlencode "chat_id=${CHAT_ID}" \
        --data-urlencode "text=⚠️ ${HOST}: $1" > /dev/null
}
# Check disk (-P keeps each filesystem on one line so the columns stay stable)
DISK=$(df -P / | awk 'NR==2{print $5}' | tr -d '%')
[ "$DISK" -gt 85 ] && alert "Disk at ${DISK}%"
# Check memory: use the "available" column ($7), not "free" ($4),
# because Linux deliberately fills free memory with cache
MEM_AVAIL=$(free | awk 'NR==2{printf "%.0f", $7/$2*100}')
[ "$MEM_AVAIL" -lt 10 ] && alert "Memory only ${MEM_AVAIL}% available"
# Check services
for SVC in nginx myapp postgresql; do
    systemctl is-active --quiet "$SVC" || alert "$SVC is DOWN"
done
# Make it executable, then add to cron:
# chmod +x /usr/local/bin/health-check
# */5 * * * * /usr/local/bin/health-check
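Because cron re-runs the script every 5 minutes, a disk that stays above the threshold sends the same message every 5 minutes. One common way to quieten that is a per-alert cooldown. The sketch below is one way you might do it and is not part of the original script: `should_alert`, the state directory, and the one-hour default are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical cooldown helper for the health-check script.
# Returns 0 (send) if this alert key has not fired within the cooldown and
# records the time; returns 1 (stay quiet) otherwise.
STATE_DIR="${STATE_DIR:-/var/tmp/health-check}"   # assumed location

should_alert() {
    local key="$1" cooldown="${2:-3600}"
    local stamp="${STATE_DIR}/${key}.last"
    local now last
    now=$(date +%s)
    mkdir -p "$STATE_DIR"
    last=$(cat "$stamp" 2>/dev/null)
    last="${last:-0}"                 # no stamp file yet: treat as "never fired"
    if [ $(( now - last )) -ge "$cooldown" ]; then
        echo "$now" > "$stamp"
        return 0
    fi
    return 1
}

# Usage inside the script:
# [ "$DISK" -gt 85 ] && should_alert disk && alert "Disk at ${DISK}%"
```

Each alert gets its own key, so a muted disk alert does not hide a service going down a minute later.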

LOG MONITORING — ALERTING ON ERRORS

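# Watch a log for errors and send a Telegram alert
# (assumes TELEGRAM_TOKEN and CHAT_ID are set as in the health-check script;
#  tail -F keeps following the file after log rotation):
tail -F /opt/app/logs/app.log | grep --line-buffered -E "CRITICAL|OutOfMemoryError|FATAL" | while IFS= read -r line; do
    curl -s "https://api.telegram.org/bot${TELEGRAM_TOKEN}/sendMessage" \
        --data-urlencode "chat_id=${CHAT_ID}" \
        --data-urlencode "text=🚨 ${line}" > /dev/null
done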

Vijay's monitoring setup after 2 weeks: Netdata on every server for real-time visibility, Uptime Kuma for endpoint monitoring with Telegram alerts, sar for historical data, and a custom cron script for disk and service alerts. Total cost: zero rupees.

Key takeaways

Once sysstat is installed and enabled, sar records system metrics every 10 minutes by default, so you can query historical CPU, memory, and disk data after the fact

Netdata gives a real-time browser dashboard with 2000+ metrics from a single install command

Uptime Kuma monitors endpoints and sends Telegram/WhatsApp alerts — free self-hosted equivalent of PagerDuty for small teams

The 60-second paste: uptime && free -h && df -h && ps aux --sort=-%cpu | head -5 — complete system snapshot

A cron script checking disk, memory, and services every 5 minutes covers the most common production alerts

Commands from this chapter
$ sar -u 1 5
CPU usage every 1 second for 5 readings — like Performance Monitor
$ sar -u -f /var/log/sysstat/sa$(date -d yesterday +%d)
Yesterday's CPU history, available once sysstat is installed and collecting
$ curl https://my-netdata.io/kickstart.sh | sudo bash
Install Netdata real-time monitoring dashboard
$ docker run -d --restart=always -p 3001:3001 -v uptime-kuma:/app/data --name uptime-kuma louislam/uptime-kuma:1
Start Uptime Kuma for endpoint monitoring with alerts
$ uptime && free -h && df -h && ps aux --sort=-%cpu | head -5
60-second full system health snapshot
$ journalctl --since '1 hour ago' -p err -q --no-pager | wc -l
Count errors across all services in the last hour