Linux for Production Support Ch 23 / 32 Intermediate 🪟 Windows → Linux

Vijay monitors servers without Datadog

sar, Netdata, Prometheus, Uptime Kuma — free monitoring for Linux servers

⏱ 13 min 6 commands 5 takeaways

In this chapter

Vijay

Windows engineer, week 7 on Linux

The story

Vijay's previous team used Datadog and New Relic on Windows: expensive SaaS dashboards that showed everything. His new Linux team monitored everything with free, self-hosted tools. In his first week he couldn't see anything. By his second week he had a complete monitoring setup that cost nothing.

THE LINUX MONITORING PHILOSOPHY

On Windows: install an agent, open a GUI, look at graphs.

On Linux: you can look at raw numbers from the command line, or run lightweight dashboards. Both options are always available.

Built-in monitoring — no installation needed:

# The 60-second performance overview:
uptime                          # load average
free -h                         # memory
df -h                           # disk
ss -s                           # network connections
ps aux --sort=-%cpu | head -5   # top CPU processes

# All in one line — paste this at the start of every incident:
echo "=CPU=" && uptime && echo "=MEM=" && free -h && echo "=DISK=" && df -h && echo "=TOP=" && ps aux --sort=-%cpu | head -5
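The one-liner above can also be wrapped in a small function, which makes it easy to save a snapshot to a file during an incident. This is only a sketch: the names `section` and `snapshot` are made up for illustration, and any command that is not installed is skipped with a note instead of failing.

```shell
# Hypothetical helpers: "section" and "snapshot" are illustrative names.

section() {
    # Print a header, then run the given command if it is installed.
    title=$1
    shift
    printf '== %s ==\n' "$title"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        printf '(%s not installed)\n' "$1"
    fi
}

snapshot() {
    # Same commands as the 60-second overview, one labelled section each.
    section "LOAD" uptime
    section "MEM" free -h
    section "DISK" df -h
    section "NET" ss -s
    section "TOP CPU" sh -c 'ps aux --sort=-%cpu | head -5'
}

# Usage: keep a copy of the snapshot for the incident notes
#   snapshot | tee "/tmp/incident-$(date +%F-%H%M).txt"
```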

SAR — HISTORICAL PERFORMANCE DATA

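sar (part of the sysstat package) reports performance data that sysstat records in the background, typically every 10 minutes once collection is enabled. How many days of history are kept depends on the distribution's sysstat configuration, usually somewhere between a week and a month. Think of it as Windows Performance Monitor with automatic recording.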

sar -u 1 5                      # CPU usage every 1 second, 5 readings
sar -r 1 5                      # memory usage every 1 second
sar -d 1 5                      # disk I/O every 1 second
sar -n DEV 1 5                  # network interface stats

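# Historical data (one file per day of the month; this reads yesterday's):
sar -u -f /var/log/sysstat/sa$(date -d yesterday +%d)
# Shows CPU usage for each recorded interval yesterday
# Path is the Debian/Ubuntu default; RHEL-family systems use /var/log/sa/

# Install if not present (Debian/Ubuntu):
sudo apt install sysstat
sudo systemctl enable --now sysstat
# On older Debian/Ubuntu releases, also set ENABLED="true" in /etc/default/sysstat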
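Once a day of history exists, a small filter can pick out the intervals worth looking at. The sketch below reads `sar -u` text on standard input and prints the intervals where CPU was busier than a given threshold. `busy_intervals` is a made-up name, and the filter assumes the usual layout in which `%idle` is the last column and the CPU column reads `all`; running sar with `LC_ALL=C` keeps the time and decimal format predictable.

```shell
# Hypothetical helper: list sar intervals where CPU busy% exceeds a limit.
# Assumes `sar -u` output with %idle as the last column.

busy_intervals() {
    # $1 = busy threshold in percent; reads sar -u text on stdin.
    awk -v limit="$1" '
        $1 == "Average:" { next }                      # skip the summary row
        / all / && $NF ~ /^[0-9]+(\.[0-9]+)?$/ {
            busy = 100 - $NF                           # busy = 100 - %idle
            if (busy > limit) printf "%s %.1f\n", $1, busy
        }'
}

# Usage: intervals from yesterday where the CPU was more than 80% busy
#   LC_ALL=C sar -u -f /var/log/sysstat/sa$(date -d yesterday +%d) | busy_intervals 80
```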

NETDATA — REAL-TIME DASHBOARD IN ONE COMMAND

Netdata is a real-time monitoring dashboard. Install it and get 2000+ metrics with beautiful graphs, accessible from your browser.

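# -f fails on HTTP errors, -L follows redirects; review the script before running it as root
curl -fsSL https://my-netdata.io/kickstart.sh -o /tmp/netdata-kickstart.sh
sudo bash /tmp/netdata-kickstart.sh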

# Opens at http://YOUR_SERVER_IP:19999
# Shows: CPU, memory, disk, network, processes, containers — everything
# Free, open source, self-hosted

PROMETHEUS + GRAFANA — PRODUCTION MONITORING STACK

For multi-server monitoring, Prometheus scrapes metrics and Grafana visualises them. This pairing is a widely used production monitoring stack.

# Quick start with Docker Compose:
mkdir monitoring && cd monitoring
cat > docker-compose.yml << 'EOF'
version: '3'
services:
  prometheus:
    image: prom/prometheus
    ports: ["9090:9090"]
    volumes: ["./prometheus.yml:/etc/prometheus/prometheus.yml"]
  grafana:
    image: grafana/grafana
    ports: ["3000:3000"]
    environment: [GF_SECURITY_ADMIN_PASSWORD=admin]  # example only; set a real password
EOF

cat > prometheus.yml << 'EOF'
scrape_configs:
  - job_name: 'linux'
    static_configs:
      - targets: ['YOUR_SERVER_IP:9100']  # one entry per server running node_exporter
EOF

docker compose up -d

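# Install node_exporter on each server you want to monitor.
# Release archives include the version in the file name; 1.8.2 is an example,
# check the node_exporter releases page for the current one.
VERSION=1.8.2
wget "https://github.com/prometheus/node_exporter/releases/download/v${VERSION}/node_exporter-${VERSION}.linux-amd64.tar.gz"
tar xvf "node_exporter-${VERSION}.linux-amd64.tar.gz"
sudo mv "node_exporter-${VERSION}.linux-amd64/node_exporter" /usr/local/bin/
node_exporter &                 # fine for a quick test; not restarted after a reboot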

# Open Grafana at http://localhost:3000
# Add Prometheus as data source (http://prometheus:9090)
# Import dashboard ID 1860 — instant Linux metrics dashboard
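Starting node_exporter with `&` is enough for a first look, but it will not come back after a reboot. One way to keep it running is a systemd service. The sketch below prints a minimal unit file; `node_exporter_unit` is a made-up helper name, and the unit assumes the binary sits in /usr/local/bin as in the install step and that a dedicated `node_exporter` system user has been created.

```shell
# Hypothetical helper: prints a minimal systemd unit for node_exporter.
# Assumes /usr/local/bin/node_exporter and a "node_exporter" system user.

node_exporter_unit() {
    cat << 'EOF'
[Unit]
Description=Prometheus node_exporter
After=network-online.target
Wants=network-online.target

[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (run on each monitored server):
#   sudo useradd --system --no-create-home --shell /usr/sbin/nologin node_exporter
#   node_exporter_unit | sudo tee /etc/systemd/system/node_exporter.service
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now node_exporter
```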

UPTIME KUMA — SIMPLE UPTIME MONITORING WITH ALERTS

For monitoring whether services are up and alerting via Telegram, WhatsApp, or email:

docker run -d --restart=always -p 3001:3001 \
    -v uptime-kuma:/app/data \
    --name uptime-kuma louislam/uptime-kuma:1

# Open http://YOUR_SERVER_IP:3001
# Add monitors for your HTTP endpoints, TCP ports, ping checks
# Configure alerts: Telegram bot works great for Indian teams

CUSTOM METRICS SCRIPT — MONITOR WHAT MATTERS TO YOU

#!/bin/bash
# /usr/local/bin/health-check — run every 5 minutes via cron

TELEGRAM_TOKEN="your_bot_token"
CHAT_ID="your_chat_id"
HOST=$(hostname)

alert() {
    # --data-urlencode keeps characters such as % and & intact in the message
    curl -s -X POST "https://api.telegram.org/bot${TELEGRAM_TOKEN}/sendMessage" \
        -d chat_id="${CHAT_ID}" \
        --data-urlencode "text=⚠️ ${HOST}: $1" > /dev/null
}

# Check disk
DISK=$(df / | awk 'NR==2{print $5}' | tr -d '%')
[ "$DISK" -gt 85 ] && alert "Disk at ${DISK}%"

# Check memory (use the "available" column; "free" excludes reclaimable cache and reads too low)
MEM_AVAIL=$(free | awk 'NR==2{printf "%.0f", $7/$2*100}')
[ "$MEM_AVAIL" -lt 10 ] && alert "Memory only ${MEM_AVAIL}% available"

# Check services
for SVC in nginx myapp postgresql; do
    systemctl is-active --quiet "$SVC" || alert "$SVC is DOWN"
done

# Add to cron:
# */5 * * * * /usr/local/bin/health-check
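Because cron runs the script every 5 minutes, a disk that stays above the threshold produces a new message on every run. A possible extension is a cooldown per alert type, sketched below with timestamp files. `should_alert`, `clear_alert`, `STATE_DIR` and `COOLDOWN` are illustrative names and values, not part of the script above.

```shell
# Hypothetical extension: suppress repeats of the same alert for COOLDOWN seconds.
STATE_DIR="${STATE_DIR:-/tmp/health-check-state}"   # assumed location
COOLDOWN="${COOLDOWN:-3600}"                        # assumed: one hour

should_alert() {
    # $1 = short key for the condition, e.g. "disk" or "svc-nginx".
    # Returns 0 if an alert should be sent now, 1 if one was sent recently.
    mkdir -p "$STATE_DIR" || return 0
    stamp="$STATE_DIR/$1"
    now=$(date +%s)
    if [ -f "$stamp" ]; then
        last=$(cat "$stamp" 2>/dev/null)
        case $last in
            ''|*[!0-9]*) last=0 ;;                  # unreadable stamp: treat as old
        esac
        [ $((now - last)) -lt "$COOLDOWN" ] && return 1
    fi
    echo "$now" > "$stamp"
    return 0
}

clear_alert() {
    # Forget the last alert so the next failure notifies immediately.
    rm -f "$STATE_DIR/$1"
}

# Usage inside the health-check script:
#   [ "$DISK" -gt 85 ] && should_alert disk && alert "Disk at ${DISK}%"
#   [ "$DISK" -le 85 ] && clear_alert disk
```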

LOG MONITORING — ALERTING ON ERRORS

# Watch a log for errors and send a Telegram alert
# (uses TELEGRAM_TOKEN and CHAT_ID as set in the health-check script):
tail -F /opt/app/logs/app.log | while IFS= read -r line; do
    echo "$line" | grep -qE "CRITICAL|OutOfMemoryError|FATAL" && \
    curl -s "https://api.telegram.org/bot${TELEGRAM_TOKEN}/sendMessage" \
        -d chat_id="${CHAT_ID}" --data-urlencode "text=🚨 ${line}" > /dev/null
done
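The loop above starts a grep process for every log line. On a busy log, the match can be done in the shell itself with a `case` pattern. This is a sketch of that variant; `is_critical` is a made-up name, and the usage comment assumes the `alert` function from the health-check script is available.

```shell
# Hypothetical helper: match the same keywords without spawning grep per line.

is_critical() {
    # $1 = one log line; returns 0 if it contains any of the keywords.
    case $1 in
        *CRITICAL*|*OutOfMemoryError*|*FATAL*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (alert is the function defined in the health-check script):
#   tail -F /opt/app/logs/app.log | while IFS= read -r line; do
#       is_critical "$line" && alert "$line"
#   done
```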

Vijay's monitoring setup after 2 weeks: Netdata on every server for real-time visibility, Uptime Kuma for endpoint monitoring with Telegram alerts, sar for historical data, and a custom cron script for disk and service alerts. Total cost: zero rupees.

Key takeaways

sar reads metrics that sysstat records about every 10 minutes once it is installed and enabled, so you can query historical CPU/memory/disk data after the fact

Netdata gives a real-time browser dashboard with 2000+ metrics from a single install command

Uptime Kuma monitors endpoints and sends Telegram/WhatsApp alerts, a free self-hosted alternative to paid uptime and alerting services for small teams

The 60-second paste: uptime && free -h && df -h && ps aux --sort=-%cpu | head -5 — complete system snapshot

A cron script checking disk, memory, and services every 5 minutes covers the most common production alerts

Commands from this chapter

$ sar -u 1 5

CPU usage every 1 second for 5 readings — like Performance Monitor

$ sar -u -f /var/log/sysstat/sa$(date -d yesterday +%d)

Yesterday's CPU history, available once sysstat is collecting data

$ curl -fsSL https://my-netdata.io/kickstart.sh | sudo bash

Install Netdata real-time monitoring dashboard

$ docker run -d -p 3001:3001 louislam/uptime-kuma:1

Start Uptime Kuma for endpoint monitoring with alerts

$ uptime && free -h && df -h && ps aux --sort=-%cpu | head -5

60-second full system health snapshot

$ journalctl --since '1 hour ago' -p err | wc -l

Count errors across all services in the last hour