Linux for Production Support Ch 27 / 32 Advanced

Rajan unlocks the kernel's hidden performance

sysctl, ulimits, file descriptors — tuning the OS for high-traffic production

⏱ 13 min 6 commands 5 takeaways
The story

Rajan was a Java developer turned infrastructure engineer. His apps worked fine on small servers, but when the company scaled to high traffic, strange things happened. Connections were refused at 1,000 concurrent users even though CPU usage sat at 20%. Database connections timed out even though the database was healthy. The OS was the bottleneck, not the app.

His senior SRE spent 30 minutes with him, changed 8 kernel parameters, and the server handled 10x the traffic without hardware changes.

WHAT IS THE KERNEL AND WHY DOES IT MATTER

The Linux kernel is the core of the OS. It manages memory, CPU scheduling, network connections, and file handles. Its default settings target general-purpose workloads, so production servers under high traffic usually need tuning.

sysctl is the tool to read and change kernel parameters:

sysctl -a                           # show all kernel parameters (thousands of them)
sysctl net.core.somaxconn           # read a specific parameter
sudo sysctl -w net.core.somaxconn=1024   # change a parameter (needs root; temporary, resets on reboot)
# Permanent changes go in /etc/sysctl.conf or a file under /etc/sysctl.d/:
echo "net.core.somaxconn = 65535" | sudo tee -a /etc/sysctl.d/99-production.conf
sudo sysctl -p /etc/sysctl.d/99-production.conf   # apply that file without a reboot
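
A temporary sysctl -w change and a permanent file entry can quietly disagree, so it can help to compare a config file against the live kernel. The following is a minimal sketch, not a standard tool: check_sysctl_file and squash are illustrative names, and the optional second argument exists only so the function can be pointed at a test directory. It assumes each parameter name maps to a path under /proc/sys by turning dots into slashes, which holds for the parameters in this chapter but not for names that embed an interface containing a dot.

```shell
#!/bin/sh
# Sketch only: check_sysctl_file and squash are illustrative helper names,
# not part of sysctl itself.

# Collapse runs of whitespace to single spaces and trim both ends, so that
# "4096<TAB>87380" read from /proc compares equal to "4096 87380" in a file.
squash() {
    printf '%s' "$1" | tr -s '[:space:]' ' ' | sed -e 's/^ //' -e 's/ $//'
}

# Print every setting in FILE whose live value differs; return 1 if any differ.
# ROOT defaults to /proc/sys (assumption: dots in a key map to slashes).
check_sysctl_file() {
    file=$1
    root=${2:-/proc/sys}
    status=0
    while IFS= read -r line || [ -n "$line" ]; do
        line=$(squash "$line")
        case $line in
            ''|'#'*|';'*) continue ;;   # blank line or comment
            *=*) ;;                     # looks like key = value
            *) continue ;;              # anything else is ignored
        esac
        key=$(squash "${line%%=*}")
        want=$(squash "${line#*=}")
        path=$root/$(printf '%s' "$key" | tr . /)
        if [ ! -r "$path" ]; then
            echo "MISSING $key"
            status=1
            continue
        fi
        have=$(squash "$(cat "$path")")
        if [ "$have" != "$want" ]; then
            echo "DRIFT $key: live=$have file=$want"
            status=1
        fi
    done < "$file"
    return "$status"
}

# Example against the real kernel values:
#   check_sysctl_file /etc/sysctl.d/99-production.conf
```

A non-zero return with a DRIFT line means the file has not been applied yet, or something changed the value afterwards.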

NETWORK TUNING — HIGH TRAFFIC SERVERS

# /etc/sysctl.d/99-network.conf
# Cap on pending connections in each listening socket's queue
# (default 128 before kernel 5.4, 4096 since; the application's listen() backlog must be raised too):
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
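# Let new outbound connections reuse sockets in TIME_WAIT
# (TIME_WAIT lasts a fixed 60 seconds and can exhaust ephemeral ports):
net.ipv4.tcp_tw_reuse = 1
# Shorten how long orphaned connections linger in FIN_WAIT_2 (default 60 seconds):
net.ipv4.tcp_fin_timeout = 15
# Widen the local (ephemeral) port range (default 32768-60999, about 28,000 ports):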
net.ipv4.ip_local_port_range = 10000 65535
# TCP buffer sizes for high-bandwidth connections:
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
# Enable TCP Fast Open for client and server (3 = both); saves a round trip on repeat connections when the application opts in:
net.ipv4.tcp_fastopen = 3
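
To see why the wider port range matters, the arithmetic can be worked through in a few lines of shell. This is a rough sketch: port_count and max_conn_rate are illustrative helper names, and the rate assumes every closed outbound connection to one destination holds its local port in TIME_WAIT for the full 60 seconds with no reuse.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names) for ephemeral port arithmetic.

# Number of ports in an inclusive range LOW..HIGH.
port_count() {
    echo $(( $2 - $1 + 1 ))
}

# Rough ceiling on new outbound connections per second to ONE destination
# ip:port, assuming each port is held in TIME_WAIT for 60 seconds (no reuse).
max_conn_rate() {
    echo $(( $(port_count "$1" "$2") / 60 ))
}

port_count 32768 60999      # default range: 28232 ports
max_conn_rate 32768 60999   # about 470 connections per second
port_count 10000 65535      # widened range: 55536 ports
max_conn_rate 10000 65535   # about 925 connections per second
```

Under those assumptions the wider range roughly doubles the sustainable rate of new outbound connections to a single backend, such as a database or an upstream API.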

MEMORY TUNING

# /etc/sysctl.d/99-memory.conf
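# Reduce swap usage (the default of 60 is usually too high for servers with enough RAM):
vm.swappiness = 10
# How aggressively the kernel reclaims memory from the inode/dentry cache (default 100; lower keeps the cache longer):
vm.vfs_cache_pressure = 50
# Overcommit policy: 1 = always grant allocations (default 0 is heuristic).
# Helps large JVM heaps and fork(), but a real shortage then ends with the OOM killer: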
vm.overcommit_memory = 1
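
Before and after changing vm.swappiness, it is worth measuring whether the server is swapping at all. A minimal sketch, assuming the standard SwapTotal and SwapFree fields of /proc/meminfo (both reported in kB); swap_used_kb is an illustrative helper name, and the optional file argument exists only for testing.

```shell
#!/bin/sh
# swap_used_kb is a hypothetical helper name.
# Print swap currently in use, in kB, from a file in /proc/meminfo format.
swap_used_kb() {
    awk '/^SwapTotal:/ {total = $2}
         /^SwapFree:/  {free = $2}
         END {print total - free}' "${1:-/proc/meminfo}"
}

# Example against the live system:
#   swap_used_kb
```

A value that stays near zero suggests swappiness is not the problem; a value that keeps growing under load is a reason to look closer.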

FILE DESCRIPTORS — ULIMITS

Every open file and network connection uses a file descriptor. The default soft limit is typically 1024 per process. A busy server handling 10,000 connections needs far more.

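ulimit -n                       # current soft limit on open files for this shell
ulimit -n 65535                 # raise it for this session only (cannot exceed the hard limit shown by ulimit -Hn)
# Check the limit a running process actually has (pgrep -o picks the oldest match if there are several):
grep "Max open files" /proc/$(pgrep -o java)/limits
ls /proc/$(pgrep -o java)/fd | wc -l   # how many are actually open? (use sudo for another user's process)
# Permanent per-user limits in /etc/security/limits.conf (applied by PAM at login, not to systemd services):
sudo nano /etc/security/limits.conf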
# Add these lines:
*       soft    nofile  65535
*       hard    nofile  65535
tomcat  soft    nofile  65535
tomcat  hard    nofile  65535
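# systemd services ignore limits.conf; set the limit in the service file instead:
[Service]
LimitNOFILE=65535
# Then reload and restart: sudo systemctl daemon-reload && sudo systemctl restart <service>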
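
The two checks above (limit and actual usage) can be combined into one report. This is a sketch under stated assumptions: nofile_limits, fd_pct and fd_report are illustrative names, and the parsing relies on the "Max open files" row of /proc/<pid>/limits having the soft limit in the fourth column and the hard limit in the fifth.

```shell
#!/bin/sh
# Hypothetical helpers for comparing open descriptors against the limit.

# Print "soft hard" from a file in /proc/<pid>/limits format.
nofile_limits() {
    awk '/^Max open files/ {print $4, $5}' "$1"
}

# Percentage of the soft limit in use, given an open count and a soft limit.
fd_pct() {
    echo $(( $1 * 100 / $2 ))
}

# One-line report for a PID, read straight from /proc.
fd_report() {
    pid=$1
    open=$(ls "/proc/$pid/fd" | wc -l)
    limits=$(nofile_limits "/proc/$pid/limits")
    soft=${limits% *}
    hard=${limits#* }
    echo "pid=$pid open=$open soft=$soft hard=$hard used=$(fd_pct "$open" "$soft")%"
}

# Example (oldest matching java process; may need sudo):
#   fd_report "$(pgrep -o java)"
```

A process sitting above roughly 80% of its soft limit is a candidate for a higher LimitNOFILE, though the right threshold depends on how bursty the workload is.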

CHECKING CURRENT KERNEL BOTTLENECKS

# Are we hitting max connections?
ss -s | grep "TCP:"
cat /proc/sys/net/core/somaxconn
# Are we running out of file descriptors system-wide?
cat /proc/sys/fs/file-nr
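# Output: allocated  unused  max
# If allocated is close to max, you need to increase fs.file-max
# Are TIME_WAIT connections piling up? (-H drops the header line so it is not counted)
ss -H -tan state time-wait | wc -l
# More than about 10,000 is a warning sign, especially for outbound connections to one destination
# Are we dropping incoming connections?
netstat -s | grep -E "SYNs to LISTEN|listen queue"
# Counters that keep growing mean the accept queue is overflowing: somaxconn or the
# application's listen() backlog is too low, or the app is too slow to accept()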
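
The file-nr check is easy to turn into a single percentage for monitoring. A minimal sketch: file_nr_pct is an illustrative name, and it expects the three whitespace-separated numbers of /proc/sys/fs/file-nr (allocated, unused, max) on standard input.

```shell
#!/bin/sh
# file_nr_pct is a hypothetical helper name.
# Read "allocated unused max" from stdin and print allocated as an integer
# percentage of max.
file_nr_pct() {
    read -r allocated unused max
    echo $(( $allocated * 100 / $max ))
}

echo "5000 0 10000" | file_nr_pct   # prints 50

# Example against the live system:
#   file_nr_pct < /proc/sys/fs/file-nr
```

On recent kernels fs.file-max is often so large that this prints 0, which simply means the system-wide limit is not the constraint and the per-process limit deserves the attention.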

CPU SCHEDULER TUNING

# Check CPU scheduling policy:
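chrt -p $(pgrep -o java)       # what scheduling policy is this process using?
# For latency-sensitive processes (real-time scheduling). Use with care: a busy FIFO task
# can starve everything else on its CPU, and without -a only the main thread is changed:
sudo chrt -f -p 50 $(pgrep -o java)    # FIFO scheduling, priority 50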
# NUMA topology (for multi-socket servers):
numactl --hardware           # see NUMA nodes
numactl --cpunodebind=0 --membind=0 java -jar app.jar   # pin to NUMA node 0

After tuning, Rajan's server handled 12,000 concurrent connections on the same hardware that had choked at 1,000. CPU usage barely moved. The kernel had been the bottleneck all along. Kernel settings are free performance.

Key takeaways

net.core.somaxconn caps pending connections in each listening socket's queue: the old default of 128 (4096 since kernel 5.4) is too low for a high-traffic server

ulimit -n shows the max open files limit — a Java app handling 10k connections needs this at 65535 not 1024

vm.swappiness = 10 reduces swap usage on servers with enough RAM: the default of 60 can cause unnecessary swapping

net.ipv4.tcp_tw_reuse = 1 lets new outbound connections reuse sockets in TIME_WAIT, which helps prevent port exhaustion under high traffic

sysctl -p applies changes from a config file without rebooting: always verify afterwards by reading the parameter back, for example sysctl net.core.somaxconn

Commands from this chapter
$ sysctl -a | grep somaxconn
Check current max pending connection queue size
$ sudo sysctl -w net.core.somaxconn=65535
Increase connection queue (temporary — resets on reboot)
$ echo 'net.core.somaxconn=65535' | sudo tee -a /etc/sysctl.d/99-prod.conf && sudo sysctl -p /etc/sysctl.d/99-prod.conf
Make sysctl setting permanent
$ ulimit -n
Check max open files for current session
$ grep 'Max open files' /proc/$(pgrep -o java)/limits
Check file descriptor limit for running Java process
$ ss -H -tan state time-wait | wc -l
Count TIME_WAIT connections — high number means port exhaustion risk