System Monitoring and Scheduled Tasks

Summary: in this tutorial, you will learn to monitor system resources with top, htop, vmstat, and iostat, and to schedule automated tasks with cron and crontab.

Beyond managing individual processes, you need to monitor overall system health and schedule tasks to run automatically. This tutorial covers the tools for real-time system monitoring and the cron scheduling system that keeps servers running smoothly.

System Monitoring

Memory: free

# Show memory usage
free
# Output in kilobytes (default)
 
# Human-readable format
free -h
#               total        used        free      shared  buff/cache   available
# Mem:           16Gi        5.2Gi       3.1Gi       512Mi        7.7Gi        10Gi
# Swap:          2.0Gi          0B        2.0Gi
 

Understanding the output:

| Column | Meaning | Example |
| --- | --- | --- |
| total | Total physical memory installed | 16 GB |
| used | Memory actively in use by applications | 5.2 GB |
| free | Completely unused memory | 3.1 GB |
| shared | Memory shared between processes (tmpfs) | 512 MB |
| buff/cache | Memory used for disk buffers and cache (can be freed) | 7.7 GB |
| available | Memory available for new apps (free + reclaimable cache) | 10 GB |

Most important column: available

This is how much memory you can actually use for new programs. Linux uses "free" memory for caching files to speed up disk access. When you need memory, the cache is automatically freed.

Don't worry if "free" is low—check "available" instead!
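
To act on the available column from a script, you can express it as a percentage of total memory. The following is a minimal sketch, not a standard tool: the function name mem_available_pct and the 10% threshold in the usage comment are illustrative, and it assumes the column layout shown above (available is the seventh field of the Mem: row).

```shell
#!/usr/bin/env bash
# mem_available_pct: read `free` output on stdin and print the share of
# total memory that is available, as a whole-number percentage.
# Assumed layout: Mem: total used free shared buff/cache available
mem_available_pct() {
    awk '$1 == "Mem:" && $2 > 0 { printf "%d\n", $7 * 100 / $2 }'
}

# Example use (the 10% threshold is an arbitrary illustration):
# pct=$(free | mem_available_pct)
# if [ "$pct" -lt 10 ]; then echo "Low memory: only ${pct}% available"; fi
```

Reading from standard input keeps the function easy to try against saved output before pointing it at a live system.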

Disk: df and du

# Filesystem usage
df -h
# Filesystem      Size  Used Avail Use% Mounted on
# /dev/sda1        50G   35G   13G  74% /
# /dev/sdb1       100G   60G   36G  63% /home
 
# Inode usage (sometimes you run out of inodes before space)
df -hi
 
# Specific filesystem
df -h /home
 
# Disk usage by directory
du -sh /var/log/*
# 128M    /var/log/apache2
# 45M     /var/log/nginx
# 512K    /var/log/mysql
 
# Top 10 largest directories
du -h /home/alice | sort -rh | head -10
 
# Disk I/O statistics
iostat
iostat -x 2 5    # Extended stats, update every 2 seconds, 5 times
 
# What processes are using I/O
iotop            # Requires root
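
The df output above can also be filtered automatically, for example as the core of a disk-space check. This is a sketch under assumptions: disks_over is a hypothetical helper name, it expects the POSIX output format of df -P (one line per filesystem), and it does not handle mount points that contain spaces.

```shell
#!/usr/bin/env bash
# disks_over THRESHOLD: read `df -P` output on stdin and print the mount
# point and usage of every filesystem at or above THRESHOLD percent.
disks_over() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                      # "74%" -> "74"
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Example use:
# df -P | disks_over 90
```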
 

CPU: mpstat and vmstat

# Per-CPU statistics
mpstat -P ALL 2  # All CPUs, update every 2 seconds
 
# System performance overview
vmstat 2         # Update every 2 seconds
# procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
#  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
#  1  0      0 3219456 523456 7823456  0    0    12    45  234  456  3  2 95  0  0
 
# Key vmstat columns:
# r = runnable processes (running or waiting for CPU); consistently above the CPU count = CPU contention
# b = processes blocked on I/O
# si/so = swap in/out (high = RAM pressure)
# bi/bo = blocks in/out (disk activity)
# wa = CPU waiting for I/O (high = disk bottleneck)
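
The key columns above can be checked mechanically. The sketch below is one possible way to do that, not a standard utility: vmstat_warnings is a hypothetical name, the 20% I/O-wait limit is an arbitrary illustration, and the field positions assume the header shown above. Keep in mind that the first line vmstat prints reports averages since boot.

```shell
#!/usr/bin/env bash
# vmstat_warnings NCPU: read `vmstat` output on stdin and print a warning
# for each sample with a long run queue, swap activity, or high I/O wait.
# Field positions follow the header above: r=$1, si=$7, so=$8, wa=$16.
vmstat_warnings() {
    awk -v ncpu="$1" '
        $1 ~ /^[0-9]+$/ {                      # skip the two header lines
            if ($1 + 0 > ncpu + 0) print "run queue " $1 " exceeds " ncpu " CPUs"
            if ($7 + $8 > 0)       print "swapping: si=" $7 " so=" $8
            if ($16 + 0 >= 20)     print "high I/O wait: " $16 "%"
        }'
}

# Example use:
# vmstat 2 5 | vmstat_warnings "$(nproc)"
```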
 

Network: ss and netstat

# Modern tool: ss (socket statistics)
ss -tuln
# -t = TCP, -u = UDP, -l = listening, -n = numeric (no DNS lookup)
 
# Show process names (root is needed to see processes owned by other users)
sudo ss -tulnp
# tcp   LISTEN 0      128          0.0.0.0:22        0.0.0.0:*    users:(("sshd",pid=890,fd=3))
# tcp   LISTEN 0      128          0.0.0.0:80        0.0.0.0:*    users:(("nginx",pid=1234,fd=6))
 
# All connections (not just listening)
ss -tan
# -a = all (listening and established)
 
# Show specific port
ss -tulnp | grep :80
ss -tlnp 'sport = :80'    # ss filter syntax; a listening socket matches on its source port (sport), not dport
 
# Count connections per state
ss -tan | awk 'NR>1 {print $1}' | sort | uniq -c | sort -rn
#     150 ESTAB
#      12 TIME-WAIT
#       8 LISTEN
 
# Older tool: netstat (being replaced by ss)
netstat -tuln        # Same as ss -tuln
netstat -tulnp       # With process names
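
In scripts, you often only need a yes/no answer to "is anything listening on this port?". Here is a small sketch of that check; port_is_listening is a hypothetical helper, and it assumes ss -tln output, where the local address is the fourth column (adding -u inserts a Netid column and shifts the fields).

```shell
#!/usr/bin/env bash
# port_is_listening PORT: read `ss -tln` output on stdin and succeed
# (exit status 0) if some socket is listening on PORT.
port_is_listening() {
    awk -v port="$1" '
        NR > 1 && $4 ~ (":" port "$") { found = 1 }   # match ":PORT" at the end of the address
        END { exit(found ? 0 : 1) }'
}

# Example use:
# if ss -tln | port_is_listening 80; then echo "Port 80 is open"; fi
```

Anchoring the match at the end of the address avoids the false positive that a plain grep :80 gives for port 8080.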
 

Load Average and Uptime

uptime
# 10:30:00 up 15 days,  3:45,  2 users,  load average: 0.52, 0.45, 0.38
#          \_________/  \____/  \_______/  \_________________________/
#          system time   uptime  logged in  load: 1min, 5min, 15min
 
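# What is load average?
# - Average number of processes that are running or waiting for CPU time,
#   measured over the last 1, 5, and 15 minutes
# - On Linux it also counts processes in uninterruptible sleep (usually disk I/O wait),
#   so heavy I/O can raise the load even when the CPU is mostly idle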
 
# Interpreting load:
# - Load < number of CPUs = system has capacity
# - Load = number of CPUs = system fully utilized
# - Load > number of CPUs = system overloaded (processes waiting)
 
# Check CPU count
nproc           # Number of processing units
# 4
 
lscpu           # Detailed CPU info
# Architecture:            x86_64
# CPU(s):                  4
# Thread(s) per core:      2
# Core(s) per socket:      2
 
# Rule of thumb:
# System with 4 CPUs (nproc = 4):
# - Load 2.0 = 50% utilization (good)
# - Load 4.0 = 100% utilization (busy but OK)
# - Load 8.0 = 200% overload (2x more work than can be processed)
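
The rule of thumb above is simple enough to script. The following sketch applies it to any load value and CPU count; load_status is a hypothetical name, and the three labels simply mirror the interpretation given above.

```shell
#!/usr/bin/env bash
# load_status LOAD NCPU: compare a load average with the CPU count and
# print the utilization percentage with a short verdict.
load_status() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN {
        load += 0; ncpu += 0                   # force numeric comparison
        if (load < ncpu)       state = "has capacity"
        else if (load == ncpu) state = "fully utilized"
        else                   state = "overloaded"
        printf "%.0f%% of %d CPUs: %s\n", load * 100 / ncpu, ncpu, state
    }'
}

# Example use (the first field of /proc/loadavg is the 1-minute load):
# load_status "$(cut -d" " -f1 /proc/loadavg)" "$(nproc)"
```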
 

Scheduled Tasks with Cron

Cron runs commands on a schedule—essential for automation, backups, maintenance, monitoring.

Why cron matters:

  • Automation: Run scripts without human intervention
  • Backups: Automated nightly backups
  • Maintenance: Clean temp files, rotate logs, update databases
  • Monitoring: Regular health checks, send alerts
  • Reporting: Generate daily/weekly reports

Crontab Format


 ┌────────── minute (0-59)
 │ ┌────────── hour (0-23)
 │ │ ┌────────── day of month (1-31)
 │ │ │ ┌────────── month (1-12)
 │ │ │ │ ┌────────── day of week (0-7, Sun=0 or 7, Mon=1, ...)
 │ │ │ │ │
 * * * * * command to execute

Special characters:

  • * = any value
  • , = list (1,3,5 = 1st, 3rd, 5th)
  • - = range (1-5 = 1, 2, 3, 4, 5)
  • / = step (*/5 = every 5)
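
To see how these characters combine, it can help to expand a single field into the values it matches. The function below is a teaching sketch, not part of cron: expand_field is a hypothetical name, it handles only numeric fields (no names such as MON or JAN), and you pass the field's valid range yourself.

```shell
#!/usr/bin/env bash
# expand_field FIELD MIN MAX: print the values that one cron field matches.
# Handles * (any), lists (a,b), ranges (a-b), and steps (*/n or a-b/n).
expand_field() {
    local field=$1 min=$2 max=$3
    local parts part range step lo hi i
    local out=()
    IFS=',' read -ra parts <<< "$field"        # split the list on commas
    for part in "${parts[@]}"; do
        step=1
        range=$part
        if [[ $part == */* ]]; then            # "range/step"
            step=${part##*/}
            range=${part%%/*}
        fi
        if [[ $range == '*' ]]; then
            lo=$min hi=$max
        elif [[ $range == *-* ]]; then
            lo=${range%%-*} hi=${range##*-}
        else
            lo=$range hi=$range
        fi
        for ((i = lo; i <= hi; i += step)); do
            out+=("$i")
        done
    done
    echo "${out[*]}"
}

# Example use:
# expand_field '*/15' 0 59     # minute field
# expand_field '1-5' 0 7       # day-of-week field
```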

Common Cron Schedules

| When | Cron Expression | Explanation |
| --- | --- | --- |
| Every minute | * * * * * | Minute, hour, day, month, weekday all = any |
| Every 5 minutes | */5 * * * * | Every 5th minute (0, 5, 10, 15, ...) |
| Every 15 minutes | */15 * * * * | 0, 15, 30, 45 past the hour |
| Every hour | 0 * * * * | At minute 0 (top of every hour) |
| Every 6 hours | 0 */6 * * * | At 0:00, 6:00, 12:00, 18:00 |
| Daily at 2 AM | 0 2 * * * | 2:00 AM every day |
| Daily at midnight | 0 0 * * * | 12:00 AM every day |
| Every Monday at 9 AM | 0 9 * * 1 | 9:00 AM on Mondays (1=Mon) |
| First of every month | 0 0 1 * * | Midnight on the 1st |
| Every weekday at 8 AM | 0 8 * * 1-5 | Monday-Friday at 8:00 AM |
| Twice daily (8 AM, 8 PM) | 0 8,20 * * * | 8:00 and 20:00 every day |
| Every 30 min, business hours | */30 9-17 * * 1-5 | 9:00-17:30, Mon-Fri, every 30 min |
| Quarterly (Jan 1, Apr 1, ...) | 0 0 1 1,4,7,10 * | Midnight on 1st of Jan, Apr, Jul, Oct |

Managing Your Crontab

# Edit your crontab
crontab -e
# Opens your crontab in your default editor (usually vi or nano)
# First time: may ask you to choose an editor
 
# List your crontab
crontab -l
 
# Remove your crontab (CAREFUL!)
crontab -r
# Deletes entire crontab with no confirmation!
 
# Remove with prompt
crontab -ri    # -i = interactive, asks for confirmation
 
# Edit another user's crontab (requires root)
sudo crontab -u alice -e
sudo crontab -u alice -l
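
crontab -e is interactive, so setup scripts usually install jobs through a pipeline instead. The sketch below shows one way to add a job without creating duplicates on repeated runs; with_job is a hypothetical helper, and the pipeline in the usage comment relies on crontab - reading the new crontab from standard input, which the common Linux cron implementations support.

```shell
#!/usr/bin/env bash
# with_job JOB: read an existing crontab on stdin and print it with JOB
# appended. Any line identical to JOB is dropped first, so running this
# twice does not add the job twice.
with_job() {
    local job=$1
    grep -vxF -- "$job" || true    # grep exits 1 when no other lines remain; that is fine
    printf '%s\n' "$job"
}

# Example use:
# crontab -l 2>/dev/null | with_job '0 2 * * * /home/alice/scripts/backup.sh' | crontab -
```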
 

Cron Job Examples

# Edit your crontab
crontab -e
 
# Add these lines:
 
# Backup home directory every night at 2 AM
0 2 * * * tar czf /backup/home_$(date +\%Y\%m\%d).tar.gz /home/alice
 
# Clear old temp files every Sunday at midnight
0 0 * * 0 find /tmp -type f -mtime +7 -delete
 
# Check disk space every hour, alert if over 90%
0 * * * * /home/alice/scripts/check_disk.sh >> /home/alice/logs/disk.log 2>&1
 
# Database backup every day at 3 AM
0 3 * * * /home/alice/scripts/backup_db.sh
 
# Rotate logs every day at midnight
0 0 * * * /home/alice/scripts/rotate_logs.sh >> /var/log/rotation.log 2>&1
 
# Health check every 5 minutes
*/5 * * * * curl -sf http://localhost:8080/health || echo "$(date): Site down!" | mail -s "Alert" admin@example.com
 
# Generate weekly report every Monday at 8 AM
0 8 * * 1 /home/alice/scripts/weekly_report.sh
 
# Clean up old backups (keep 30 days) every day at 4 AM
0 4 * * * find /backup -name "*.tar.gz" -mtime +30 -delete
 
# Update system cache every 6 hours
0 */6 * * * /usr/bin/update-cache.sh
 
# Send reminder email every Friday at 5 PM
0 17 * * 5 echo "Don't forget to submit timesheet!" | mail -s "Reminder" alice@example.com
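
The first entry above writes \% because cron treats an unescaped % in the command field as a newline and passes everything after it to the command as standard input. One way to avoid the escaping is to move the command into a script. The following is a sketch of that approach; the file name backup_home.sh and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# backup_home.sh (hypothetical name): the nightly tar job as a script.
# Inside a script, % in the date format needs no backslash.

# backup_path DEST_DIR [STAMP]: build the archive path; STAMP defaults to today.
backup_path() {
    local dest_dir=$1
    local stamp=${2:-$(date +%Y%m%d)}
    printf '%s/home_%s.tar.gz\n' "$dest_dir" "$stamp"
}

# run_backup SRC_DIR DEST_DIR: archive SRC_DIR into DEST_DIR.
run_backup() {
    local src=$1 dest_dir=$2
    tar czf "$(backup_path "$dest_dir")" "$src"
}

# In the script body you would call:   run_backup /home/alice /backup
# The crontab entry then becomes:      0 2 * * * /home/alice/scripts/backup_home.sh
```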
 

Cron Environment and Best Practices

💡 Cron environment is minimal

Important: Cron runs with a minimal environment—no .bashrc, limited PATH, no aliases.

Best practices:

  1. Use absolute paths for commands and files:
    # BAD:
    0 2 * * * backup.sh
     
    # GOOD:
    0 2 * * * /home/alice/scripts/backup.sh
  2. Set PATH at top of crontab if needed:
    # Include the directory that holds your scripts, or cron still cannot find them
    PATH=/usr/local/bin:/usr/bin:/bin:/home/alice/scripts
    0 2 * * * backup.sh
  3. Redirect output to log files (or mail):
    # Silent (no output):
    0 2 * * * /home/alice/scripts/backup.sh > /dev/null 2>&1
     
    # Log output:
    0 2 * * * /home/alice/scripts/backup.sh >> /var/log/backup.log 2>&1
  4. Test commands manually before adding to cron:
    # Run your command in a minimal environment to simulate cron:
    env -i /bin/bash --noprofile --norc -c '/home/alice/scripts/backup.sh'
  5. Check cron logs for debugging:
    # Ubuntu/Debian:
    grep CRON /var/log/syslog
     
    # CentOS/RHEL:
    grep CRON /var/log/cron
     
    # View with journalctl (systemd):
    journalctl -u cron     # Ubuntu/Debian; the unit is named crond on CentOS/RHEL
  6. Set MAILTO for error notifications:
    MAILTO=alice@example.com
    0 2 * * * /home/alice/scripts/backup.sh
    # If backup.sh produces any output, it's emailed to alice
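
Several of these practices can be combined in a small wrapper that your cron scripts call. This is a minimal sketch under assumptions: run_logged is a hypothetical helper, and the log path in the usage comment is only an example.

```shell
#!/usr/bin/env bash
# run_logged LOGFILE COMMAND [ARGS...]: run COMMAND, append its output to
# LOGFILE between timestamped START and END lines, and return its exit status.
run_logged() {
    local logfile=$1
    shift
    local status=0
    printf '[%s] START %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$logfile"
    "$@" >> "$logfile" 2>&1 || status=$?
    printf '[%s] END exit=%d\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status" >> "$logfile"
    return "$status"
}

# Example use inside a cron script:
# run_logged /home/alice/logs/backup.log /home/alice/scripts/backup.sh
```

Recording the exit status makes a failed run visible in the log even when the command itself prints nothing.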

Testing Cron Jobs

# Run a test cron every minute to verify cron is working
* * * * * echo "Cron works: $(date)" >> /tmp/cron_test.log
 
# Wait 2 minutes, then check:
cat /tmp/cron_test.log
# Should see entries for each minute
 
# Remove test when done:
crontab -e  # Delete the test line
rm /tmp/cron_test.log
 

systemd Timers — Modern Alternative

On modern Linux systems (systemd), timers are an alternative to cron:

# List all active timers
systemctl list-timers
 
# Example output:
# NEXT                         LEFT          LAST                         PASSED  UNIT
# Wed 2026-02-11 00:00:00 UTC  3h 45min left Tue 2026-02-10 00:00:00 UTC  20h ago apt-daily.timer
# Wed 2026-02-11 06:00:00 UTC  9h left       Tue 2026-02-10 06:00:00 UTC  14h ago backup.timer
 
# Check a specific timer
systemctl status backup.timer
 
# Timer advantages over cron:
# - Better logging (journalctl)
# - Can depend on other services
# - Persistent across reboots (can catch missed runs)
# - More flexible scheduling
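
A timer always comes as a pair of unit files: a .service that says what to run and a .timer that says when. The sketch below writes such a pair for a daily 2 AM backup, roughly equivalent to the cron entry 0 2 * * *. The function name write_backup_units, the unit names, and the script path are illustrative assumptions; Persistent=true is the setting that lets systemd catch up on a run missed while the machine was off.

```shell
#!/usr/bin/env bash
# write_backup_units DIR: write a matching service/timer pair into DIR.
# Unit names and the ExecStart path are illustrative.
write_backup_units() {
    local dir=$1
    cat > "$dir/backup.service" <<'EOF'
[Unit]
Description=Nightly backup (example)

[Service]
Type=oneshot
ExecStart=/home/alice/scripts/backup.sh
EOF
    cat > "$dir/backup.timer" <<'EOF'
[Unit]
Description=Run backup.service every day at 02:00

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF
}

# Example use (as root):
# write_backup_units /etc/systemd/system
# systemctl daemon-reload
# systemctl enable --now backup.timer
```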
 

Exercises

🏋️ Exercise 1: Process Viewing and Management

Task 1: Find the 5 processes using the most memory. Show PID, user, memory percentage, and command.

Task 2: Find all processes owned by your user that are currently sleeping.

Task 3: Start a sleep 300 process in the background, then send it a SIGTERM signal (graceful termination).

Solution:
# Task 1: Top 5 memory consumers
ps aux --sort=-%mem | head -6
# Or with specific columns:
ps -eo pid,user,%mem,cmd --sort=-%mem | head -6
 
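# Task 2: Your sleeping processes
# Plain `ps -u` prints no state column, so request one with -o and filter on it:
ps -u "$USER" -o pid,stat,cmd | awk 'NR == 1 || $2 ~ /^S/'
# NR == 1 keeps the header; $2 is STAT, where S means interruptible sleep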
 
# Task 3: Background sleep and terminate
sleep 300 &
# Note the PID (e.g., 12345)
# Or find it:
pgrep -f "sleep 300"
# Kill gracefully:
kill $(pgrep -f "sleep 300")
# Verify it's gone:
jobs
pgrep -f "sleep 300"  # Should return nothing
 
🏋️ Exercise 2: Signal Handling Script

Task: Write a script that:

  1. Creates a temporary directory on start
  2. Writes "Working..." to a file in that directory every second
  3. Uses trap to clean up the temporary directory when the script exits (normally, Ctrl+C, or SIGTERM)
  4. Test it by running it, then interrupting with Ctrl+C
Solution:
#!/bin/bash
# signal_test.sh
 
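# Avoid naming the variable TMPDIR: mktemp and other tools read that environment variable
WORK_DIR=$(mktemp -d)
COUNTER=0
 
echo "Created temp directory: $WORK_DIR"
echo "PID: $$"
echo "Press Ctrl+C to stop..."
 
# Cleanup function (runs exactly once, from the EXIT trap)
cleanup() {
    echo ""
    echo "Cleaning up..."
    echo "Processed $COUNTER iterations"
    rm -rf "$WORK_DIR"
    echo "Removed $WORK_DIR"
}
 
# Trap signals: INT and TERM only exit; the EXIT trap then performs the cleanup.
# Trapping all three with the same handler would run cleanup twice.
trap cleanup EXIT
trap 'exit 130' INT
trap 'exit 143' TERM
 
# Main loop
while true; do
    COUNTER=$((COUNTER + 1))
    echo "Working... iteration $COUNTER" | tee -a "$WORK_DIR/work.log"
    sleep 1
done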
 
# Usage:
# chmod +x signal_test.sh
# ./signal_test.sh
# (Press Ctrl+C after a few seconds)
 
🏋️ Exercise 3: Cron Schedule Challenge

Task 1: Write a cron expression that runs a backup script every weekday (Monday-Friday) at 11:30 PM.

Task 2: Write a cron expression that runs a cleanup script every 15 minutes, but only between 9 AM and 5 PM.

Task 3: What does this cron expression do? 0 */4 1,15 * *

Task 4: Write a complete cron job that checks if a website is up every 5 minutes and logs the result.

Solution:
# Task 1: Weekday backup at 11:30 PM
30 23 * * 1-5 /home/user/scripts/backup.sh
 
# Explanation:
# 30 = minute 30
# 23 = hour 23 (11 PM)
# * = any day of month
# * = any month
# 1-5 = Monday through Friday
 
# Task 2: Cleanup every 15 minutes, 9 AM-5 PM
*/15 9-17 * * * /home/user/scripts/cleanup.sh
 
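# Explanation:
# */15 = every 15 minutes (0, 15, 30, 45)
# 9-17 = hours 9 through 17 (9 AM through 5 PM)
# Runs at 9:00, 9:15, 9:30, ..., 17:00, 17:15, 17:30, 17:45
# Note: hour 17 covers the whole hour, so the job keeps running until 17:45.
# To make 17:00 the last run, use two entries instead:
#   */15 9-16 * * * /home/user/scripts/cleanup.sh
#   0 17 * * * /home/user/scripts/cleanup.sh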
 
# Task 3: What does this do?
0 */4 1,15 * *
# Answer: Runs every 4 hours (0:00, 4:00, 8:00, 12:00, 16:00, 20:00)
#         on the 1st and 15th of every month
# Use case: Semi-monthly reports or checks
 
# Task 4: Website uptime monitor
*/5 * * * * /usr/bin/curl -sf https://example.com -o /dev/null && echo "$(date): UP" >> /home/user/uptime.log || echo "$(date): DOWN" >> /home/user/uptime.log 2>&1
 
# Or better, as a script:
# check_website.sh:
#!/bin/bash
LOGFILE="$HOME/website_uptime.log"
if curl -sf https://example.com -o /dev/null; then
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] Website is UP" >> "$LOGFILE"
else
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] Website is DOWN!" >> "$LOGFILE"
    # Optional: send alert
    # echo "Website down!" | mail -s "Alert" admin@example.com
fi
 
# Then in crontab:
*/5 * * * * /home/user/scripts/check_website.sh
 
🏋️ Exercise 4: Complete Process Management Workflow

Task: You notice your server is running slow. Walk through the diagnostic process:

  1. Check load average and interpret it
  2. Find the top 3 CPU-consuming processes
  3. Find the top 3 memory-consuming processes
  4. Check if any processes are in uninterruptible sleep (state D)
  5. If you find a misbehaving process (let's say PID 12345), demonstrate the proper way to terminate it
Solution:
# Step 1: Check load average
uptime
# Output: 10:30:00 up 15 days, 3:45, 2 users, load average: 8.52, 6.45, 4.38
 
# Interpretation:
nproc  # Check CPU count, e.g., 4
# Load 8.52 on a 4-core system = 213% utilization = OVERLOADED
# Load is increasing (4.38 → 6.45 → 8.52) = getting worse
 
# Step 2: Top 3 CPU consumers
ps aux --sort=-%cpu | head -4
# USER       PID %CPU %MEM     VSZ     RSS TTY      STAT START   TIME COMMAND
# alice    12345 95.2 15.2 2345678 1234567 ?        R    10:25  15:30 python rogue_script.py
# bob      23456 45.3  2.1  234567   87654 ?        R    09:00   5:15 node server.js
# carol    34567 12.1  0.8  123456   65432 ?        S    08:30   2:45 ruby app.rb
 
# Step 3: Top 3 memory consumers
ps aux --sort=-%mem | head -4
# USER       PID %CPU %MEM     VSZ     RSS TTY      STAT START   TIME COMMAND
# alice    12345 95.2 15.2 2345678 1234567 ?        R    10:25  15:30 python rogue_script.py
# dave     45678  1.2 12.3 1234567  987654 ?        S    07:00   3:21 java -jar app.jar
# eve      56789  0.8  8.5  987654  654321 ?        S    06:00   1:45 mysql
 
# Step 4: Check for uninterruptible sleep (D state)
ps aux | awk '$8 ~ /D/ {print $0}'
# If any processes show up, they're stuck waiting for I/O
# This might indicate disk problems or network issues
 
# Step 5: Properly terminate PID 12345
# First, investigate what it is:
ps -p 12345 -o pid,user,%cpu,%mem,cmd
# Try graceful termination:
kill 12345
echo "Sent SIGTERM, waiting 10 seconds..."
sleep 10
 
# Check if it's still running:
if ps -p 12345 > /dev/null 2>&1; then
    echo "Process still running, trying SIGTERM again..."
    kill 12345
    sleep 5
 
    if ps -p 12345 > /dev/null 2>&1; then
        echo "Process not responding, force killing..."
        kill -9 12345
        echo "Process force-terminated"
    else
        echo "Process terminated on second SIGTERM"
    fi
else
    echo "Process terminated gracefully"
fi
 
# Comprehensive diagnostic script:
#!/bin/bash
echo "=== System Diagnostic ==="
echo ""
echo "Load Average:"
uptime
echo ""
echo "CPU Count:"
nproc
echo ""
echo "Top 5 CPU Consumers:"
ps aux --sort=-%cpu | head -6
echo ""
echo "Top 5 Memory Consumers:"
ps aux --sort=-%mem | head -6
echo ""
echo "Processes in Uninterruptible Sleep (D state):"
ps aux | awk '$8 ~ /D/'
echo ""
echo "Memory Status:"
free -h
echo ""
echo "Disk Usage:"
df -h
 

Summary

You now understand Linux process management:

Viewing Processes:

  • ps aux — snapshot of all processes
  • top / htop — real-time monitoring
  • pgrep — find processes by name
  • Process states: R (running), S (sleeping), D (disk wait), Z (zombie)

Job Control:

  • command & — run in background
  • Ctrl+Z — suspend foreground job
  • jobs — list background jobs
  • fg / bg — move between foreground/background
  • nohup / disown — survive logout

Signals:

  • kill PID — send SIGTERM (graceful)
  • kill -9 PID — send SIGKILL (force)
  • killall / pkill — kill by name/pattern
  • trap — catch signals in scripts

Monitoring:

  • free -h — memory usage
  • df -h — disk usage
  • uptime — load average
  • ss -tulnp — network connections

Scheduling:

  • crontab -e — edit scheduled tasks
  • Format: minute hour day month weekday command
  • Always use absolute paths in cron
  • Redirect output to logs

Next steps: Learn advanced scripting techniques including error handling, debugging, and production-ready script patterns.


Written by the ShellRAG Team

The ShellRAG editorial team writes practical, beginner-friendly Bash Shell tutorials with tested code examples and real-world use cases. Every article is technically reviewed for accuracy and updated regularly.
