Process Management

Summary: in this tutorial, you will learn to understand Linux processes, view and manage running processes, control jobs, and send signals.

Every command you run—from ls to launching a web server—creates a process: an instance of a running program. Mastering process management is fundamental to system administration, debugging, and automation.

Why process management matters:

  • System stability: Identify and stop misbehaving programs
  • Performance tuning: Find resource hogs and optimize
  • Debugging: Understand what's running and why
  • Automation: Schedule tasks and manage long-running services
  • Job control: Run multiple tasks simultaneously

This tutorial covers viewing processes, job control, signals, system monitoring, and scheduled tasks with cron.

Understanding Processes

A process is a running instance of a program with its own:

  • PID (Process ID): Unique number identifying this process
  • PPID (Parent PID): The process that started this one
  • Owner: User who started it (determines permissions)
  • State: Running, sleeping, stopped, zombie
  • Priority: How much CPU time it gets
  • Memory: Address space with code, data, stack, heap
  • File descriptors: Open files, network connections, pipes
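As a rough illustration, several of these attributes can be read for the shell you are running right now. This sketch assumes a Linux system with the procps version of ps and a mounted /proc; the variable names are only illustrative.

```shell
#!/bin/bash
# Inspect the attributes of the shell running this script.
my_pid=$$          # PID of this shell
parent_pid=$PPID   # PID of the process that started it

echo "PID:  $my_pid"
echo "PPID: $parent_pid"

# Ask ps for selected attributes; a trailing '=' suppresses the column header.
ps -o pid=,ppid=,user=,stat=,ni=,rss= -p "$my_pid"

# Count open file descriptors via /proc (Linux-specific).
fd_count=$(ls "/proc/$my_pid/fd" | wc -l)
echo "Open file descriptors: $fd_count"
```

The ps line prints PID, PPID, owner, state, nice value, and resident memory in that order.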

Process hierarchy:

Every process (except PID 1) has a parent. This creates a tree:

systemd (PID 1) — the init process, parent of everything
├── bash (PID 1234) — your shell
│   ├── ls (PID 1235) — command you ran
│   └── grep (PID 1236) — pipeline command
├── sshd (PID 567) — SSH server
│   └── sshd (PID 1240) — your SSH session
│       └── bash (PID 1241) — your login shell
└── nginx (PID 890) — web server
    ├── nginx (PID 891) — worker process
    └── nginx (PID 892) — worker process

When a parent process dies, its children are orphaned and adopted by PID 1 (systemd or init), or by a designated "subreaper" process if one exists.
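Adoption can be observed with a small experiment. The sketch below assumes bash and procps ps; the parent PID it reports is often 1, but it may be another process (for example a per-user systemd instance or a container init), so treat the printed number as system-dependent.

```shell
#!/bin/bash
# The inner subshell starts sleep in the background and exits at once,
# leaving sleep without its original parent.
orphan_pid=$( (sleep 30 >/dev/null 2>&1 & echo $!) )

sleep 1   # give the kernel a moment to re-parent the orphan

new_parent=$(ps -o ppid= -p "$orphan_pid" | tr -d ' ')
echo "sleep (PID $orphan_pid) now has parent PID $new_parent"

kill "$orphan_pid"   # clean up the demo process
```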

Why understanding processes matters:

  • Debugging: "Why is my script hanging?" → Check child processes
  • Resource usage: "Why is the server slow?" → Find CPU/memory hogs
  • Security: "What's running on my system?" → Audit processes
  • Cleanup: Ensure background jobs don't outlive their purpose

Viewing Processes

ps — Process Snapshot

ps shows a snapshot of current processes (not real-time):

# Your processes only (minimal info)
ps
# Output:
#   PID TTY          TIME CMD
#  1234 pts/0    00:00:00 bash
#  1235 pts/0    00:00:00 ps
 
# All processes, BSD-style (most common on Linux)
ps aux
# 'a' = all users, 'u' = user-oriented format, 'x' = include processes without TTY
 
# All processes, UNIX-style
ps -ef
# '-e' = all processes, '-f' = full format
 
# Why two styles? Historical reasons (BSD vs. System V Unix)
# Use whichever you prefer; ps aux is more common on Linux
 

Understanding ps aux Output


USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1 169356 13296 ?        Ss   Jan10   0:05 /sbin/init
alice     1234  2.5  1.2 345678 98765 pts/0    S+   10:30   0:15 python app.py
bob       5678  0.1  0.5 123456 45678 ?        Ss   09:00   0:02 /usr/sbin/sshd

| Column  | Description                                      | Example       |
|---------|--------------------------------------------------|---------------|
| USER    | Process owner                                    | alice         |
| PID     | Process ID                                       | 1234          |
| %CPU    | CPU usage percentage                             | 2.5%          |
| %MEM    | Memory usage percentage                          | 1.2%          |
| VSZ     | Virtual memory size (KB): total allocated        | 345678 KB     |
| RSS     | Resident Set Size (KB): actual memory in RAM     | 98765 KB      |
| TTY     | Terminal (? = no terminal/daemon)                | pts/0         |
| STAT    | Process state (see below)                        | S+            |
| START   | When the process started                         | 10:30         |
| TIME    | Total CPU time consumed                          | 0:15          |
| COMMAND | The command that started the process             | python app.py |

VSZ vs RSS:

  • VSZ (Virtual Size): Total memory allocated (includes shared libraries, mapped files)
  • RSS (Resident Set Size): Actual physical RAM in use
  • RSS is more important—it's what actually impacts available memory

Process States (STAT Column)

| State | Meaning                   | When It Happens                                     |
|-------|---------------------------|-----------------------------------------------------|
| R     | Running or runnable       | Actively using CPU or waiting for CPU time          |
| S     | Sleeping (interruptible)  | Waiting for input/event (can be woken by signal)    |
| D     | Sleeping (uninterruptible)| Usually waiting for I/O (can't be interrupted)      |
| T     | Stopped                   | Paused by signal (Ctrl+Z) or debugger               |
| Z     | Zombie                    | Finished but parent hasn't read its exit status yet |
| <     | High priority             | Nice value < 0 (gets more CPU)                      |
| N     | Low priority              | Nice value > 0 (gets less CPU)                      |
| s     | Session leader            | Leader of a session (e.g., a login shell or daemon) |
| l     | Multi-threaded            | Has multiple threads                                |
| +     | Foreground process group  | In the foreground of its terminal                   |

Common combinations:

  • S+ = Sleeping, in foreground (e.g., your shell waiting for input)
  • Ss = Sleeping, session leader (typical daemon)
  • R+ = Running in foreground
  • Z = Zombie (usually harmless, but many zombies indicate a bug)
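To get a feel for these states on a live system, you can tally the first character of the STAT column. This is a minimal sketch assuming procps ps; the function names count_states and list_zombies are made up for the example.

```shell
#!/bin/bash
# Summarise how many processes are in each state.
count_states() {
    # stat= prints the STAT column without a header; the first character
    # is the main state (R, S, D, T, Z, ...), the rest are modifiers.
    ps -eo stat= | cut -c1 | sort | uniq -c | sort -rn
}

# List zombies together with their parent PID, since the parent is the
# process that has not yet collected their exit status.
list_zombies() {
    ps -eo pid=,ppid=,stat=,comm= | awk '$3 ~ /^Z/'
}

count_states
list_zombies
```

On a healthy system list_zombies usually prints nothing.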

Sorting and Filtering ps Output

# Sort by memory usage (top 10)
ps aux --sort=-%mem | head -11
# --sort=-%mem means sort by memory, descending (- = reverse)
 
# Sort by CPU usage
ps aux --sort=-%cpu | head -11
 
# Sort by process start time
ps aux --sort=start_time
 
# Multiple sorts (CPU, then memory)
ps aux --sort=-%cpu,-%mem | head -11
 
# Custom columns
ps -eo pid,ppid,user,%cpu,%mem,cmd --sort=-%cpu | head -15
# -e = all processes
# -o = output format (specify columns)
# Available columns: pid, ppid, user, %cpu, %mem, vsz, rss, tty, stat, start, time, cmd, and many more
 
# Find specific processes
ps aux | grep nginx
ps -C nginx                      # By command name
ps -u alice                      # By user
ps -p 1234,5678                  # By PID
 
# Process tree (parent-child relationships)
ps auxf                          # 'f' = forest (ASCII tree)
ps -ejH                          # UNIX-style tree
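The sorting and custom-column options combine well with awk for quick reports. The following sketch, which assumes procps ps (for --sort) and uses a hypothetical helper name top_rss, prints the largest memory consumers with RSS converted from KiB to MiB.

```shell
#!/bin/bash
# Print the N processes with the largest resident memory, in MiB.
top_rss() {
    local n=${1:-5}
    # rss= is reported in KiB; --sort=-rss orders largest first.
    # Only the first word of the command name is shown.
    ps -eo pid=,rss=,comm= --sort=-rss | head -n "$n" |
        awk '{ printf "%-8s %8.1f MiB  %s\n", $1, $2 / 1024, $3 }'
}

top_rss 3
```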
 

pstree — Visualize Process Hierarchy

# Show process tree
pstree
 
# With PIDs
pstree -p
 
# For a specific user
pstree alice
 
# For a specific process and its descendants
pstree -p 1234
 
# Example output:
# systemd─┬─ModemManager───2*[{ModemManager}]
#         ├─NetworkManager───2*[{NetworkManager}]
#         ├─accounts-daemon───2*[{accounts-daemon}]
#         ├─bash───pstree
#         ├─nginx───nginx
#         └─sshd───sshd───bash───python
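If pstree is not installed, the same parent-child chain can be followed by hand. This sketch assumes only ps; it walks from the current shell up through its ancestors by repeatedly looking up the PPID.

```shell
#!/bin/bash
# Walk from this shell up through its ancestors using only ps.
pid=$$
chain=""
while [ "${pid:-0}" -gt 0 ]; do
    name=$(ps -o comm= -p "$pid") || break   # stop if the PID is not visible
    chain="${chain:+$chain <- }$name($pid)"
    pid=$(ps -o ppid= -p "$pid" | tr -d ' ')
done
echo "$chain"
```

The loop ends when it reaches a process whose parent PID is 0, which is normally PID 1.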
 

top — Real-Time Process Monitor

top continuously updates to show process activity:

top
 
# Key information in top:
# - Load average (1, 5, 15 minute averages)
# - Tasks: running, sleeping, stopped, zombie
# - %CPU: user processes, system, idle, I/O wait
# - Memory: total, used, free, buffers, cache
# - Process list: sorted by CPU usage by default
 

top interactive commands:

| Key    | Action                | Why Use It                        |
|--------|-----------------------|-----------------------------------|
| q      | Quit                  | Exit top                          |
| M      | Sort by memory        | Find memory hogs                  |
| P      | Sort by CPU           | Find CPU hogs (default)           |
| T      | Sort by time          | Find long-running processes       |
| k      | Kill a process        | Terminate misbehaving process     |
| r      | Renice                | Change process priority           |
| 1      | Toggle per-CPU view   | See load on each core             |
| c      | Toggle full command   | See complete command line         |
| u      | Filter by user        | Show only one user's processes    |
| f      | Choose display fields | Customize what columns show       |
| E      | Change memory units   | Switch between KB, MB, GB         |
| h or ? | Help                  | Show all commands                 |

top header explained:


top - 10:30:15 up 15 days,  3:45,  2 users,  load average: 0.52, 0.45, 0.38
Tasks: 287 total,   1 running, 286 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.2 us,  1.5 sy,  0.0 ni, 94.8 id,  0.5 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  16384.0 total,   3218.5 free,   5342.2 used,   7823.3 buff/cache
MiB Swap:   2048.0 total,   2048.0 free,      0.0 used.  10127.8 avail Mem

  • Load average: 0.52, 0.45, 0.38 = 1-min, 5-min, 15-min averages
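    • A sustained load above the number of CPU cores means tasks are queuing for CPU time (on Linux the figure also counts tasks in uninterruptible I/O wait)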
    • Check cores: nproc
  • CPU breakdown:
    • us (user): CPU time in user processes
    • sy (system): CPU time in kernel
    • id (idle): CPU doing nothing
    • wa (wait): Waiting for I/O (high = disk bottleneck)
  • Memory: avail Mem is most important—actual memory available for new programs
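The load-versus-cores rule of thumb can be checked from a script. This is a minimal sketch assuming Linux (/proc/loadavg) and the coreutils nproc command; the threshold of "more than one runnable task per core" is the same heuristic described above, not a hard limit.

```shell
#!/bin/bash
# Compare the 1-minute load average with the number of CPU cores.
cores=$(nproc)
read -r load1 load5 load15 _ < /proc/loadavg

# Bash has no floating-point arithmetic, so awk does the comparison.
if awk -v l="$load1" -v c="$cores" 'BEGIN { exit !(l > c) }'; then
    echo "1-min load $load1 exceeds $cores core(s): tasks are queuing"
else
    echo "1-min load $load1 is within $cores core(s)"
fi
```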

htop — Improved top

htop is a more user-friendly alternative to top:

# Install htop
sudo apt install htop    # Debian/Ubuntu
sudo yum install htop    # CentOS/RHEL (typically requires the EPEL repository)
sudo dnf install htop    # Fedora
 
# Run htop
htop
 

htop advantages:

  • Color-coded bars for CPU, memory, swap
  • Mouse support (click to sort, select processes)
  • Tree view toggle (F5)
  • Easier to kill/renice processes (F9 to kill, F7/F8 to change the nice value)
  • Search processes (F3)
  • Filter by text (F4) or by user (u)

pgrep and pkill — Find and Signal by Name

# Find PIDs by name
pgrep nginx
# Output (one PID per line):
# 890
# 891
# 892
 
# Show PID and name
pgrep -l nginx
# Output: 890 nginx
#         891 nginx
#         892 nginx
 
# Show PID and full command
pgrep -a nginx
# Output: 890 nginx: master process /usr/sbin/nginx
#         891 nginx: worker process
 
# Find by user
pgrep -u alice
pgrep -u alice python    # Alice's python processes
 
# Count matching processes
pgrep -c nginx
# Output: 3
 
# Most recent match
pgrep -n nginx
 
# Oldest match
pgrep -o nginx
 
# Exact match only
pgrep -x bash    # Matches "bash" but not "bash-script"
 
# Kill by name (careful!)
pkill nginx
pkill -u alice python    # Kill alice's python processes
pkill -f "python app.py" # Kill by full command line
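Because pgrep and pkill accept the same selectors, you can preview a match and then reuse the exact same options to signal it. The sketch below keeps the demonstration harmless by restricting the match to children of the current shell with -P $$; the sleep process is just a stand-in for a real target.

```shell
#!/bin/bash
# Start a disposable target so the demo never touches real services.
sleep 300 &
target=$!

# Preview first: children of this shell (-P $$) named exactly "sleep" (-x).
matches=$(pgrep -P $$ -x sleep)
echo "Would signal: $matches"

# Same selectors, now sending the default SIGTERM.
pkill -P $$ -x sleep

wait "$target"
status=$?
echo "Exit status: $status"   # 143 = 128 + 15 (terminated by SIGTERM)
```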
 

Job Control

Job control lets you run multiple processes from one shell, switching between foreground and background.

Foreground vs Background

  • Foreground: Process takes over your terminal; you can't run other commands until it finishes
  • Background: Process runs independently; you can continue working

# Run in foreground (default)
sleep 60
# Terminal blocked for 60 seconds
 
# Run in background (add & at the end)
sleep 60 &
# [1] 12345              ← Job number 1, PID 12345
# Terminal immediately available
 
# Multiple background jobs
sleep 30 &    # Job 1
sleep 40 &    # Job 2
sleep 50 &    # Job 3
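In scripts, background jobs are usually paired with $! (the PID of the most recent background job) and the wait builtin. The following sketch assumes bash; the two subshells are placeholders for real work and exit with different codes so that the collected statuses are visible.

```shell
#!/bin/bash
# Start two background jobs and remember their PIDs via $!.
(sleep 1; exit 0) &
first=$!
(sleep 1; exit 3) &
second=$!

echo "Started jobs with PIDs $first and $second"

# wait blocks until the given PID finishes and returns its exit status.
wait "$first";  first_status=$?
wait "$second"; second_status=$?

echo "first exited with $first_status, second with $second_status"
```

Both jobs run concurrently, so the whole script takes about one second rather than two.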
 

Managing Jobs

# List jobs
jobs
# [1]   Running                 sleep 30 &
# [2]-  Running                 sleep 40 &
# [3]+  Running                 sleep 50 &
# + = current job (the default for fg and bg), - = previous job
 
# List with PIDs
jobs -l
# [1]  12345 Running             sleep 30 &
# [2]  12346 Running             sleep 40 &
 
# Bring a background job to foreground
fg %1                    # By job number
fg                       # Most recent job (the + one)
 
# Suspend a foreground process (pause it)
# Run: sleep 60
# Press: Ctrl+Z
# Output: [1]+  Stopped     sleep 60
 
# Resume a stopped job in the background
bg %1
# [1]+ sleep 60 &
 
# Resume in foreground
fg %1
 

Practical workflow:

# Start editing a file
vim large_file.txt
 
# Realize you need to run a command
# Press Ctrl+Z
# [1]+  Stopped                 vim large_file.txt
 
# Run your command
ls -l
 
# Return to editing
fg
 

Long-Running Background Jobs

# Run a command, then decide to background it
find / -name "*.log" 2>/dev/null    # Oops, this is slow!
# Press Ctrl+Z to suspend
# [1]+  Stopped     find / -name "*.log" 2>/dev/null
bg
# Now it runs in background while you continue working
 
# Problem: when the terminal closes, the shell passes SIGHUP to its background jobs, which kills them by default!
# Solution: nohup or disown
 

nohup — Survive Logout

When you close your terminal, the shell receives SIGHUP (hangup) and passes it on to its jobs, which terminate by default. nohup (no hangup) starts a command with SIGHUP ignored so it keeps running:

# Run a command that survives logout
nohup ./long_running_script.sh &
# Output appended to nohup.out by default
 
# Redirect output to a specific file
nohup ./script.sh > output.log 2>&1 &
# > output.log = redirect stdout
# 2>&1 = redirect stderr to same file as stdout
# & = run in background
 
# Check if it's running after logout/login
ps aux | grep script.sh
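A common companion to nohup is a PID file, so the job can be found again without grepping ps output. This is a minimal sketch: the inline sh -c command stands in for a real long-running script, and the file names job.log and job.pid are assumptions made for the example.

```shell
#!/bin/bash
workdir=$(mktemp -d)            # scratch directory for the demo
log="$workdir/job.log"
pidfile="$workdir/job.pid"

# Start the job immune to SIGHUP, capture all output, and record its PID.
nohup sh -c 'echo "job started"; sleep 1; echo "job finished"' > "$log" 2>&1 &
echo $! > "$pidfile"

# Later (even from a new login session) the PID file identifies the job.
job_pid=$(cat "$pidfile")
if kill -0 "$job_pid" 2>/dev/null; then
    echo "Job $job_pid is still running"
fi

wait "$job_pid"                 # only possible from the shell that started it
cat "$log"
```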
 

disown — Detach from Shell

disown removes a job from the shell's job table, so the shell no longer sends it SIGHUP when it exits:

# Start a job
./long_script.sh &
# [1] 12345
 
# Disown it (shell forgets about it)
disown %1
 
# Or disown most recent job
disown
 
# Now you can close the terminal without killing the process
 
# Verify it's still running
ps -p 12345
 

nohup vs disown:

  • nohup: Run from the start with hangup immunity
  • disown: Remove an already-running job from shell control

Signals

Signals are messages sent to processes to control behavior. Think of them as inter-process "push notifications."

Why signals matter:

  • Terminate processes: Stop misbehaving programs
  • Reload configuration: Tell services to re-read config files without restarting
  • Debugging: Trigger core dumps for analysis
  • Process control: Pause, resume, or notify processes

Common Signals

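| Signal  | Number | Shortcut | Default Action | Description                           | Use Case                 |
|---------|--------|----------|----------------|---------------------------------------|--------------------------|
| SIGHUP  | 1      | -        | Terminate      | Hangup (terminal closed)              | Reload config            |
| SIGINT  | 2      | Ctrl+C   | Terminate      | Interrupt                             | Stop a command           |
| SIGQUIT | 3      | Ctrl+\   | Core dump      | Quit with core dump                   | Debug crash              |
| SIGKILL | 9      | -        | Terminate      | Kill (cannot be caught!)              | Force kill               |
| SIGTERM | 15     | -        | Terminate      | Polite termination request            | Graceful shutdown        |
| SIGSTOP | 19     | -        | Stop           | Pause process (cannot be caught!)     | Pause via kill -STOP     |
| SIGTSTP | 20     | Ctrl+Z   | Stop           | Terminal stop request (can be caught) | Suspend a foreground job |
| SIGCONT | 18     | -        | Continue       | Resume a stopped process              | Resume after Ctrl+Z      |
| SIGUSR1 | 10     | -        | Terminate      | User-defined signal 1                 | Custom use               |
| SIGUSR2 | 12     | -        | Terminate      | User-defined signal 2                 | Custom use               |

Signal numbers shown are those used on x86 and ARM Linux; run kill -l to check the numbering on your system.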

Key differences:

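| Aspect        | SIGTERM (15)                       | SIGKILL (9)        |
|---------------|------------------------------------|--------------------|
| Can be caught | Yes, process can handle it         | No, instant death  |
| Cleanup       | Process can close files, save data | No cleanup         |
| Speed         | Takes time for graceful shutdown   | Instant            |
| Data safety   | Safe                               | Risk of corruption |
| Use when      | First attempt to stop              | Last resort        |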
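The difference is easy to see with two copies of a small worker script. In this sketch (bash assumed; worker.sh and its marker files are invented for the demo) each worker installs a SIGTERM handler that writes a marker file. One worker is sent SIGTERM and the other SIGKILL.

```shell
#!/bin/bash
workdir=$(mktemp -d)

# Hypothetical worker: on SIGTERM it writes a marker file, then exits cleanly.
cat > "$workdir/worker.sh" <<'EOF'
#!/bin/bash
trap 'echo cleaned > "$1.marker"; exit 0' TERM
echo ready > "$1.ready"          # signal that the handler is installed
while :; do sleep 0.1; done
EOF

bash "$workdir/worker.sh" "$workdir/a" & a=$!
bash "$workdir/worker.sh" "$workdir/b" & b=$!

# Wait until both workers have installed their handlers.
until [ -e "$workdir/a.ready" ] && [ -e "$workdir/b.ready" ]; do sleep 0.1; done

kill -TERM "$a"; wait "$a"; term_status=$?   # handler runs, worker exits 0
kill -KILL "$b"; wait "$b"; kill_status=$?   # no handler runs; 137 = 128 + 9

echo "SIGTERM worker: status $term_status, marker: $(ls "$workdir" | grep -c 'a.marker')"
echo "SIGKILL worker: status $kill_status, marker: $(ls "$workdir" | grep -c 'b.marker')"
```

Only the worker that received SIGTERM leaves a marker file behind, which is the "cleanup" row of the table in practice.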

kill — Send Signals

# Polite termination (SIGTERM = 15, default)
kill 12345
kill -15 12345            # Explicit SIGTERM
kill -TERM 12345          # By name
 
# How SIGTERM works:
# 1. The kernel delivers SIGTERM to PID 12345
# 2. If the process installed a handler, the handler runs
#    (close files, flush buffers, save state) and the process exits
# 3. If there is no handler, the default action terminates the process immediately
 
# Force kill (SIGKILL = 9, last resort!)
kill -9 12345
kill -SIGKILL 12345
 
# When to use kill -9:
# - Process doesn't respond to SIGTERM
# - Process is frozen/hung
# Note: a process in uninterruptible sleep (state D) generally will not die
# until the I/O it is waiting on completes, not even with kill -9
# WARNING: No cleanup happens! Files may be corrupted!
 
# Reload configuration (common with daemons)
kill -HUP 12345
# Example: nginx, sshd reload config without disconnecting users
 
# Pause a process
kill -STOP 12345
# Resume it
kill -CONT 12345
 
# Send user-defined signals
kill -USR1 12345    # App-specific behavior
kill -USR2 12345    # App-specific behavior
 
# List all signals
kill -l
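Two lesser-known uses of kill are handy in scripts: translating between signal names and numbers, and testing whether a process exists. The sketch below relies on the kill builtin in bash; other shells may format the output of kill -l differently.

```shell
#!/bin/bash
# Translate between signal numbers and names (bash builtin kill).
kill -l 15          # prints TERM
kill -l KILL        # prints 9

# An exit status above 128 usually means "killed by signal (status - 128)".
kill -l 143         # prints TERM

# Signal 0 sends nothing; it only tests whether the PID exists and
# whether you are allowed to signal it.
if kill -0 $$ 2>/dev/null; then
    echo "PID $$ exists"
fi
```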
 

⚠️ Always try SIGTERM before SIGKILL

Best practice for terminating processes:

  1. Try kill PID (SIGTERM) first
  2. Wait 5-10 seconds
  3. Check if it's still running: ps -p PID
  4. Only if still alive, use kill -9 PID

SIGTERM allows cleanup (closing files, releasing locks, saving state). SIGKILL is immediate termination with no cleanup—risk of corrupted data or orphaned resources.

# Safe kill script
safe_kill() {
    local pid=$1
    echo "Sending SIGTERM to $pid..."
    kill "$pid" 2>/dev/null || return 1

    # Poll once per second for up to 10 seconds
    local i
    for i in {1..10}; do
        sleep 1
        if ! ps -p "$pid" > /dev/null 2>&1; then
            echo "Process $pid terminated gracefully"
            return 0
        fi
    done

    echo "Process $pid didn't respond, forcing kill..."
    kill -9 "$pid"
}
 

killall and pkill

# Kill all processes with a specific name
killall nginx
killall -TERM nginx      # Explicit SIGTERM
killall -9 nginx         # Force kill all
 
# Kill all processes owned by a user
killall -u alice
 
# Interactive confirmation
killall -i nginx         # Prompts before each kill
 
# pkill — kill by pattern matching
pkill python             # All python processes
pkill -f "app.py"        # Match full command line
pkill -u alice python    # Alice's python processes
pkill -x python3         # Exact name match only
 
# pkill with signals
pkill -HUP nginx         # Reload nginx
pkill -TERM python       # Gracefully stop all python
 
# Caution: killall and pkill are dangerous!
# Always test with pgrep first:
pgrep -a nginx           # See what would be affected
# Then kill:
pkill nginx
 

Trapping Signals in Scripts

Scripts can catch signals and run custom cleanup code:

#!/bin/bash
# trap 'commands' SIGNALS
 
# Clean up temp files when script exits
TEMP_FILE=$(mktemp)
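trap 'rm -f "$TEMP_FILE"; echo "Cleaned up!"' EXIT
# Single quotes delay expansion until the trap fires; the inner quotes protect paths with spaces
# EXIT is a pseudo-signal that fires when the script exits, whether normally or through exit
# (it cannot run if the script is killed with SIGKILL)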
 
# Handle Ctrl+C gracefully
trap "echo 'Interrupted! Exiting...'; exit 1" INT
# INT = SIGINT (Ctrl+C)
 
# Handle termination signal (assumes a cleanup function is defined earlier in the script)
trap "echo 'Terminated! Cleaning up...'; cleanup; exit 1" TERM
 
# Ignore a signal (process won't be affected by it)
trap '' HUP              # Ignore SIGHUP
trap '' INT              # Ignore Ctrl+C (dangerous!)
 
# Reset trap to default behavior
trap - INT               # Ctrl+C works normally again
 
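# One cleanup handler for every exit path: run it on EXIT and turn
# INT/TERM into a normal exit (a handler that does not call exit lets
# the script resume where it was interrupted)
trap cleanup EXIT
trap 'exit 1' INT TERM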
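To see an EXIT trap fire, run a short script that creates a scratch file and exits with a non-zero status. In this sketch (bash assumed; job.sh and path.txt are names invented for the demo) the caller checks afterwards that the scratch file is gone and that the exit status passed through unchanged.

```shell
#!/bin/bash
demo=$(mktemp -d)

cat > "$demo/job.sh" <<'EOF'
#!/bin/bash
scratch=$(mktemp)
trap 'rm -f "$scratch"' EXIT      # runs on every exit path of this script
echo "$scratch" > "$1"            # tell the caller where the scratch file was
exit 3                            # a non-zero exit still triggers the trap
EOF

bash "$demo/job.sh" "$demo/path.txt"
status=$?
scratch_path=$(cat "$demo/path.txt")

if [ -e "$scratch_path" ]; then
    echo "scratch file leaked"
else
    echo "scratch file removed, exit status $status"
fi
```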
 

Practical trap example:

#!/bin/bash
# A robust script with cleanup

LOCKFILE="/var/run/myapp.lock"
LOGFILE="/var/log/myapp.log"

# Prevent multiple instances. Check BEFORE installing the trap, so a
# second instance never deletes the first instance's lock file on exit.
if [[ -f "$LOCKFILE" ]]; then
    echo "Another instance is already running (PID: $(cat "$LOCKFILE"))"
    exit 1
fi
echo $$ > "$LOCKFILE"
TEMP_DIR=$(mktemp -d)

# Cleanup function
cleanup() {
    echo "[$(date)] Cleaning up..." >> "$LOGFILE"
    rm -f "$LOCKFILE"
    rm -rf "$TEMP_DIR"
    echo "Cleanup complete"
}

# Set traps: cleanup runs once, on EXIT. INT and TERM simply exit,
# which fires the EXIT trap instead of letting the loop continue.
trap cleanup EXIT
trap 'exit 130' INT
trap 'exit 143' TERM

# Main script work
echo "[$(date)] Started (PID: $$)" >> "$LOGFILE"

for i in {1..10}; do
    echo "Processing step $i/10..."
    # Simulate work
    echo "Data $i" > "$TEMP_DIR/file$i.txt"
    sleep 1
done

echo "Done!"
# cleanup() runs automatically due to EXIT trap
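The check-then-create lock in the script above leaves a small window in which two instances could both pass the test. One commonly used way to narrow it, shown here as a sketch rather than a definitive fix, is the noclobber shell option, which makes the redirection itself fail if the file already exists. The acquire_lock helper and the temporary lock path are assumptions for the demo.

```shell
#!/bin/bash
lockdir=$(mktemp -d)                 # demo directory; a real script might use /var/run
LOCKFILE="$lockdir/myapp.lock"       # hypothetical lock path

acquire_lock() {
    # With noclobber set, '>' refuses to overwrite an existing file, so the
    # existence check and the creation happen in a single step.
    ( set -o noclobber; echo "$$" > "$1" ) 2>/dev/null
}

acquire_lock "$LOCKFILE"; first=$?
acquire_lock "$LOCKFILE"; second=$?

[ "$first" -eq 0 ]  && echo "First attempt: lock acquired by PID $(cat "$LOCKFILE")"
[ "$second" -ne 0 ] && echo "Second attempt: refused, lock already held"

rm -rf "$lockdir"
```

The subshell keeps noclobber from leaking into the rest of the script.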
 

Written by the ShellRAG Team

The ShellRAG editorial team writes practical, beginner-friendly Bash Shell tutorials with tested code examples and real-world use cases. Every article is technically reviewed for accuracy and updated regularly.
