Summary: in this tutorial, you will learn how Linux processes work, how to view and manage running processes, how to control jobs, and how to send signals.

Process Management

Every command you run—from ls to launching a web server—creates a process: an instance of a running program. Mastering process management is fundamental to system administration, debugging, and automation.

Why process management matters:

System stability: Identify and stop misbehaving programs
Performance tuning: Find resource hogs and optimize
Debugging: Understand what's running and why
Automation: Schedule tasks and manage long-running services
Job control: Run multiple tasks simultaneously

This tutorial covers viewing processes, job control, signals, system monitoring, and scheduled tasks with cron.

Understanding Processes

A process is a running instance of a program with its own:

PID (Process ID): Unique number identifying this process
PPID (Parent PID): The process that started this one
Owner: User who started it (determines permissions)
State: Running, sleeping, stopped, zombie
Priority: How much CPU time it gets
Memory: Address space with code, data, stack, heap
File descriptors: Open files, network connections, pipes

Process hierarchy:

Every process (except PID 1) has a parent. This creates a tree:

systemd (PID 1) — the init process, parent of everything
├── bash (PID 1234) — your shell
│   ├── ls (PID 1235) — command you ran
│   └── grep (PID 1236) — pipeline command
├── sshd (PID 567) — SSH server
│   └── sshd (PID 1240) — your SSH session
│       └── bash (PID 1241) — your login shell
└── nginx (PID 890) — web server
    ├── nginx (PID 891) — worker process
    └── nginx (PID 892) — worker process

When a parent process dies, its children are orphaned and re-parented, normally to PID 1 (systemd or init).
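
To make the parent/child chain concrete, here is a minimal sketch that walks from a process up through its ancestors. It assumes a Linux system with /proc mounted (the same source ps reads from); the function names parent_of and lineage are made up for this illustration.

```shell
#!/usr/bin/env bash
# Print the parent PID of the given PID, read from /proc/<pid>/status.
parent_of() {
    local key value
    while read -r key value; do
        if [ "$key" = "PPid:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Walk from a PID (default: the current shell) up through its ancestors,
# printing one PID per line. Stops at PID 1, or when the parent is reported
# as 0 (no visible parent, as can happen inside a container).
lineage() {
    local pid=${1:-$$}
    while [ -n "$pid" ] && [ "$pid" -gt 0 ]; do
        echo "$pid"
        if [ "$pid" -eq 1 ]; then
            break
        fi
        pid=$(parent_of "$pid") || break
    done
}
```

Running lineage with no arguments prints the shell's own PID first, then its parent, and so on up the tree. The result should agree with what ps -o pid,ppid reports for the same processes.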

Why understanding processes matters:

Debugging: "Why is my script hanging?" → Check child processes
Resource usage: "Why is the server slow?" → Find CPU/memory hogs
Security: "What's running on my system?" → Audit processes
Cleanup: Ensure background jobs don't outlive their purpose

Viewing Processes

ps — Process Snapshot

ps shows a snapshot of current processes (not real-time):

# Your processes only (minimal info)
ps
# Output:
#   PID TTY          TIME CMD
#  1234 pts/0    00:00:00 bash
#  1235 pts/0    00:00:00 ps
 
# All processes, BSD-style (most common on Linux)
ps aux
# 'a' = all users, 'u' = user-oriented format, 'x' = include processes without TTY
 
# All processes, UNIX-style
ps -ef
# '-e' = all processes, '-f' = full format
 
# Why two styles? Historical reasons (BSD vs. System V Unix)
# Use whichever you prefer; ps aux is more common on Linux

Understanding ps aux Output


USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1 169356 13296 ?        Ss   Jan10   0:05 /sbin/init
alice     1234  2.5  1.2 345678 98765 pts/0    S+   10:30   0:15 python app.py
bob       5678  0.1  0.5 123456 45678 ?        Ss   09:00   0:02 /usr/sbin/sshd

Column	Description	Example
USER	Process owner	alice
PID	Process ID	1234
%CPU	CPU usage percentage	2.5%
%MEM	Memory usage percentage	1.2%
VSZ	Virtual memory size (KB) — total allocated	345678 KB
RSS	Resident Set Size (KB) — actual memory in RAM	98765 KB
TTY	Terminal (`?` = no terminal/daemon)	pts/0
STAT	Process state (see below)	S+
START	When the process started	10:30
TIME	Total CPU time consumed	0:15
COMMAND	The command that started the process	python app.py

VSZ vs RSS:

VSZ (Virtual Size): Total memory allocated (includes shared libraries, mapped files)
RSS (Resident Set Size): Actual physical RAM in use
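RSS is usually the more useful number: it reflects physical RAM actually in use. Shared libraries are counted in every process that maps them, so adding up RSS across processes overstates total usage.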
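
The difference between the two numbers is easy to see for a single process. The sketch below reads VmSize and VmRSS from /proc/<pid>/status, which should roughly match the VSZ and RSS columns of ps. It assumes Linux with /proc mounted, and mem_of is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print "VSZ_KB RSS_KB" for a PID (default: the current shell),
# taken from the VmSize and VmRSS fields of /proc/<pid>/status.
mem_of() {
    local key value unit vsz="" rss=""
    while read -r key value unit; do
        case $key in
            VmSize:) vsz=$value ;;
            VmRSS:)  rss=$value ;;
        esac
    done < "/proc/${1:-$$}/status"
    echo "$vsz $rss"
}

# Usage:
#   mem_of          # the current shell
#   mem_of 1234     # some other PID
```

For an ordinary user process the first number (virtual size) is at least as large as the second (resident size), often by a wide margin. Kernel threads have no such fields, so the output is empty for them.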

Process States (STAT Column)

State	Meaning	When It Happens
R	Running or runnable	Actively using CPU or waiting for CPU time
S	Sleeping (interruptible)	Waiting for input/event (can be woken by signal)
D	Sleeping (uninterruptible)	Usually waiting for I/O (can't be interrupted)
T	Stopped	Paused by signal (Ctrl+Z) or debugger
Z	Zombie	Finished but parent hasn't read its exit status yet
<	High priority	Nice value < 0 (gets more CPU)
N	Low priority	Nice value > 0 (gets less CPU)
s	Session leader	Leader of a session (e.g., a login shell or a daemon)
l	Multi-threaded	Has multiple threads
+	Foreground process group	In the foreground of its terminal

Common combinations:

S+ = Sleeping, in foreground (e.g., an editor or pager waiting for input)
Ss = Sleeping, session leader (typical daemon)
R+ = Running in foreground
Z = Zombie (usually harmless, but many zombies indicate a bug)
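
State changes can be observed directly. The following sketch starts a sleep process, stops and resumes it, and prints its state letter after each step. It assumes Linux with /proc mounted; state_of and watch_states are hypothetical names, and the short pauses are only there to give the kernel time to apply each signal.

```shell
#!/usr/bin/env bash
# Print the one-letter state (R, S, D, T, Z, ...) of a PID,
# taken from the State field of /proc/<pid>/status.
state_of() {
    local key value rest
    while read -r key value rest; do
        if [ "$key" = "State:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Start a sleeper, then report its state as it is stopped and resumed.
watch_states() {
    local pid
    sleep 30 &
    pid=$!
    sleep 0.2; echo "after start:   $(state_of "$pid")"
    kill -STOP "$pid"
    sleep 0.2; echo "after SIGSTOP: $(state_of "$pid")"
    kill -CONT "$pid"
    sleep 0.2; echo "after SIGCONT: $(state_of "$pid")"
    kill -TERM "$pid"
    wait "$pid" 2>/dev/null || true   # reap the child so no zombie is left
}
```

On an idle system watch_states typically prints S, then T, then S again: the process sleeps, is stopped, and goes back to sleeping once it is continued.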

Sorting and Filtering ps Output

# Sort by memory usage (top 10)
ps aux --sort=-%mem | head -11
# --sort=-%mem means sort by memory, descending (- = reverse)
 
# Sort by CPU usage
ps aux --sort=-%cpu | head -11
 
# Sort by process start time
ps aux --sort=start_time
 
# Multiple sorts (CPU, then memory)
ps aux --sort=-%cpu,-%mem | head -11
 
# Custom columns
ps -eo pid,ppid,user,%cpu,%mem,cmd --sort=-%cpu | head -15
# -e = all processes
# -o = output format (specify columns)
# Available columns: pid, ppid, user, %cpu, %mem, vsz, rss, tty, stat, start, time, cmd, and many more
 
# Find specific processes
ps aux | grep nginx
ps -C nginx                      # By command name
ps -u alice                      # By user
ps -p 1234,5678                  # By PID
 
# Process tree (parent-child relationships)
ps auxf                          # 'f' = forest (ASCII tree)
ps -ejH                          # UNIX-style tree

pstree — Visualize Process Hierarchy

# Show process tree
pstree
 
# With PIDs
pstree -p
 
# For a specific user
pstree alice
 
# For a specific process and its descendants
pstree -p 1234
 
# Example output:
# systemd─┬─ModemManager───2*[{ModemManager}]
#         ├─NetworkManager───2*[{NetworkManager}]
#         ├─accounts-daemon───2*[{accounts-daemon}]
#         ├─bash───pstree
#         ├─nginx───nginx
#         └─sshd───sshd───bash───python

top — Real-Time Process Monitor

top continuously updates to show process activity:

top
 
# Key information in top:
# - Load average (1, 5, 15 minute averages)
# - Tasks: running, sleeping, stopped, zombie
# - %CPU: user processes, system, idle, I/O wait
# - Memory: total, used, free, buffers, cache
# - Process list: sorted by CPU usage by default

top interactive commands:

Key	Action	Why Use It
`q`	Quit	Exit top
`M`	Sort by memory	Find memory hogs
`P`	Sort by CPU	Find CPU hogs (default)
`T`	Sort by CPU time	Find processes that have consumed the most CPU time
`k`	Kill a process	Terminate misbehaving process
`r`	Renice	Change process priority
`1`	Toggle per-CPU view	See load on each core
`c`	Toggle full command	See complete command line
`u`	Filter by user	Show only one user's processes
`f`	Choose display fields	Customize what columns show
`E`	Change memory units	Switch between KB, MB, GB
`h` or `?`	Help	Show all commands

top header explained:


top - 10:30:15 up 15 days,  3:45,  2 users,  load average: 0.52, 0.45, 0.38
Tasks: 287 total,   1 running, 286 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.2 us,  1.5 sy,  0.0 ni, 94.8 id,  0.5 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  16384.0 total,   3218.5 free,   5342.2 used,   7823.3 buff/cache
MiB Swap:   2048.0 total,   2048.0 free,      0.0 used.  10127.8 avail Mem

Load average: 0.52, 0.45, 0.38 = 1-min, 5-min, 15-min averages
- A load that stays above the number of CPU cores means tasks are queuing for CPU (on Linux, tasks blocked in uninterruptible I/O also count)
- Check cores: nproc
CPU breakdown:
- us (user): CPU time in user processes
- sy (system): CPU time in kernel
- id (idle): CPU doing nothing
- wa (wait): Waiting for I/O (high = disk bottleneck)
Memory: avail Mem is the most useful figure. It estimates how much memory new programs can use without swapping, and it includes reclaimable buff/cache.
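
The "load versus cores" rule of thumb can be turned into a quick check. This is a rough sketch, assuming Linux (/proc/loadavg) and the nproc and awk commands; load_per_core is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the 1-minute load average divided by the number of available CPU cores.
load_per_core() {
    local load1 rest cores
    read -r load1 rest < /proc/loadavg   # first field is the 1-minute average
    cores=$(nproc)
    # awk does the floating-point division; the shell only has integer math
    awk -v load="$load1" -v cores="$cores" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Usage:
#   load_per_core
```

A value that stays above 1.00 suggests there is more work queued than the cores can handle. A brief spike above 1.00 is usually nothing to worry about.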

htop — Improved top

htop is a more user-friendly alternative to top:

# Install htop
sudo apt install htop    # Debian/Ubuntu
sudo yum install htop    # CentOS/RHEL (may require the EPEL repository)
sudo dnf install htop    # Fedora
 
# Run htop
htop

htop advantages:

Color-coded bars for CPU, memory, swap
Mouse support (click to sort, select processes)
Tree view (toggle with F5)
Easier to kill or renice processes (F9 to kill, F7/F8 to change niceness)
Search processes (F3)
Filter the process list (F4); show only one user's processes with u

pgrep and pkill — Find and Signal by Name

# Find PIDs by name
pgrep nginx
# Output (one PID per line):
# 890
# 891
# 892
 
# Show PID and name
pgrep -l nginx
# Output: 890 nginx
#         891 nginx
#         892 nginx
 
# Show PID and full command
pgrep -a nginx
# Output: 890 nginx: master process /usr/sbin/nginx
#         891 nginx: worker process
 
# Find by user
pgrep -u alice
pgrep -u alice python    # Alice's python processes
 
# Count matching processes
pgrep -c nginx
# Output: 3
 
# Most recent match
pgrep -n nginx
 
# Oldest match
pgrep -o nginx
 
# Exact match only
pgrep -x bash    # Matches "bash" but not "bash-script"
 
# Kill by name (careful!)
pkill nginx
pkill -u alice python    # Kill alice's python processes
pkill -f "python app.py" # Kill by full command line

Job Control

Job control lets you run multiple processes from one shell, switching between foreground and background.

Foreground vs Background

Foreground: Process takes over your terminal; you can't run other commands until it finishes
Background: Process runs independently; you can continue working

# Run in foreground (default)
sleep 60
# Terminal blocked for 60 seconds
 
# Run in background (add & at the end)
sleep 60 &
# [1] 12345              ← Job number 1, PID 12345
# Terminal immediately available
 
# Multiple background jobs
sleep 30 &    # Job 1
sleep 40 &    # Job 2
sleep 50 &    # Job 3

Managing Jobs

# List jobs
jobs
# [1]   Running                 sleep 30 &
# [2]-  Running                 sleep 40 &
# [3]+  Running                 sleep 50 &
# + = most recent job, - = second most recent
 
# List with PIDs
jobs -l
# [1]  12345 Running             sleep 30 &
# [2]  12346 Running             sleep 40 &
 
# Bring a background job to foreground
fg %1                    # By job number
fg                       # Most recent job (the + one)
 
# Suspend a foreground process (pause it)
# Run: sleep 60
# Press: Ctrl+Z
# Output: [1]+  Stopped     sleep 60
 
# Resume a stopped job in the background
bg %1
# [1]+ sleep 60 &
 
# Resume in foreground
fg %1

Practical workflow:

# Start editing a file
vim large_file.txt
 
# Realize you need to run a command
# Press Ctrl+Z
# [1]+  Stopped                 vim large_file.txt
 
# Run your command
ls -l
 
# Return to editing
fg

Long-Running Background Jobs

# Run a command, then decide to background it
find / -name "*.log" 2>/dev/null    # Oops, this is slow!
# Press Ctrl+Z to suspend
# [1]+  Stopped     find / -name "*.log" 2>/dev/null
bg
# Now it runs in background while you continue working
 
# Problem: if you log out or close the terminal, background jobs may be killed by SIGHUP!
# Solution: nohup or disown

nohup — Survive Logout

When you close your terminal, the shell sends SIGHUP (hangup) to its jobs, and by default that signal terminates them. nohup (no hangup) starts a command with SIGHUP ignored:

# Run a command that survives logout
nohup ./long_running_script.sh &
# Output appended to nohup.out by default
 
# Redirect output to a specific file
nohup ./script.sh > output.log 2>&1 &
# > output.log = redirect stdout
# 2>&1 = redirect stderr to same file as stdout
# & = run in background
 
# Check if it's running after logout/login
ps aux | grep script.sh

disown — Detach from Shell

disown removes a job from the shell's job table, so the shell no longer sends it SIGHUP when it exits:

# Start a job
./long_script.sh &
# [1] 12345
 
# Disown it (shell forgets about it)
disown %1
 
# Or disown most recent job
disown
 
# Now you can close the terminal without killing the process
 
# Verify it's still running
ps -p 12345

nohup vs disown:

nohup: Run from the start with hangup immunity
disown: Remove an already-running job from shell control
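
The effect of nohup can be checked without logging out by sending SIGHUP by hand. The sketch below is illustrative only: survives_hup is a hypothetical helper that starts a command in the background, sends it SIGHUP, and reports whether it is still alive.

```shell
#!/usr/bin/env bash
# Start the given command in the background, send it SIGHUP, and report
# whether it survived. Example: survives_hup nohup sleep 5
survives_hup() {
    local pid
    "$@" > /dev/null 2>&1 &
    pid=$!
    sleep 0.3                        # give the command time to start
    kill -HUP "$pid" 2>/dev/null
    sleep 0.3                        # give the signal time to take effect
    if kill -0 "$pid" 2>/dev/null; then
        echo "survived"
        kill -KILL "$pid"            # clean up the test process
    else
        echo "terminated"
    fi
    wait "$pid" 2>/dev/null || true  # reap the child
}

# Usage:
#   survives_hup nohup sleep 5    # prints "survived"
#   survives_hup sleep 5          # normally reports "terminated"
```

The plain sleep case reports "terminated" unless SIGHUP is already being ignored in the calling environment, because ignored signals are inherited by child processes.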

Signals

Signals are messages sent to processes to control behavior. Think of them as inter-process "push notifications."

Why signals matter:

Terminate processes: Stop misbehaving programs
Reload configuration: Tell services to re-read config files without restarting
Debugging: Trigger core dumps for analysis
Process control: Pause, resume, or notify processes

Common Signals

Signal	Number	Shortcut	Default Action	Description	Use Case
SIGHUP	1	-	Terminate	Hangup (terminal closed)	Reload config
SIGINT	2	Ctrl+C	Terminate	Interrupt	Stop a command
SIGQUIT	3	Ctrl+\	Core dump	Quit with core dump	Debug crash
SIGKILL	9	-	Terminate	Kill (cannot be caught!)	Force kill
SIGTERM	15	-	Terminate	Polite termination request	Graceful shutdown
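SIGSTOP	19	-	Stop	Pause process (cannot be caught!)	Freeze a process from outside
SIGTSTP	20	Ctrl+Z	Stop	Terminal stop (can be caught)	Suspend a foreground job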
SIGCONT	18	-	Continue	Resume a stopped process	Resume after Ctrl+Z
SIGUSR1	10	-	Terminate	User-defined signal 1	Custom use
SIGUSR2	12	-	Terminate	User-defined signal 2	Custom use
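
Signal numbers like the ones in this table vary between CPU architectures, so it is safer to ask the shell than to memorize them. In bash, kill -l NAME prints a number and kill -l NUMBER prints a name. It also accepts an exit status, which is handy because a command killed by a signal exits with 128 plus the signal number. The sketch below relies on that bash behavior; describe_status is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Turn an exit status into a short description. A status above 128 usually
# means "killed by signal (status - 128)"; bash's kill -l can name it.
describe_status() {
    local status=$1 name
    if [ "$status" -gt 128 ] && name=$(kill -l "$status" 2>/dev/null); then
        echo "killed by SIG$name"
    else
        echo "exit status $status"
    fi
}

# Usage:
#   kill -l TERM            # prints 15
#   kill -l 9               # prints KILL
#   describe_status 143     # prints "killed by SIGTERM"
#   describe_status 1       # prints "exit status 1"
```

A program can also exit with a status above 128 on its own, so treat the result as a strong hint and not proof.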

Key differences:

Aspect	SIGTERM (15)	SIGKILL (9)
Can be caught	Yes, process can handle it	No, instant death
Cleanup	Process can close files, save data	No cleanup
Speed	Takes time for graceful shutdown	Instant
Data safety	Safe	Risk of corruption
Use when	First attempt to stop	Last resort
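
The "can be caught" row is the key difference, and it can be demonstrated with a throwaway child process. In this sketch, signal_demo is a hypothetical helper: it starts a child shell that installs a SIGTERM handler, sends it the signal named in the first argument, and reports the child's exit status along with anything the handler printed.

```shell
#!/usr/bin/env bash
# Start a child shell that traps SIGTERM, send it the given signal
# (TERM or KILL), and report its exit status and its handler's output.
signal_demo() {
    local sig=$1 out pid status
    out=$(mktemp) || return 1
    bash -c 'trap "echo cleaned-up; exit 0" TERM
             while :; do sleep 0.1; done' > "$out" &
    pid=$!
    sleep 0.5                        # let the child install its trap
    kill "-$sig" "$pid"
    status=0
    wait "$pid" 2>/dev/null || status=$?
    echo "signal=$sig status=$status handler_output=$(cat "$out")"
    rm -f "$out"
}

# Usage:
#   signal_demo TERM   # signal=TERM status=0 handler_output=cleaned-up
#   signal_demo KILL   # signal=KILL status=137 handler_output=
```

With TERM the handler runs and the child chooses its own exit status. With KILL the handler never runs, and the status of 137 (128 + 9) shows the child was killed by the signal.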

kill — Send Signals

# Polite termination (SIGTERM = 15, default)
kill 12345
kill -15 12345            # Explicit SIGTERM
kill -TERM 12345          # By name
 
# How SIGTERM works:
# 1. The kernel delivers SIGTERM to PID 12345
# 2. If the process installed a handler, its cleanup code runs (close files, flush buffers, save state)
# 3. The process exits; with no handler, the default action terminates it immediately
 
# Force kill (SIGKILL = 9, last resort!)
kill -9 12345
kill -SIGKILL 12345
 
# When to use kill -9:
# - Process doesn't respond to SIGTERM
# - Process is frozen/hung
# Note: a process in uninterruptible sleep (state D) will not die until its I/O completes, even with kill -9
# WARNING: No cleanup happens! Files may be corrupted!
 
# Reload configuration (common with daemons)
kill -HUP 12345
# Example: nginx, sshd reload config without disconnecting users
 
# Pause a process
kill -STOP 12345
# Resume it
kill -CONT 12345
 
# Send user-defined signals
kill -USR1 12345    # App-specific behavior
kill -USR2 12345    # App-specific behavior
 
# List all signals
kill -l

⚠️ Always try SIGTERM before SIGKILL

Best practice for terminating processes:

Try kill PID (SIGTERM) first
Wait 5-10 seconds
Check if it's still running: ps -p PID
Only if still alive, use kill -9 PID

SIGTERM allows cleanup (closing files, releasing locks, saving state). SIGKILL is immediate termination with no cleanup—risk of corrupted data or orphaned resources.

# Safe kill script
safe_kill() {
    local pid=$1 i
    echo "Sending SIGTERM to $pid..."
    kill "$pid" 2>/dev/null || return 1
 
    # Poll once per second, for up to 10 seconds, until the PID is gone
    for i in {1..10}; do
        sleep 1
        if ! ps -p "$pid" > /dev/null 2>&1; then
            echo "Process $pid terminated gracefully"
            return 0
        fi
    done
 
    echo "Process $pid didn't respond, forcing kill..."
    kill -9 "$pid"
}

killall and pkill

# Kill all processes with a specific name
killall nginx
killall -TERM nginx      # Explicit SIGTERM
killall -9 nginx         # Force kill all
 
# Kill all processes owned by a user
killall -u alice
 
# Interactive confirmation
killall -i nginx         # Prompts before each kill
 
# pkill — kill by pattern matching
pkill python             # All python processes
pkill -f "app.py"        # Match full command line
pkill -u alice python    # Alice's python processes
pkill -x python3         # Exact name match only
 
# pkill with signals
pkill -HUP nginx         # Reload nginx
pkill -TERM python       # Gracefully stop all python
 
# Caution: killall and pkill are dangerous!
# Always test with pgrep first:
pgrep -a nginx           # See what would be affected
# Then kill:
pkill nginx

Trapping Signals in Scripts

Scripts can catch signals and run custom cleanup code:

#!/bin/bash
# trap 'commands' SIGNALS
 
# Clean up temp files when script exits
TEMP_FILE=$(mktemp)
trap 'rm -f "$TEMP_FILE"; echo "Cleaned up!"' EXIT
# Single quotes delay expansion of $TEMP_FILE until the trap runs
# EXIT is a pseudo-signal that triggers whenever the script exits (but not if it is killed by SIGKILL)
 
# Handle Ctrl+C gracefully
trap "echo 'Interrupted! Exiting...'; exit 1" INT
# INT = SIGINT (Ctrl+C)
 
# Handle termination signal (assumes a cleanup function is defined earlier in the script)
trap "echo 'Terminated! Cleaning up...'; cleanup; exit 1" TERM
 
# Ignore a signal (process won't be affected by it)
trap '' HUP              # Ignore SIGHUP
trap '' INT              # Ignore Ctrl+C (dangerous!)
 
# Reset trap to default behavior
trap - INT               # Ctrl+C works normally again
 
# Multiple signals, one handler
# Note: unless the handler calls exit, the script keeps running after INT or TERM
trap cleanup EXIT INT TERM
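
One detail worth seeing in isolation is when the trap's variables are expanded. With single quotes, the variable is read when the trap fires, not when it is set. The sketch below wraps a command in a subshell that owns a scratch file and removes it on exit, whether the command succeeds or fails. with_temp_file is a hypothetical helper name; it prints the scratch file's path and passes that path to the command as its last argument.

```shell
#!/usr/bin/env bash
# Run a command with a scratch file that is removed when the subshell exits.
# The scratch file's path is printed first and is also appended to the
# command's arguments.
with_temp_file() {
    (
        tmp=$(mktemp) || exit 1
        trap 'rm -f "$tmp"' EXIT   # single quotes: $tmp expands when the trap fires
        echo "$tmp"
        "$@" "$tmp"
        exit $?                    # pass the command's status through; the EXIT trap still runs
    )
}

# Usage:
#   with_temp_file wc -c     # prints the temp path, then wc's output for it
#   with_temp_file false     # the command fails, the file is still removed
```

Because the trap is set inside the subshell, it fires when that subshell ends and leaves any traps in the calling script untouched.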

Practical trap example:

#!/bin/bash
# A robust script with cleanup
 
LOCKFILE="/var/run/myapp.lock"
LOGFILE="/var/log/myapp.log"
TEMP_DIR=$(mktemp -d)
 
# Cleanup function
cleanup() {
    echo "[$(date)] Cleaning up..." >> "$LOGFILE"
    rm -f "$LOCKFILE"
    rm -rf "$TEMP_DIR"
    echo "Cleanup complete"
}
 
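# Prevent multiple instances. Check BEFORE installing the trap, so a second
# instance cannot delete the first instance's lock file on its way out.
if [[ -f "$LOCKFILE" ]]; then
    echo "Another instance is already running (PID: $(cat "$LOCKFILE"))"
    rm -rf "$TEMP_DIR"
    exit 1
fi
echo $$ > "$LOCKFILE"
 
# Set traps: cleanup runs once on any exit; INT and TERM just trigger that exit
trap cleanup EXIT
trap 'exit 130' INT
trap 'exit 143' TERM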
 
# Main script work
echo "[$(date)] Started (PID: $$)" >> "$LOGFILE"
 
for i in {1..10}; do
    echo "Processing step $i/10..."
    # Simulate work
    echo "Data $i" > "$TEMP_DIR/file$i.txt"
    sleep 1
done
 
echo "Done!"
# cleanup() runs automatically due to EXIT trap

Written by the ShellRAG Team