Regular Expressions

Summary: In this tutorial, you will learn pattern matching in PowerShell: regex syntax, the -match operator, Select-String, the [regex] class, named captures, and practical text-processing patterns.

Regular expressions (regex) are a powerful pattern-matching language for text processing. In PowerShell, regex integrates seamlessly with operators and cmdlets, enabling sophisticated text search, validation, and transformation.

This chapter teaches you regex from fundamentals to advanced patterns, focusing on PowerShell-specific features like the -match operator, Select-String cmdlet, and the [regex] .NET class.

Why Learn Regular Expressions

Text processing is everywhere:

  • Log analysis: Extract error codes, timestamps, IP addresses
  • Data validation: Verify email formats, phone numbers, credit cards
  • Text transformation: Replace patterns, extract data
  • File searching: Find files containing specific patterns

Without regex, you'd write brittle code with IndexOf, Substring, and fragile string manipulation. Regex provides a declarative language for pattern matching.
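
To make the contrast concrete, here is a minimal sketch that pulls a numeric code out of a line both ways. The sample line and the variable names are made up for illustration.

```powershell
# Hypothetical sample line, used only for illustration
$line = 'status=failed code=404 host=web01'

# Manual approach: locate the marker, then find the end of the value by hand
$start = $line.IndexOf('code=') + 'code='.Length
$end   = $line.IndexOf(' ', $start)
if ($end -lt 0) { $end = $line.Length }      # the value may be the last field
$manualCode = $line.Substring($start, $end - $start)

# Regex approach: describe the shape of the data and let the engine find it
$regexCode = if ($line -match 'code=(\d+)') { $Matches[1] }

$manualCode    # 404
$regexCode     # 404
```

Both approaches return the same value here, but the manual version needs extra bookkeeping for every edge case (missing marker, value at the end of the line), while the pattern states the intent in one place.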

Regex Basics: Literal Matching

The simplest regex matches literal characters:

# Does the string contain "error"?
"Error occurred" -match "error"       # True (case-insensitive by default)
"Success" -match "error"              # False
 
# Case-sensitive matching
"Error occurred" -cmatch "error"      # False (lowercase "e" doesn't match "E")
"Error occurred" -cmatch "Error"      # True
 
# Case-insensitive (explicit)
"Error occurred" -imatch "ERROR"      # True
 

PowerShell's -match operator:

  • Returns True if the pattern matches anywhere in the string
  • Case-insensitive by default (unlike many languages)
  • Use -cmatch for case-sensitive matching
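
The points above describe a single string on the left-hand side. When the left operand is a collection, -match acts as a filter and returns the matching elements instead of a Boolean, and $Matches is not populated. A small sketch with made-up sample data:

```powershell
# Hypothetical sample lines
$lines = 'Error: disk full', 'All good', 'ERROR: timeout'

# With an array on the left, -match returns the elements that match
$hits = $lines -match 'error'

# -cmatch filters the same way, but case-sensitively
$exact = $lines -cmatch 'ERROR'

$hits.Count     # 2
$exact          # ERROR: timeout
```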

Capturing Matched Text

When -match succeeds, PowerShell populates the automatic variable $Matches:

"Server: 192.168.1.100" -match "(\d+\.\d+\.\d+\.\d+)"
$Matches[0]    # Full match: "192.168.1.100"
$Matches[1]    # First capture group: "192.168.1.100"
 
# Capture just part of the match
"Error code: 404" -match "Error code: (\d+)"
$Matches[1]    # "404"
 
# Named captures
"User: alice" -match "User: (?<username>\w+)"
$Matches.username    # "alice"
$Matches['username'] # Same thing
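
One caveat is worth sketching: $Matches is only updated when a match succeeds, so after a failed match it still holds the result of the previous successful one. The sample strings below are illustrative.

```powershell
# A successful match populates $Matches
$null = 'Error code: 404' -match 'Error code: (\d+)'
$first = $Matches[1]

# A failed match returns False and leaves $Matches untouched
$ok = 'No code here' -match 'Error code: (\d+)'
$stale = $Matches[1]     # still holds the value from the earlier match

# Safer: read $Matches only inside a successful test
$code = if ('No code here' -match 'Error code: (\d+)') { $Matches[1] } else { $null }

$ok       # False
$stale    # 404
```

Wrapping the match in an if statement, as in the last assignment, keeps stale captures from leaking into later logic.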
 

Character Classes: Matching Sets

Character classes match one character from a set:

# Basic classes
"cat" -match "[abc]"       # True (contains 'c' or 'a')
"dog" -match "[abc]"       # False
 
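# Ranges (-match ignores case, so [a-z] and [A-Z] behave the same; use -cmatch to tell them apart)
"file1.txt" -match "[0-9]"           # True (contains digit)
"fileA.txt" -match "[a-z]"           # True (contains a letter)
"FILE1" -match "[A-Z]"               # True (contains a letter)
"FILE1" -cmatch "[a-z]"              # False (case-sensitive: no lowercase letter)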
 
# Negated classes (NOT)
"file1.txt" -match "[^0-9]"          # True (contains non-digit)
"12345" -match "[^0-9]"              # False (only digits)
 
# Predefined classes
"file_123" -match "\d"      # \d = digit
"file_123" -match "\D"      # \D = non-digit
"hello world" -match "\s"   # \s = whitespace (space, tab, newline)
"hello" -match "\S"         # \S = non-whitespace
"var_name" -match "\w"      # \w = word character (letters, digits, underscore; not limited to ASCII in .NET)
"@#$" -match "\W"           # \W = non-word character
 

Common character classes:

  • \d — Digit (0-9)
  • \w — Word character (letters, digits, underscore)
  • \s — Whitespace (space, tab, newline)
  • . — Any character (except newline)

Escaping special characters:

Many characters have special meaning in regex (. * + ? [ ] { } ( ) ^ $ | \). To match them literally, escape with \:

# Wrong: . matches ANY character
"test.txt" -match "test.txt"    # Matches "testAtxt" too!
 
# Correct: \. matches literal dot
"test.txt" -match "test\.txt"   # True
"testAtxt" -match "test\.txt"   # False
 
# Escape other special characters
# (single quotes stop PowerShell from expanding $100 as a variable)
'Price: $100' -match '\$\d+'    # \$ = literal dollar sign
"[ERROR]" -match "\[ERROR\]"    # \[ \] = literal brackets
 
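When the literal text comes from a variable or user input, escaping by hand is error-prone. A minimal sketch using the .NET [regex]::Escape method, which adds the backslashes for you; the file name is a made-up example.

```powershell
# Hypothetical literal text that happens to contain regex metacharacters
$literal = 'report(v1.0).txt'

# [regex]::Escape puts a backslash in front of each metacharacter
$escaped = [regex]::Escape($literal)
$escaped                                            # report\(v1\.0\)\.txt

$sameName  = 'report(v1.0).txt' -match $escaped     # True
$otherName = 'report(v1x0).txt' -match $escaped     # False (the dot is now literal)
```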

Quantifiers: Repetition

Quantifiers specify how many times a pattern should repeat:

# * = 0 or more
"file" -match "file\d*"         # Matches "file", "file1", "file123"
"" -match "a*"                  # True (0 'a' characters)
 
# + = 1 or more
"file" -match "file\d+"         # False (no digits)
"file1" -match "file\d+"        # True (at least one digit)
 
# ? = 0 or 1 (optional)
"color" -match "colou?r"        # Matches "color" or "colour"
"honour" -match "honou?r"       # True
 
# {n} = exactly n times
"file123" -match "\d{3}"        # True (exactly 3 digits)
"file12" -match "\d{3}"         # False
 
# {n,} = n or more times
"file123" -match "\d{2,}"       # True (at least 2 digits)
 
# {n,m} = between n and m times
"192.168.1.1" -match "\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"    # IP address pattern
 

Greedy vs Lazy Quantifiers

By default, quantifiers are greedy — they match as much as possible:

# Greedy: .* matches as much as possible
"<div>Hello</div>" -match "<.*>"
$Matches[0]    # "<div>Hello</div>" (entire string)
 
# Lazy: .*? matches as little as possible
"<div>Hello</div>" -match "<.*?>"
$Matches[0]    # "<div>" (stops at first >)
 

Make any quantifier lazy by adding ?:

  • *? — 0 or more (lazy)
  • +? — 1 or more (lazy)
  • ?? — 0 or 1 (lazy)
  • {n,}? — n or more (lazy)
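
A short sketch of the lazy forms listed above, plus a negated character class, which often expresses the same intent without relying on laziness. The input strings are illustrative.

```powershell
# Lazy versions stop as soon as the minimum requirement is met
$null = 'aaaa' -match 'a+?'
$lazyPlus = $Matches[0]        # a

$null = 'aaaa' -match 'a{2,}?'
$lazyRange = $Matches[0]       # aa

# Alternative: "anything except >" cannot run past the first closing bracket
$null = '<div>Hello</div>' -match '<[^>]+>'
$firstTag = $Matches[0]        # <div>
```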

Anchors: Position Matching

Anchors match positions, not characters:

# ^ = start of string
"hello" -match "^h"         # True (starts with 'h')
"ahello" -match "^h"        # False
 
# $ = end of string
"hello" -match "o$"         # True (ends with 'o')
"hello " -match "o$"        # False (ends with space)
 
# \b = word boundary
"hello world" -match "\bhello\b"    # True (whole word)
"helloworld" -match "\bhello\b"     # False (not a whole word)
 
# Full line matching
"192.168.1.1" -match "^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"    # True
"IP: 192.168.1.1" -match "^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"  # False (extra text)
 

When to use anchors:

  • ^ and $ — Validate entire string format (emails, phone numbers)
  • \b — Match whole words only (avoid partial matches)
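
The difference is easiest to see side by side. In this sketch the rule "exactly five digits" is a hypothetical validation requirement chosen for illustration.

```powershell
$sample = 'abc12345xyz'

# Unanchored: succeeds whenever five digits appear anywhere in the text
$looseHit  = $sample -match '\d{5}'          # True

# Anchored: the whole string must be five digits and nothing else
$strictHit = $sample -match '^\d{5}$'        # False
$exactHit  = '12345' -match '^\d{5}$'        # True
$tooLong   = '123456' -match '^\d{5}$'       # False

# \b keeps "cat" from matching inside a longer word
$partial   = 'concatenate' -match 'cat'      # True
$wholeWord = 'concatenate' -match '\bcat\b'  # False
```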

Groups and Capturing

Parentheses () create groups for capturing or applying quantifiers:

# Capture groups extract parts of the match
"Error code: 404" -match "Error code: (\d+)"
$Matches[1]    # "404"
 
# Multiple captures
"Date: 2026-02-11" -match "(\d{4})-(\d{2})-(\d{2})"
$Matches[1]    # "2026" (year)
$Matches[2]    # "02" (month)
$Matches[3]    # "11" (day)
 
# Named captures (much clearer!)
"Date: 2026-02-11" -match "(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})"
$Matches.year    # "2026"
$Matches.month   # "02"
$Matches.day     # "11"
 
# Non-capturing groups (?: ) — group without capturing
"file123.txt" -match "file(?:\d+)\.txt"    # Groups digits but doesn't capture
 
# Groups with quantifiers
"192.168.1.1" -match "^(\d{1,3}\.){3}\d{1,3}$"    # Match IP pattern (repeat "nnn." 3 times)
 

Named captures are powerful for extracting structured data:

$logLine = "[2026-02-11 14:32:05] ERROR: Connection timeout"
 
$pattern = '^\[(?<date>[\d-]+) (?<time>[\d:]+)\] (?<level>\w+): (?<message>.+)$'
 
if ($logLine -match $pattern) {
    [PSCustomObject]@{
        Date    = $Matches.date
        Time    = $Matches.time
        Level   = $Matches.level
        Message = $Matches.message
    }
}
 

Alternation: OR Logic

Use | to match one pattern or another:

# Match "cat" or "dog"
"I have a cat" -match "cat|dog"     # True
"I have a bird" -match "cat|dog"    # False
 
# Match file extensions
"document.pdf" -match "\.(pdf|docx|txt)$"    # True
"image.png" -match "\.(pdf|docx|txt)$"       # False
 
# Match multiple error levels
"[WARNING] Disk space low" -match "\[(ERROR|WARNING|INFO)\]"
$Matches[1]    # "WARNING"
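
Alternation has the lowest precedence of any regex operator, which matters once anchors are involved. A minimal sketch of the pitfall and the usual fix with a group (sample words are illustrative):

```powershell
# Without a group, ^ binds only to "cat" and $ binds only to "dog"
$loose = '^cat|dog$'
$looseCatalog = 'catalog' -match $loose      # True  (starts with "cat")
$looseHotdog  = 'hotdog'  -match $loose      # True  (ends with "dog")

# Grouping applies both anchors to every alternative
$strict = '^(?:cat|dog)$'
$strictCatalog = 'catalog' -match $strict    # False
$strictCat     = 'cat'     -match $strict    # True
```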
 

Common Practical Patterns

Email Validation

$email = "user@example.com"
$emailPattern = '^[\w\.-]+@[\w\.-]+\.\w{2,}$'
 
$email -match $emailPattern    # True
 
# Breakdown:
# ^            — Start of string
# [\w\.-]+     — Username (letters, digits, dot, dash)
# @            — Literal @
# [\w\.-]+     — Domain name
# \.           — Literal dot
# \w{2,}       — TLD (2+ letters)
# $            — End of string
 

Phone Numbers

# US format: (123) 456-7890 or 123-456-7890
$phone = "(555) 123-4567"
$phonePattern = '^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$'
 
$phone -match $phonePattern    # True
 
# Extract parts
$phone -match '^\(?(?<area>\d{3})\)?[-.\s]?(?<prefix>\d{3})[-.\s]?(?<line>\d{4})$'
$Matches.area      # "555"
$Matches.prefix    # "123"
$Matches.line      # "4567"
 

IP Addresses

$ip = "192.168.1.100"
$ipPattern = '^(\d{1,3}\.){3}\d{1,3}$'
 
$ip -match $ipPattern    # True
 
# More precise (0-255 range) - complex but accurate
$ipPrecise = '^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$'
 
"192.168.1.300" -match $ipPrecise    # False (300 > 255)
 
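An alternative to the long range-checking pattern is to let the regex check only the shape and do the 0-255 check with ordinary arithmetic. The following is a sketch under that assumption; Test-IPv4Shape is a hypothetical helper name, not a built-in cmdlet.

```powershell
# Hypothetical helper: the regex checks the shape, arithmetic checks the range
function Test-IPv4Shape {
    param([string]$Address)

    # Four groups of 1-3 digits separated by dots
    if (-not ($Address -match '^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$')) {
        return $false
    }

    # Each captured octet must fall in the range 0-255
    foreach ($i in 1..4) {
        if ([int]$Matches[$i] -gt 255) { return $false }
    }
    return $true
}

Test-IPv4Shape '192.168.1.100'    # True
Test-IPv4Shape '192.168.1.300'    # False
Test-IPv4Shape '192.168.1'        # False
```

This trades a single dense pattern for a few lines of code that are easier to read and adjust.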

URLs

$url = "https://www.example.com/path?query=value"
$urlPattern = '^https?://[\w\.-]+(?:/[\w\.-]*)*(?:\?[\w=&]*)?$'
 
$url -match $urlPattern    # True
 
# Extract parts (the path group wraps the whole repetition so it captures every segment, not just the last)
$url -match '^(?<protocol>https?)://(?<domain>[\w\.-]+)(?<path>(?:/[\w\.-]*)*)(?<query>\?[\w=&]*)?$'
$Matches.protocol    # "https"
$Matches.domain      # "www.example.com"
$Matches.path        # "/path"
$Matches.query       # "?query=value"
 

Select-String: PowerShell's Grep

Select-String searches files and text for regex patterns:

# Search file for pattern
Select-String -Path "app.log" -Pattern "ERROR"
 
# Case-sensitive
Select-String -Path "app.log" -Pattern "ERROR" -CaseSensitive
 
# Multiple files
Select-String -Path "*.log" -Pattern "Connection timeout"
 
# Recursive search (Select-String has no -Recurse parameter; pipe files from Get-ChildItem)
Get-ChildItem -Path "C:\Logs" -Filter "*.log" -Recurse | Select-String -Pattern "ERROR"
 
# Show context (lines before/after)
Select-String -Path "app.log" -Pattern "ERROR" -Context 2,3    # 2 lines before, 3 after
 
# Multiple patterns
Select-String -Path "app.log" -Pattern "ERROR|WARNING|CRITICAL"
 
# Invert match (lines that DON'T match)
Select-String -Path "app.log" -Pattern "DEBUG" -NotMatch
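
A few more Select-String parameters are worth a quick sketch: passing several patterns as an array, -SimpleMatch for plain-text searches, and -Quiet for a simple True/False answer. The example writes its own throwaway log file (the file name and contents are made up) so it can run anywhere.

```powershell
# Build a small throwaway log so the example is self-contained
$logPath = Join-Path ([System.IO.Path]::GetTempPath()) 'regex-demo.log'
@(
    '[INFO] Service started'
    '[ERROR] Disk full (code 507)'
    '[DEBUG] cache hit'
    '[ERROR] Timeout'
) | Set-Content -Path $logPath

# -Pattern accepts an array; a line is reported if any pattern matches
$hits = Select-String -Path $logPath -Pattern 'ERROR', 'INFO'

# -SimpleMatch treats the pattern as plain text, so "(" needs no escaping
$literal = Select-String -Path $logPath -Pattern '(code' -SimpleMatch

# -Quiet returns only True/False instead of match objects
$hasDebug = Select-String -Path $logPath -Pattern 'DEBUG' -Quiet

$hits.Count            # 3
$literal.LineNumber    # 2
$hasDebug              # True

Remove-Item $logPath
```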
 

Processing Select-String Results

# Get matches (avoid naming the variable $matches: it would overwrite the automatic $Matches)
$results = Select-String -Path "*.log" -Pattern "Error code: (\d+)"

# A foreach statement cannot be piped directly, so use ForEach-Object
$results | ForEach-Object {
    [PSCustomObject]@{
        File      = $_.Filename
        Line      = $_.LineNumber
        Text      = $_.Line
        ErrorCode = $_.Matches[0].Groups[1].Value
    }
} | Format-Table -AutoSize
 
# Extract all IP addresses from logs
Select-String -Path "access.log" -Pattern "\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}" -AllMatches |
    ForEach-Object { $_.Matches.Value } |
    Sort-Object -Unique
 

The [regex] Class: Advanced Operations

The [regex] type accelerator exposes .NET's System.Text.RegularExpressions.Regex class, which provides more control than the operators:

# Create regex object
$pattern = [regex]'ERROR: (?<code>\d+)'
 
# Test match
$pattern.IsMatch("ERROR: 404")    # True
 
# Get match details
$match = $pattern.Match("ERROR: 404")
$match.Success                      # True
$match.Value                        # "ERROR: 404"
$match.Groups['code'].Value         # "404"
 
# Find all matches
$text = "ERROR: 404, ERROR: 500, ERROR: 503"
$pattern.Matches($text) | ForEach-Object {
    $_.Groups['code'].Value
}    # Output: 404, 500, 503
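
One difference from the operators is easy to miss: a [regex] object is case-sensitive unless told otherwise. A minimal sketch of the two usual ways to switch that off, using illustrative input:

```powershell
# The -match operator ignores case, but a plain [regex] object does not
$viaOperator = 'Error: 404' -match 'error'                    # True
$viaDefault  = ([regex]'error').IsMatch('Error: 404')         # False

# Option 1: pass RegexOptions when constructing the object
$withOption = [regex]::new('error', [System.Text.RegularExpressions.RegexOptions]::IgnoreCase)

# Option 2: use the inline (?i) flag inside the pattern itself
$withInline = [regex]'(?i)error'

$withOption.IsMatch('Error: 404')                             # True
$withInline.IsMatch('Error: 404')                             # True
```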
 

Replace with Regex

# Simple replacement
$text = "Phone: 123-456-7890"
$text -replace "\d{3}-\d{3}-\d{4}", "[REDACTED]"    # "Phone: [REDACTED]"
 
# Use capture groups in replacement
$text = "Date: 2026-02-11"
$text -replace "(\d{4})-(\d{2})-(\d{2})", '$3/$2/$1'    # "Date: 11/02/2026"
 
# Named captures in replacement
$text = "Hello, Alice!"
$text -replace "Hello, (?<name>\w+)!", 'Greetings, ${name}.'    # "Greetings, Alice."
 
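# Replace with expression (single quotes keep $50 literal instead of expanding a variable)
$text = 'Price: $50'
[regex]::Replace($text, '\$(\d+)', {
    param($match)
    # Round so floating-point noise does not leak into the output
    '$' + [math]::Round([int]$match.Groups[1].Value * 1.10, 2)
})    # "Price: $55" (10% increase)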
 

Split with Regex

# Split by multiple delimiters
$text = "apple,banana;cherry|date"
$text -split '[,;|]'    # Array: apple, banana, cherry, date
 
# Split by whitespace (any amount)
"word1    word2  word3" -split '\s+'    # Array: word1, word2, word3
 
# Split and keep delimiters
$text = "Part1:Part2:Part3"
[regex]::Split($text, '(:)')    # Array: Part1, :, Part2, :, Part3
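
The -split operator also accepts an optional second argument that caps the number of pieces returned, which helps when the value itself may contain the delimiter. A small sketch with a made-up setting string:

```powershell
# Hypothetical setting whose value itself contains the delimiter
$setting = 'ConnectionString=Server=db01;Database=app'

# The second argument to -split limits the result to that many substrings
$name, $value = $setting -split '=', 2

$name     # ConnectionString
$value    # Server=db01;Database=app
```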
 

Performance Considerations

Regex can be expensive. Optimize where possible:

# Potentially slower: the pattern is passed as a string and resolved by the operator on every iteration
1..1000 | ForEach-Object {
    "test$_" -match "test\d+"
}
 
# Fast: Compile once, reuse
$pattern = [regex]'test\d+'
1..1000 | ForEach-Object {
    $pattern.IsMatch("test$_")
}
 
# Often faster in hot loops: Compiled regex (costs more to construct)
$pattern = [regex]::new('test\d+', [System.Text.RegularExpressions.RegexOptions]::Compiled)
 

Consider RegexOptions.Compiled for patterns evaluated many times in a loop. Construction costs more, so measure before committing to it.
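
Since the payoff depends on the pattern, the input, and the PowerShell version, it is worth timing the options on your own data. The following is a rough sketch using Measure-Command; the iteration count and variable names are arbitrary choices, and no particular ranking is implied.

```powershell
# Sketch only: timings vary by machine, PowerShell version, and input size
$inputs = 1..20000 | ForEach-Object { "test$_" }

# Variant 1: the -match operator with a pattern string
$operatorTime = Measure-Command {
    foreach ($s in $inputs) { $null = $s -match 'test\d+' }
}

# Variant 2: a [regex] object created once and reused
$cached = [regex]'test\d+'
$cachedTime = Measure-Command {
    foreach ($s in $inputs) { $null = $cached.IsMatch($s) }
}

# Variant 3: the same object built with RegexOptions.Compiled
$compiled = [regex]::new('test\d+', [System.Text.RegularExpressions.RegexOptions]::Compiled)
$compiledTime = Measure-Command {
    foreach ($s in $inputs) { $null = $compiled.IsMatch($s) }
}

'{0,-10} {1,8:N1} ms' -f 'operator', $operatorTime.TotalMilliseconds
'{0,-10} {1,8:N1} ms' -f 'cached',   $cachedTime.TotalMilliseconds
'{0,-10} {1,8:N1} ms' -f 'compiled', $compiledTime.TotalMilliseconds
```

Run it several times and compare the spread before drawing conclusions; a single run can be skewed by warm-up effects.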


Exercises

🏋️ Exercise 1: Log Parser

Parse Apache-style log files:


192.168.1.100 - - [11/Feb/2026:14:32:05 +0000] "GET /api/users HTTP/1.1" 200 1234

Extract IP, date, method, path, status code, and size into objects.

Solution:
function Parse-ApacheLog {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [string]$LogLine
    )
 
    begin {
        # Pattern breaks down:
        # IP: \d{1,3}(\.\d{1,3}){3}
        # Date: \[([^\]]+)\]
        # Method/Path: "(\w+) ([^"]+) HTTP
        # Status: (\d{3})
        # Size: (\d+)
 
        $pattern = @'
^(?<ip>\d{1,3}(?:\.\d{1,3}){3})\s+-\s+-\s+\[(?<date>[^\]]+)\]\s+"(?<method>\w+)\s+(?<path>[^"]+)\s+HTTP/[\d\.]+"\s+(?<status>\d{3})\s+(?<size>\d+)
'@
    }
 
    process {
        if ($LogLine -match $pattern) {
            [PSCustomObject]@{
                IPAddress  = $Matches.ip
                DateTime   = $Matches.date
                Method     = $Matches.method
                Path       = $Matches.path
                StatusCode = [int]$Matches.status
                Bytes      = [int]$Matches.size
            }
        }
        else {
            Write-Warning "Failed to parse: $LogLine"
        }
    }
}
 
# Test
$logs = @(
    '192.168.1.100 - - [11/Feb/2026:14:32:05 +0000] "GET /api/users HTTP/1.1" 200 1234'
    '192.168.1.101 - - [11/Feb/2026:14:32:10 +0000] "POST /api/login HTTP/1.1" 401 89'
    '192.168.1.102 - - [11/Feb/2026:14:32:15 +0000] "GET /images/logo.png HTTP/1.1" 404 0'
)
 
$parsed = $logs | Parse-ApacheLog
$parsed | Format-Table -AutoSize
 
# Analyze
$parsed | Group-Object StatusCode | Select-Object Name, Count
$parsed | Where-Object { $_.StatusCode -ge 400 } | Format-Table
 
🏋️ Exercise 2: Data Validator

Create validators for:

  • Email addresses
  • Phone numbers (US format)
  • Credit card numbers (basic format, not real validation)
  • URLs

Solution:
function Test-DataFormat {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$Data,
 
        [Parameter(Mandatory)]
        [ValidateSet('Email', 'Phone', 'CreditCard', 'URL')]
        [string]$Type
    )
 
    $patterns = @{
        Email = '^[\w\.-]+@[\w\.-]+\.\w{2,}$'
        Phone = '^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$'
        CreditCard = '^\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}$'
        URL = '^https?://[\w\.-]+(?:/[\w\.-]*)*(?:\?[\w=&]*)?$'
    }
 
    $result = $Data -match $patterns[$Type]
 
    [PSCustomObject]@{
        Data     = $Data
        Type     = $Type
        IsValid  = $result
        Pattern  = $patterns[$Type]
    }
}
 
# Test cases
$testData = @(
    @{ Data = 'user@example.com'; Type = 'Email' }
    @{ Data = 'invalid-email'; Type = 'Email' }
    @{ Data = '(555) 123-4567'; Type = 'Phone' }
    @{ Data = '555-123-4567'; Type = 'Phone' }
    @{ Data = '12345'; Type = 'Phone' }
    @{ Data = '1234-5678-9012-3456'; Type = 'CreditCard' }
    @{ Data = '1234 5678 9012 3456'; Type = 'CreditCard' }
    @{ Data = 'https://www.example.com/path?q=test'; Type = 'URL' }
    @{ Data = 'not-a-url'; Type = 'URL' }
)
 
$results = $testData | ForEach-Object {
    Test-DataFormat -Data $_.Data -Type $_.Type
}
 
$results | Format-Table -AutoSize
 
# Summary
Write-Host "`nValidation Summary:" -ForegroundColor Cyan
$results | Group-Object Type | ForEach-Object {
    $valid = ($_.Group | Where-Object IsValid).Count
    $total = $_.Count
    Write-Host "$($_.Name): $valid/$total valid" -ForegroundColor $(if ($valid -eq $total) { 'Green' } else { 'Yellow' })
}
 
🏋️ Exercise 3: Text Transformation Tool

Create a function that:

  • Redacts sensitive data (SSNs, credit cards, emails)
  • Replaces with [REDACTED]
  • Handles multiple patterns

Solution:
function Protect-SensitiveData {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [string]$Text,
 
        [Parameter()]
        [ValidateSet('SSN', 'CreditCard', 'Email', 'Phone', 'All')]
        [string[]]$RedactTypes = 'All'
    )
 
    begin {
        # Define patterns ([ordered] keeps the redaction order deterministic)
        $patterns = [ordered]@{
            SSN = @{
                Pattern = '\b\d{3}-\d{2}-\d{4}\b'
                Label = '[SSN REDACTED]'
            }
            CreditCard = @{
                Pattern = '\b\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}\b'
                Label = '[CARD REDACTED]'
            }
            Email = @{
                Pattern = '\b[\w\.-]+@[\w\.-]+\.\w{2,}\b'
                Label = '[EMAIL REDACTED]'
            }
            Phone = @{
                Pattern = '\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}'
                Label = '[PHONE REDACTED]'
            }
        }
 
        # Determine which patterns to apply
        if ($RedactTypes -contains 'All') {
            $typesToRedact = $patterns.Keys
        }
        else {
            $typesToRedact = $RedactTypes
        }
    }
 
    process {
        $redacted = $Text
        $redactionsMade = @()
 
        foreach ($type in $typesToRedact) {
            $pattern = $patterns[$type].Pattern
            $label = $patterns[$type].Label
 
            # Named $found rather than $matches to avoid overwriting the automatic $Matches variable
            $found = [regex]::Matches($redacted, $pattern)

            if ($found.Count -gt 0) {
                $redactionsMade += "$type ($($found.Count) occurrence(s))"
                $redacted = $redacted -replace $pattern, $label
            }
        }
 
        [PSCustomObject]@{
            Original = $Text
            Redacted = $redacted
            RedactionsMade = $redactionsMade -join ', '
        }
    }
}
 
# Test
$testText = @"
Customer contact:
Name: John Doe
Email: john.doe@example.com
Phone: (555) 123-4567
SSN: 123-45-6789
Credit Card: 4532-1234-5678-9010
Please reach out via email or phone for assistance.
"@
 
Write-Host "Original Text:" -ForegroundColor Yellow
Write-Host $testText
 
Write-Host "`nRedacted Text:" -ForegroundColor Cyan
$result = Protect-SensitiveData -Text $testText
Write-Host $result.Redacted
 
Write-Host "`nRedactions Made:" -ForegroundColor Green
Write-Host $result.RedactionsMade
 
# Test selective redaction
Write-Host "`n--- Redact Emails Only ---" -ForegroundColor Magenta
$emailOnly = Protect-SensitiveData -Text $testText -RedactTypes Email
Write-Host $emailOnly.Redacted
 

Summary

Regular expressions are essential for text processing in PowerShell:

  • Basic patterns: Literals, character classes (\d, \w, \s), quantifiers (*, +, ?, {n,m})
  • Anchors: ^ (start), $ (end), \b (word boundary)
  • Groups: Capture with (), name with (?<name>...), extract with $Matches
  • Operators: -match, -replace, -split (case-insensitive by default); -cmatch, -creplace, -csplit (case-sensitive)
  • Select-String: Search files with context and multiple patterns; pair it with Get-ChildItem -Recurse for recursive searches
  • [regex] class: Compile patterns, fine control over matching and replacement

Master regex to handle log parsing, data validation, text transformation, and complex string operations efficiently.

Written by the ShellRAG Team

The ShellRAG editorial team writes practical, beginner-friendly PowerShell tutorials with tested code examples and real-world use cases. Every article is technically reviewed for accuracy and updated regularly.
