Introduction to grep
The grep command (its name comes from the ed editor command g/re/p, "globally search for a regular expression and print") is one of the most widely used text search utilities on Unix and Linux systems. It searches files or standard input for lines matching a given pattern and prints the matching lines. Understanding grep is essential for log analysis, data extraction, and general text processing.
Basic Syntax
grep [options] pattern [file...]
1. Basic Usage
Simple Pattern Searching
# Search for pattern in a file
grep "error" logfile.txt
# Search in multiple files
grep "warning" file1.txt file2.txt file3.txt
# Search in all files in current directory
grep "pattern" *
# Case-insensitive search
grep -i "error" logfile.txt
# Search for exact word (not substring)
grep -w "the" file.txt
Output Examples
$ grep "error" app.log
2024-03-10 10:30:15 [ERROR] Database connection failed
2024-03-10 10:31:22 [ERROR] Timeout occurred
$ grep -i "warning" system.log
2024-03-10 09:15:33 [WARNING] High memory usage
2024-03-10 09:45:12 [Warning] Disk space low
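To make the effect of -i and -w concrete, here is a minimal sketch that builds a small throwaway file and counts matching lines under each option. The file name and its contents are invented for illustration only.

```shell
# Hypothetical sample file, created in a temporary directory for this demo.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
The error was logged
Another ERROR here
other things happened
then the job ended
EOF

# Case-sensitive search: only the lowercase "error" line matches.
plain=$(grep -c "error" "$tmpdir/sample.txt")

# Case-insensitive search: "error" and "ERROR" both match.
nocase=$(grep -ci "error" "$tmpdir/sample.txt")

# Substring search: "the" also occurs inside "Another", "other", and "then".
substr=$(grep -c "the" "$tmpdir/sample.txt")

# Whole-word search: only the line with the standalone word "the" matches.
word=$(grep -cw "the" "$tmpdir/sample.txt")

echo "plain=$plain nocase=$nocase substr=$substr word=$word"
rm -r "$tmpdir"
```

Note that -w is still case-sensitive, so the capitalized "The" on the first line is not counted unless -i is added as well.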
2. Common Options
Line Control Options
# -n: Show line numbers
grep -n "pattern" file.txt
# -b: Show byte offset
grep -b "pattern" file.txt
# -o: Show only matching part
grep -o "pattern" file.txt
# -c: Count matching lines
grep -c "pattern" file.txt
# -l: List only filenames with matches
grep -l "pattern" *.txt
# -L: List files without matches
grep -L "pattern" *.txt
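One detail that is easy to miss: -c counts matching lines, not individual occurrences. The sketch below, using an invented three-line file, contrasts -c with counting the output of -o, and shows the prefix that -n adds.

```shell
# Hypothetical sample data for this demo.
tmpfile=$(mktemp)
printf '%s\n' "error error error" "no problems" "one error" > "$tmpfile"

# -c counts lines that contain at least one match.
lines=$(grep -c "error" "$tmpfile")

# -o prints every match on its own line, so wc -l counts occurrences.
# tr strips the padding some wc implementations add.
hits=$(grep -o "error" "$tmpfile" | wc -l | tr -d ' ')

# -n prefixes each matching line with its line number and a colon.
numbered=$(grep -n "one" "$tmpfile")

echo "lines=$lines hits=$hits numbered=$numbered"
rm "$tmpfile"
```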
Context Control
# -A NUM: Show NUM lines after match
grep -A 2 "error" logfile.txt
# -B NUM: Show NUM lines before match
grep -B 2 "error" logfile.txt
# -C NUM: Show NUM lines before and after match
grep -C 2 "error" logfile.txt
# Show context with custom separator
grep -A 1 --group-separator="----" "error" logfile.txt
Examples with Context
$ grep -B 1 -A 2 "ERROR" app.log
2024-03-10 10:30:14 [INFO] Processing request
2024-03-10 10:30:15 [ERROR] Database connection failed
2024-03-10 10:30:16 [INFO] Retry attempt 1
2024-03-10 10:30:17 [INFO] Retry attempt 2
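When matches are far apart, grep prints each match with its context as a separate group and puts a "--" line between groups. A small sketch with an invented eight-line log illustrates this:

```shell
# Hypothetical log, created only for this demo.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
10:30:13 [INFO] Service started
10:30:14 [INFO] Processing request
10:30:15 [ERROR] Database connection failed
10:30:16 [INFO] Retry attempt 1
10:30:17 [INFO] Idle
10:30:18 [INFO] Processing request
10:30:19 [ERROR] Timeout occurred
10:30:20 [INFO] Shutting down
EOF

# One line of context before and after each match.
ctx=$(grep -B 1 -A 1 "ERROR" "$tmpfile")
printf '%s\n' "$ctx"

# Two groups of three lines plus one "--" separator line.
total=$(printf '%s\n' "$ctx" | wc -l | tr -d ' ')
seps=$(printf '%s\n' "$ctx" | grep -c -x -e "--")
rm "$tmpfile"
```

The line at 10:30:17 belongs to neither group, which is why the separator appears between them.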
3. Regular Expressions
Basic Regular Expressions (BRE)
# ^ - Start of line
grep "^The" file.txt # Lines starting with "The"
# $ - End of line
grep "end$" file.txt # Lines ending with "end"
# . - Any single character
grep "b.t" file.txt # Matches "bat", "bet", "bit", etc.
# * - Zero or more of previous character
grep "ab*c" file.txt # Matches "ac", "abc", "abbc", etc.
# [] - Character class
grep "[aeiou]" file.txt # Lines containing vowels
grep "[0-9]" file.txt # Lines containing digits
grep "[A-Za-z]" file.txt # Lines containing letters
# [^] - Negated character class
grep "[^0-9]" file.txt # Lines with non-digits
Extended Regular Expressions (ERE) with -E
# + - One or more of previous character
grep -E "ab+c" file.txt # Matches "abc", "abbc", "abbbc"
# ? - Zero or one of previous character
grep -E "colou?r" file.txt # Matches "color" and "colour"
# | - Alternation
grep -E "error|warning" file.txt # Matches either word
# () - Grouping
grep -E "(error|warning):" file.txt
# {} - Quantifiers
grep -E "[0-9]{3}-[0-9]{4}" file.txt # Phone numbers
# \b - Word boundary
grep -E "\bthe\b" file.txt # Matches "the" as whole word
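The practical difference between BRE and ERE is which characters act as operators without a backslash. The following sketch, run against a few invented words, shows the same pattern text behaving differently with and without -E:

```shell
# Hypothetical word list for this demo.
tmpfile=$(mktemp)
printf '%s\n' "color" "colour" "colr" "a+b" > "$tmpfile"

# ERE: "?" makes the preceding "u" optional, so "color" and "colour" match.
ere_q=$(grep -cE "colou?r" "$tmpfile")

# BRE: "?" is an ordinary character, so nothing in this file matches.
bre_q=$(grep -c "colou?r" "$tmpfile")

# BRE: "+" is literal, so the line "a+b" matches.
bre_plus=$(grep -c "a+b" "$tmpfile")

# ERE: "a+b" means one or more "a" followed by "b"; no line contains "ab".
ere_plus=$(grep -cE "a+b" "$tmpfile")

echo "ere_q=$ere_q bre_q=$bre_q bre_plus=$bre_plus ere_plus=$ere_plus"
rm "$tmpfile"
```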
Perl-Compatible Regular Expressions (PCRE) with -P
# \d - Digit
grep -P "\d{3}-\d{4}" file.txt
# \s - Whitespace
grep -P "\s+error\s+" file.txt
# \w - Word character
grep -P "\w+@\w+\.\w+" file.txt # Simple email pattern
# Lookahead/lookbehind
grep -P "(?<=ERROR:).*" file.txt # Text after "ERROR:"
# Non-greedy matching
grep -P "start.*?end" file.txt
4. Advanced Pattern Matching
Multiple Patterns
# -e: Specify multiple patterns
grep -e "error" -e "warning" -e "fatal" logfile.txt
# -f: Read patterns from file
grep -f patterns.txt logfile.txt
# Using extended regex with multiple patterns
grep -E "error|warning|fatal" logfile.txt
# Combining patterns with AND logic
grep "error" logfile.txt | grep "database"
Pattern File Example
# patterns.txt
error
warning
fatal
critical
# Use with grep
grep -f patterns.txt app.log
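A pattern file holds one pattern per line, and a line matches if any pattern matches. One pitfall worth knowing is that an empty line in the pattern file is an empty pattern, which matches every input line. The sketch below uses invented file contents to show both behaviors:

```shell
# Hypothetical pattern file and log for this demo.
tmpdir=$(mktemp -d)
printf '%s\n' "error" "fatal" > "$tmpdir/patterns.txt"
printf '%s\n' "INFO ok" "error: disk" "fatal: oom" "warning: slow" > "$tmpdir/app.log"

# Two of the four log lines match one of the patterns.
matched=$(grep -c -f "$tmpdir/patterns.txt" "$tmpdir/app.log")

# A stray blank line in the pattern file matches everything.
printf '%s\n' "error" "" > "$tmpdir/with_blank.txt"
all=$(grep -c -f "$tmpdir/with_blank.txt" "$tmpdir/app.log")

echo "matched=$matched all=$all"
rm -r "$tmpdir"
```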
Complex Pattern Examples
# IP addresses
grep -E "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" access.log
# Email addresses
grep -E "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b" emails.txt
# URLs (inside brackets \s is not a whitespace class, so use [:space:])
grep -E "https?://[^[:space:]]+" file.txt
# Dates (YYYY-MM-DD)
grep -E "\b[0-9]{4}-[0-9]{2}-[0-9]{2}\b" logfile.txt
# Times (HH:MM:SS)
grep -E "\b[0-9]{2}:[0-9]{2}:[0-9]{2}\b" logfile.txt
# Hex colors
grep -E "#[0-9A-Fa-f]{6}\b" style.css
5. Recursive Search
Searching Directories
# -r: Recursive search
grep -r "pattern" /path/to/directory
# -R: Recursive with follow symlinks
grep -R "pattern" /etc
# --include: Only search certain files
grep -r --include="*.log" "error" /var/log/
# --exclude: Skip certain files
grep -r --exclude="*.tmp" "pattern" .
# --exclude-dir: Skip directories
grep -r --exclude-dir=".git" "pattern" .
Complex Recursive Examples
# Search only Python files
grep -r --include="*.py" "def " .
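# Search multiple file types (repeat --include; grep globs do not expand braces)
grep -r --include="*.js" --include="*.html" --include="*.css" "TODO" .
# Exclude multiple patterns
grep -r --exclude="*.log" --exclude="*.tmp" --exclude="*.backup" "error" .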
# Search but ignore version control
grep -r --exclude-dir={.git,.svn,.hg} "pattern" .
# Find files containing pattern and show filenames
grep -rl "main" --include="*.rs" src/
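Brace lists such as {js,css} are expanded by the shell (in bash), not by grep. If the braces are quoted, grep receives them literally, and its glob matching has no brace syntax. The sketch below, using an invented directory of three files, compares a quoted brace glob with repeated --include options:

```shell
# Hypothetical project directory for this demo.
tmpdir=$(mktemp -d)
echo "TODO: js"  > "$tmpdir/a.js"
echo "TODO: css" > "$tmpdir/b.css"
echo "TODO: txt" > "$tmpdir/c.txt"

# Quoted braces reach grep as the literal glob "*.{js,css}", which matches
# none of these file names, so no files are searched.
quoted=$(grep -rl --include="*.{js,css}" "TODO" "$tmpdir" | wc -l | tr -d ' ')

# Repeating --include selects both extensions and works in any shell.
repeated=$(grep -rl --include="*.js" --include="*.css" "TODO" "$tmpdir" | wc -l | tr -d ' ')

echo "quoted=$quoted repeated=$repeated"
rm -r "$tmpdir"
```

In bash, leaving the braces unquoted also works, because the shell expands the single argument into several --include arguments before grep runs.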
6. Practical Examples
Log File Analysis
# Count error types
grep -c "ERROR" app.log
grep -c "WARNING" app.log
# Show errors with timestamps
grep -n "ERROR" app.log
# Find errors in last hour
grep "$(date -d '1 hour ago' '+%Y-%m-%d %H')" app.log | grep "ERROR"
# Get unique error messages
grep "ERROR" app.log | cut -d']' -f2- | sort | uniq -c | sort -rn
# Show errors by hour
grep "ERROR" app.log | cut -c1-13 | sort | uniq -c
# Find patterns across multiple logs
grep -h "Exception" /var/log/app/*.log | sort | uniq
Code Search
# Find function definitions
grep -r "^def " --include="*.py" .
# Find TODO comments
grep -r -n "TODO\|FIXME" --include="*.rs" --include="*.py" --include="*.js" .
# Find usage of specific function
grep -r "function_name(" --include="*.c" .
# Find imports
grep -r "^import " --include="*.py" .
# Find string literals
grep -r "\".*\"" --include="*.rs" .
# Find commented code
grep -r "^\s*//.*[a-zA-Z]" --include="*.rs" .
System Administration
# Find processes
ps aux | grep "nginx"
# Check listening ports
netstat -tlnp | grep ":80"
# Find users logged in
who | grep "user"
# Check disk usage
df -h | grep "^/dev"
# Find in system logs
sudo grep "Failed password" /var/log/auth.log
# Check configuration files
grep -r "^server_name" /etc/nginx/sites-enabled/
Data Extraction
# Extract IP addresses from access log
grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" access.log | sort | uniq -c
# Extract email addresses
grep -oE "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}" contacts.txt
# Extract URLs
grep -oE "https?://[^[:space:]]+" webpage.html
# Extract numbers
grep -oE "[0-9]+" file.txt
# Extract quoted strings
grep -oE "\"[^\"]*\"" file.txt
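The extraction recipes above become more useful when -o is combined with sort and uniq. As a minimal sketch, the following finds the most frequent IP address in an invented three-line access log. The IP pattern is deliberately loose and would also accept out-of-range values such as 999.999.999.999.

```shell
# Hypothetical access log for this demo.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
10.0.0.1 - GET /index.html
10.0.0.2 - GET /about.html
10.0.0.1 - POST /login
EOF

# -o emits one address per line; uniq -c counts duplicates after sorting;
# awk swaps the columns to "address count".
top=$(grep -oE "([0-9]{1,3}\.){3}[0-9]{1,3}" "$tmpfile" |
    sort | uniq -c | sort -rn | head -1 | awk '{print $2, $1}')

echo "$top"
rm "$tmpfile"
```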
7. Combining with Other Commands
Pipes and Redirection
# Count matches
grep "pattern" file.txt | wc -l
# Sort results
grep "pattern" file.txt | sort | uniq
# Filter further
grep "error" log.txt | grep -v "timeout"
# Save results
grep "pattern" file.txt > matches.txt
# Append results
grep "pattern" file.txt >> all_matches.txt
# Use with xargs (-Z and -0 keep file names with spaces intact; GNU grep)
grep -lZ "pattern" *.txt | xargs -0 rm --
Complex Pipelines
# Find most common IP addresses
grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" access.log |
sort |
uniq -c |
sort -rn |
head -10
# Find files with most matches
grep -rc "TODO" . |
grep -v ":0$" |
sort -t: -k2 -rn |
head -10
# Analyze error patterns
grep "ERROR" app.log |
awk '{print $5}' |
sort |
uniq -c |
sort -rn |
column -t
8. Performance Optimization
Efficient Searching
# Use -F for fixed strings (fastest)
grep -F "exact.string" largefile.txt
# Use -w for whole words
grep -w "word" largefile.txt
# Limit search context
grep -m 100 "pattern" largefile.txt # Stop after 100 matches
# Use LC_ALL for better performance
LC_ALL=C grep "pattern" largefile.txt
# Use parallel grep with xargs
find . -name "*.log" -print0 | xargs -0 -P 4 grep "pattern"
Handling Large Files
# Process in chunks
split -l 10000 largefile.txt chunk_
grep "pattern" chunk_* > results.txt
# Use parallel processing
parallel -j4 grep "pattern" ::: file1.txt file2.txt file3.txt
# Memory-efficient searching
grep -F -f patterns.txt hugefile.txt
# Stream processing
tail -f logfile.txt | grep "pattern"
9. Colorized Output
Using Colors
# --color: Highlight matches
grep --color=auto "pattern" file.txt
grep --color=always "pattern" file.txt
grep --color=never "pattern" file.txt
# Customize colors (GREP_COLORS)
export GREP_COLORS='ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36'
# ms=01;31  Match (bold red)
# fn=35     Filename (purple)
# ln=32     Line number (green)
Examples with Colors
$ grep --color=always -n "error" logfile.txt
# Shows "error" in red with line numbers in green
10. Script Examples
Log Monitor Script
#!/bin/bash
# monitor_log.sh - Monitor log file for patterns
LOG_FILE="$1"
PATTERNS="${2:-ERROR WARNING FATAL}"
monitor() {
    echo "Monitoring $LOG_FILE for patterns: $PATTERNS"
    echo "Press Ctrl+C to stop"
    # IFS= and -r keep leading whitespace and backslashes in each line intact
    tail -f "$LOG_FILE" | while IFS= read -r line; do
        # $PATTERNS is left unquoted on purpose so it splits into words
        for pattern in $PATTERNS; do
            if printf '%s\n' "$line" | grep -q -- "$pattern"; then
                echo "$(date): Found $pattern in log"
                echo "$line"
                # Alert if critical
                if [[ "$pattern" == "FATAL" ]]; then
                    echo "CRITICAL ERROR DETECTED!"
                    # Send alert (email, slack, etc.)
                fi
                break
            fi
        done
    done
}
monitor
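The script above starts one grep process per pattern for every log line. A lighter alternative, sketched here under the assumption that the patterns contain no regex metacharacters, is to join them into a single extended regex and let one grep filter the whole stream. The helper name filter_alerts is hypothetical.

```shell
# filter_alerts (hypothetical helper): joins its arguments with "|" into one
# ERE and filters standard input. --line-buffered makes grep flush each match
# immediately, which matters when the input is a live stream such as tail -f.
filter_alerts() {
    # The subshell keeps the IFS change local; "$*" joins arguments with IFS.
    ( IFS='|'; grep --line-buffered -E -- "$*" )
}

# Demo on fixed input instead of a live log.
alerts=$(printf '%s\n' "INFO started" "ERROR disk full" "FATAL out of memory" |
    filter_alerts ERROR FATAL)
printf '%s\n' "$alerts"
alert_count=$(printf '%s\n' "$alerts" | wc -l | tr -d ' ')
```

With a live log this could be used as tail -f "$LOG_FILE" | filter_alerts ERROR WARNING FATAL.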
Code Analyzer Script
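#!/bin/bash
# analyze_code.sh - Analyze code for issues
analyze() {
    local dir="${1:-.}"
    echo "=== Code Analysis Report ==="
    echo
    # Find TODOs (--include is repeated because grep globs do not expand braces)
    echo "TODOs:"
    grep -r -n "TODO" --include="*.rs" --include="*.py" --include="*.js" --include="*.go" "$dir" |
        sed 's/^/ /'
    echo
    # Find FIXMEs
    echo "FIXMEs:"
    grep -r -n "FIXME" --include="*.rs" --include="*.py" --include="*.js" --include="*.go" "$dir" |
        sed 's/^/ /'
    echo
    # Find debug prints, skipping lines whose code starts with "#".
    # The output is prefixed with "file:line:", so the filter must skip that prefix.
    echo "Debug prints:"
    grep -r -n "println\|print\|console\.log" \
        --include="*.rs" --include="*.py" --include="*.js" "$dir" |
        grep -v "^[^:]*:[0-9]*:[[:space:]]*#" |
        sed 's/^/ /'
    echo
    # Function count
    echo "Function definitions:"
    grep -r "^func\|^def\|^fn" --include="*.go" --include="*.py" --include="*.rs" "$dir" |
        wc -l |
        sed 's/^/ Total: /'
}
analyze "$1"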
Pattern Search with Context
#!/bin/bash
# smart_grep.sh - Enhanced grep with context
smart_grep() {
    local pattern="$1"
    local file="$2"
    local context="${3:-2}"
    if [[ ! -f "$file" ]]; then
        echo "File not found: $file" >&2
        return 1
    fi
    echo "Searching for: $pattern"
    echo "File: $file"
    echo "Context: $context lines"
    echo "----------------------------------------"
    # "--" stops option parsing so patterns starting with "-" work
    grep -n -C "$context" --color=always -- "$pattern" "$file"
    # Declare and assign separately so local does not mask grep's exit status
    local matches
    matches=$(grep -c -- "$pattern" "$file")
    echo "----------------------------------------"
    echo "Total matching lines: $matches"
}
smart_grep "$1" "$2" "$3"
11. Advanced Features
Binary File Search
# -a: Treat binary files as text
grep -a "pattern" binary.dat
# -I: Ignore binary files
grep -I "pattern" *
# Extract runs of 4 or more printable ASCII characters from a binary file
grep -a -o -P "[\x20-\x7E]{4,}" binary.dat
Device and Special Files
# Search in device output
grep "model name" /proc/cpuinfo
# Monitor kernel messages
dmesg | grep "USB"
# Search process memory (requires root; $pid is a placeholder for a process ID,
# and reads may fail with I/O errors on unmapped regions)
sudo grep -a "pattern" "/proc/$pid/mem"
Network Usage
# Search in network streams
curl -s http://example.com | grep "title"
# Analyze network traffic
tcpdump -A | grep "pattern"
# SSH output search
ssh user@host "grep 'pattern' /var/log/syslog"
12. Exit Codes and Conditional Usage
Exit Codes
# 0: Pattern found
# 1: Pattern not found
# 2: Error occurred
grep -q "pattern" file.txt
if [ $? -eq 0 ]; then
    echo "Pattern found"
else
    echo "Pattern not found"
fi
# Using in conditions
if grep -q "error" logfile.txt; then
    echo "Errors found"
fi
# Check if pattern exists in any file
if grep -l "pattern" *.txt > /dev/null; then
    echo "Pattern exists in some files"
fi
Conditional Processing
# Process only if pattern exists
grep "pattern" file.txt && process_matches.sh
# Stop if pattern not found
grep "pattern" file.txt || exit 1
# Different actions based on match count
count=$(grep -c "pattern" file.txt)
case $count in
    0) echo "No matches" ;;
    [1-9]) echo "Few matches: $count" ;;
    *) echo "Many matches: $count" ;;
esac
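A plain if/else on grep's exit status treats "not found" and "grep failed" the same way. The sketch below separates all three exit codes listed earlier. The helper name check_pattern and the sample file are assumptions made for illustration.

```shell
# check_pattern (hypothetical helper): reports which of grep's three exit
# statuses occurred for a pattern ($1) and a file ($2).
check_pattern() {
    grep -q -- "$1" "$2" 2>/dev/null
    case $? in
        0) echo "found" ;;
        1) echo "not found" ;;
        *) echo "error" ;;   # 2 or higher, e.g. the file is missing or unreadable
    esac
}

# Demo with a temporary file and a path that does not exist.
tmpfile=$(mktemp)
echo "disk error" > "$tmpfile"
r_found=$(check_pattern "error" "$tmpfile")
r_missing=$(check_pattern "warning" "$tmpfile")
r_error=$(check_pattern "error" "$tmpfile.does-not-exist")
echo "$r_found / $r_missing / $r_error"
rm "$tmpfile"
```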
13. grep Variants
Different grep Versions
# grep - standard grep
grep "pattern" file.txt
# egrep - extended grep (same as grep -E; deprecated in recent GNU grep)
egrep "pattern1|pattern2" file.txt
# fgrep - fixed string grep (same as grep -F; deprecated in recent GNU grep)
fgrep "literal.string" file.txt
# rgrep - recursive grep (same as grep -r; not installed on every system)
rgrep "pattern" .
# zgrep - grep compressed files
zgrep "pattern" file.gz
# bzgrep - grep bzip2 files
bzgrep "pattern" file.bz2
# xzgrep - grep xz files
xzgrep "pattern" file.xz
14. Integration with Development Tools
IDE-like Features
# Find and replace across files
grep -rl "old" . | xargs sed -i 's/old/new/g'
# List files with matches and line counts
grep -rc "function" . | grep -v ":0$"
# Show context with file names
grep -Hn -C 2 "TODO" *.rs
# Search and open in editor at the first matching line
vim +"$(grep -n -m 1 "pattern" file.txt | cut -d: -f1)" file.txt
Version Control Integration
# Search in git repository
git grep "pattern"
# Search in specific commit
git grep "pattern" HEAD~10
# Search in all branches
git grep "pattern" $(git rev-list --all)
# Search in staged changes
git grep --cached "pattern"
15. Performance Metrics
Benchmarking
# Time grep operations
time grep "pattern" largefile.txt
# Compare different grep variants
time grep -F "pattern" largefile.txt
time grep -w "pattern" largefile.txt
time LC_ALL=C grep "pattern" largefile.txt
# Compare default output buffering with line buffering
# (grep has no --buffer-size option; --line-buffered is the related flag)
time grep "pattern" largefile.txt > matches.txt
time grep --line-buffered "pattern" largefile.txt > matches.txt
Profiling
# Use strace to see system calls
strace -c grep "pattern" largefile.txt
# Monitor memory usage
/usr/bin/time -v grep "pattern" largefile.txt
# Profile with perf
perf stat grep "pattern" largefile.txt
16. Common Patterns and Recipes
Network and Security
# Find suspicious SSH attempts
grep "Failed password" /var/log/auth.log | grep -oE "([0-9]{1,3}\.){3}[0-9]{1,3}"
# Check for port scans
grep "SYN" /var/log/firewall.log
# Find exposed secrets
grep -r "password\|secret\|key" --include="*.env" --include="*.json" --include="*.yml" .
# Check SSL certificate expiration
echo | openssl s_client -connect example.com:443 2>/dev/null |
openssl x509 -noout -dates | grep "notAfter"
Data Analysis
# Find most common words
grep -oE "[a-zA-Z]+" file.txt | sort | uniq -c | sort -rn | head -20
# Analyze CSV data
grep "2024-03" data.csv | cut -d',' -f3 | sort | uniq -c
# Find outliers
grep -E "^[0-9]+$" numbers.txt | awk '$1 > 1000'
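# Pattern frequency over time (errors per hour, assuming each line
# starts with a "YYYY-MM-DD HH:MM:SS" timestamp)
grep "ERROR" app.log |
cut -c1-13 | sort | uniq -c |
awk '{print $2, $3, $1}' > frequency.dat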
System Health
# Monitor CPU temperature
sensors | grep "Core"
# Check disk health
sudo smartctl -a /dev/sda | grep "Reallocated_Sector_Ct"
# Monitor memory usage
free -h | grep "Mem"
# Check for zombie processes (process state starts with Z)
ps -eo stat,pid,comm | grep "^Z"
# Find large files (sorted by the size column of ls -lh)
find / -type f -size +100M -print0 2>/dev/null | xargs -0 ls -lh | grep -E "[0-9]+G" | sort -k5 -rh
17. Troubleshooting
Common Issues
# Problem: Pattern with special characters
# Solution: Use -F for fixed strings or escape
grep -F "a+b*c" file.txt
grep "a+b\*c" file.txt # In basic regex "+" is already literal; only "*" needs escaping
# Problem: Binary file output garbled
# Solution: Use -a or -I
grep -a "text" binary.dat
# Problem: Too many matches
# Solution: Limit output
grep -m 100 "pattern" largefile.txt | less
# Problem: Case sensitivity issues
# Solution: Use -i for case-insensitive
grep -i "Pattern" file.txt
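The special-character problem is easiest to see by counting matches with and without -F. In this sketch, built on an invented four-line file, an unescaped pattern fails to match its own literal text, and an unescaped dot matches more than intended:

```shell
# Hypothetical sample data for this demo.
tmpfile=$(mktemp)
printf '%s\n' "a+b*c" "aabc" "price: 3.50" "price: 3x50" > "$tmpfile"

# As a regex, "b*" means zero or more "b", so the literal text "a+b*c"
# is not matched at all.
unescaped=$(grep -c "a+b*c" "$tmpfile")

# -F treats every character literally.
fixed=$(grep -cF "a+b*c" "$tmpfile")

# "." matches any character, so "3x50" matches as well as "3.50".
dot_regex=$(grep -c "3.50" "$tmpfile")
dot_fixed=$(grep -cF "3.50" "$tmpfile")

echo "unescaped=$unescaped fixed=$fixed dot_regex=$dot_regex dot_fixed=$dot_fixed"
rm "$tmpfile"
```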
Debugging grep
# Note: -v inverts the match; it is not a verbose flag
grep -v "pattern" file.txt
# Show the first 20 non-matching lines
grep -v "pattern" file.txt | head -20
# Test pattern before using
echo "test string" | grep "pattern"
# Check what a regex matches by highlighting it
echo "test string" | grep --color=always "t.st"
18. Best Practices
Performance Tips
# Use fixed strings when possible
grep -F "literal.string" file.txt
# Limit search scope (searching a directory requires -r)
grep -r "pattern" specific/directory/
# Use LC_ALL for speed
LC_ALL=C grep "pattern" file.txt
# Avoid unnecessary pipes
# Bad: cat file.txt | grep "pattern"
# Good: grep "pattern" file.txt
# Use -q for boolean checks
if grep -q "pattern" file.txt; then
    : # do something
fi
Readability Tips
# Use long options in scripts
grep --recursive --line-number --ignore-case "pattern" .
# Group options logically
grep -rni "pattern" .
# Comment complex patterns
# Email pattern:
grep -E "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}" contacts.txt
# Use variables for patterns
pattern="error|warning|fatal"
grep -E "$pattern" logfile.txt
19. Quick Reference Card
Most Common Options
| Option | Description |
|---|---|
| -i | Ignore case |
| -v | Invert match |
| -w | Whole word match |
| -n | Show line numbers |
| -c | Count matching lines |
| -l | List filenames only |
| -r | Recursive search |
| -E | Extended regex |
| -F | Fixed strings |
| -o | Only matching part |
| -A NUM | After context |
| -B NUM | Before context |
| -C NUM | Context lines |
| --color | Highlight matches |
Common Combinations
| Command | Purpose |
|---|---|
| grep -i "pattern" file | Case-insensitive search |
| grep -r "pattern" . | Recursive search |
| grep -n "pattern" file | Show line numbers |
| grep -v "pattern" file | Invert match |
| grep -c "pattern" file | Count matching lines |
| grep -l "pattern" *.txt | Show filenames |
| grep -E "pattern1\|pattern2" | Multiple patterns |
| grep -A 2 -B 2 "pattern" | Show context |
| grep -oE "[0-9]+" file | Extract numbers |
Conclusion
The grep command is an indispensable tool for text processing and pattern matching:
Key Points Summary
- Basic Operations:
  - Search for patterns in files
  - Case-insensitive search with -i
  - Whole word matching with -w
  - Line numbers with -n
- Regular Expressions:
  - Basic regex (., *, [], ^, $)
  - Extended regex with -E (+, ?, |, {})
  - Perl-compatible with -P (lookahead, \d, \s)
- Advanced Features:
  - Context lines (-A, -B, -C)
  - Recursive search (-r, -R)
  - File filtering (--include, --exclude)
  - Multiple patterns (-e, -f)
- Performance:
  - Use -F for literal strings
  - Limit search scope
  - Use LC_ALL=C for speed
  - Stop after N matches with -m
Best Practices
- Choose the right options for your use case
- Test complex patterns before using on large files
- Use quotes around patterns with special characters
- Combine with other commands for powerful pipelines
- Consider performance for large-scale searches
- Document complex regex patterns for maintainability
Quick Reference
| Want to… | Command |
|---|---|
| Find "error" in file | grep "error" file.txt |
| Find case-insensitive | grep -i "error" file.txt |
| Count occurrences | grep -c "error" file.txt |
| Show line numbers | grep -n "error" file.txt |
| Search recursively | grep -r "error" . |
| Show context | grep -C 2 "error" file.txt |
| Multiple patterns | grep -E "error\|warning" file.txt |
| Invert match | grep -v "error" file.txt |
| Whole words only | grep -w "error" file.txt |
| List matching files | grep -l "error" *.txt |
The grep command's power and flexibility make it an essential tool for anyone working with text files, logs, or code. Mastering grep will significantly enhance your command-line productivity and data processing capabilities.