Complete Guide to Bash cut Command

Introduction to cut Command

The cut command is a powerful text processing utility in Unix/Linux that extracts sections from each line of input. It's commonly used for parsing structured data like CSV files, log files, and command output. The command works by selecting columns based on delimiters, character positions, or byte positions.

Key Concepts

  • Field-based cutting: Extract based on delimiters (like commas, tabs)
  • Character-based cutting: Extract specific character positions
  • Byte-based cutting: Extract specific byte positions
  • Complement: Select everything EXCEPT the specified sections
  • Output delimiter: Specify how to join extracted sections
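Each concept above maps to a single option. A quick sanity check on one line of made-up sample data (note that --complement and --output-delimiter are GNU coreutils extensions and may be absent from BSD/macOS cut):

```shell
line="name,age,city"

# Field-based cutting: pick the second comma-delimited field
echo "$line" | cut -d ',' -f 2
# age

# Character-based cutting: the first four characters
echo "$line" | cut -c 1-4
# name

# Byte-based cutting: identical to -c here (the input is pure ASCII)
echo "$line" | cut -b 1-4
# name

# Complement (GNU): everything EXCEPT field 2
echo "$line" | cut -d ',' --complement -f 2
# name,city

# Output delimiter (GNU): join the selected fields with ' | '
echo "$line" | cut -d ',' -f 1,3 --output-delimiter=' | '
# name | city
```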

1. Basic cut Syntax

Command Structure

# Basic syntax
cut OPTION... [FILE]...
# Common options
cut -f FIELD_LIST       # Select fields (delimited text)
cut -c CHARACTER_LIST   # Select characters
cut -b BYTE_LIST        # Select bytes
cut -d DELIMITER        # Specify delimiter (default: TAB)
cut -s                  # Only print lines containing delimiter
cut --complement        # Select complement of specified fields
cut --output-delimiter=STRING  # Specify output delimiter

Simple Examples

# Cut characters (positions 1-5 from each line)
echo "Hello World" | cut -c 1-5
# Output: Hello
# Cut fields (first field using TAB delimiter)
echo -e "John\t30\tNYC" | cut -f 1
# Output: John
# Cut with custom delimiter
echo "John,30,NYC" | cut -d ',' -f 2
# Output: 30

2. Field-Based Cutting (-f)

Basic Field Selection

# Sample data file (data.txt)
cat data.txt
# John,30,NYC,Engineer
# Alice,25,LA,Designer
# Bob,35,Chicago,Manager
# Select single field
cut -d ',' -f 1 data.txt
# John
# Alice
# Bob
# Select multiple fields
cut -d ',' -f 1,3 data.txt
# John,NYC
# Alice,LA
# Bob,Chicago
# Select range of fields
cut -d ',' -f 2-4 data.txt
# 30,NYC,Engineer
# 25,LA,Designer
# 35,Chicago,Manager
# Select from field to end
cut -d ',' -f 2- data.txt
# 30,NYC,Engineer
# 25,LA,Designer
# 35,Chicago,Manager
# Select up to a field
cut -d ',' -f -3 data.txt
# John,30,NYC
# Alice,25,LA
# Bob,35,Chicago

Complex Field Selection

# Combine ranges and individual fields
cut -d ',' -f 1,3-4 data.txt
# John,NYC,Engineer
# Alice,LA,Designer
# Bob,Chicago,Manager
# Select non-consecutive fields
cut -d ',' -f 1,4 data.txt
# John,Engineer
# Alice,Designer
# Bob,Manager
# Using with piped command output (squeeze the alignment spaces first,
# otherwise the runs of spaces in ps output produce empty fields)
ps aux | tr -s ' ' | cut -d ' ' -f 1,2 | head -5

Handling TAB-Delimited Files

# TAB is the default delimiter
cat tab_data.txt
# John    30      NYC
# Alice   25      LA
# Bob     35      Chicago
# No need for -d with TAB
cut -f 1,2 tab_data.txt
# John    30
# Alice   25
# Bob     35
# Change output delimiter for TAB
cut -f 1,2 --output-delimiter=',' tab_data.txt
# John,30
# Alice,25
# Bob,35

3. Character-Based Cutting (-c)

Character Position Selection

# Sample file (text.txt)
cat text.txt
# Hello World
# Bash Scripting
# Cut Command
# Select specific characters
cut -c 1 text.txt
# H
# B
# C
# Select range of characters
cut -c 1-5 text.txt
# Hello
# Bash 
# Cut C
# Select multiple ranges
cut -c 1-5,7-10 text.txt
# HelloWorl
# Bash crip
# Cut Cmman
# Select from position to end
cut -c 3- text.txt
# llo World
# sh Scripting
# t Command
# Select up to position
cut -c -5 text.txt
# Hello
# Bash 
# Cut C

Fixed-Width Data Processing

# Fixed-width file (fixed.txt)
cat fixed.txt
# John      30    NYC
# Alice     25    LA
# Bob       35    Chicago
# Extract based on character positions
cut -c 1-10 fixed.txt   # Name field
# John      
# Alice     
# Bob       
cut -c 11-15 fixed.txt  # Age field
# 30  
# 25  
# 35  
# Combine with trim
cut -c 1-10 fixed.txt | sed 's/ *$//'
# John
# Alice
# Bob

4. Byte-Based Cutting (-b)

Byte Position Selection

# Note: -b selects bytes; for pure ASCII, bytes and characters coincide
echo "Hello" | cut -b 1-3
# Hel
# With multibyte UTF-8 text, a byte range can split a character:
# "é" is 2 bytes, so bytes 1-4 of "café" end in the middle of "é"
echo "café" | cut -b 1-4
# caf? (the trailing byte is half of "é")
# POSIX says -c should count characters, but GNU coreutils cut
# currently treats -c exactly like -b, so it can also split multibyte
# characters; use awk in a UTF-8 locale for reliable character cuts
echo "café" | awk '{print substr($0, 1, 4)}'
# café
# "🚀" is 4 bytes in UTF-8, so a byte range must cover all four bytes
echo "🚀 Rocket" | cut -b 1-4
# 🚀

5. Working with Delimiters (-d)

Custom Delimiters

# CSV files
echo "John,Doe,30,Engineer" | cut -d ',' -f 2
# Doe
# Colon-delimited (like /etc/passwd)
cut -d ':' -f 1,3 /etc/passwd | head -3
# root:0
# daemon:1
# bin:2
# Space-delimited (but careful with multiple spaces)
echo "John   Doe   30" | cut -d ' ' -f 2
# (empty because of multiple spaces)
# Better for spaces: use tr to squeeze spaces
echo "John   Doe   30" | tr -s ' ' | cut -d ' ' -f 2
# Doe
# Pipe-delimited
echo "John|Doe|30|Engineer" | cut -d '|' -f 1-3
# John|Doe|30

Multiple Delimiters

# cut doesn't support multiple delimiters directly
# Use tr to convert delimiters first
# File with mixed delimiters
cat mixed.txt
# John,30;NYC|Engineer
# Alice;25,LA|Designer
# Convert all to same delimiter
tr ',;|' '\t' < mixed.txt | cut -f 1,3
# John    NYC
# Alice   LA
# Using multiple delimiters with tr
echo "John:30,NYC;Engineer" | tr ':,' '\t' | cut -f 1,3
# John    NYC

6. Complement Selection (--complement)

Inverse Selection

# Select everything EXCEPT specified fields
echo "John,30,NYC,Engineer" | cut -d ',' --complement -f 2
# John,NYC,Engineer
# Remove first field
echo "John,30,NYC" | cut -d ',' --complement -f 1
# 30,NYC
# Remove multiple fields
echo "John,30,NYC,Engineer" | cut -d ',' --complement -f 1,4
# 30,NYC
# Remove range
echo "John,30,NYC,Engineer" | cut -d ',' --complement -f 2-3
# John,Engineer
# Character-based complement
echo "Hello World" | cut --complement -c 1-5
#  World

7. Output Delimiter

Changing Output Format

# Change output delimiter
echo "John,30,NYC" | cut -d ',' -f 1,3 --output-delimiter=':'
# John:NYC
# Multiple fields with custom delimiter
cut -d ':' -f 1,3 /etc/passwd --output-delimiter=' -> ' | head -3
# root -> 0
# daemon -> 1
# bin -> 2
# Create CSV from TAB-delimited
cut -f 1,2 tab_data.txt --output-delimiter=','
# John,30
# Alice,25
# Bob,35
# Using with different delimiters
echo "John 30 NYC" | tr ' ' '\t' | cut -f 1,3 --output-delimiter='|'
# John|NYC

8. Practical Examples

System Administration

# Extract usernames from /etc/passwd
cut -d ':' -f 1 /etc/passwd | sort
# Get UIDs of regular users (UID >= 1000)
cut -d ':' -f 1,3 /etc/passwd | grep ':[0-9]\{4,\}' | cut -d ':' -f 1
# Extract IPv4 addresses from ifconfig (squeeze spaces first; the exact
# field position varies between platforms and ifconfig versions)
ifconfig | grep 'inet ' | tr -s ' ' | cut -d ' ' -f 3 | grep -v '127.0.0.1'
# Get list of logged-in users
who | cut -d ' ' -f 1 | sort -u
# Extract process names (squeeze spaces so field 11 lines up)
ps aux | tail -n +2 | tr -s ' ' | cut -d ' ' -f 11- | head -5
# Get disk usage by mount point
df -h | tail -n +2 | cut -c 1-50 | head -5

Log File Analysis

# Extract IP addresses from Apache log
cat access.log | cut -d ' ' -f 1 | sort | uniq -c | sort -nr
# Extract timestamps from log
cat app.log | cut -d ' ' -f 1-2 | head -5
# Get HTTP status codes from access log
cat access.log | cut -d '"' -f 3 | cut -d ' ' -f 2 | sort | uniq -c
# Extract specific fields from syslog
grep "sshd" /var/log/auth.log | cut -d ' ' -f 1-3,5- | head -5
# Parse CSV log
cat logs.csv | cut -d ',' -f 1,3,5 --output-delimiter=' | '

Data Processing

# Extract columns from CSV
cat data.csv | cut -d ',' -f 2,4 | sort | uniq -c
# Get first and last fields
echo "a:b:c:d:e" | cut -d ':' -f 1,5
# a:e
# Extract phone numbers from contact list
grep "Phone:" contacts.txt | cut -d ':' -f 2 | sed 's/^ //'
# Parse key-value pairs
echo "key1=value1&key2=value2&key3=value3" | tr '&' '\n' | cut -d '=' -f 2
# Extract domain from email list
cat emails.txt | cut -d '@' -f 2 | sort -u

File Information

# Get file extensions (everything after the first dot)
ls -1 | grep '\.' | cut -d '.' -f 2- | sort -u
# Extract file sizes from ls -l (squeeze the alignment spaces first)
ls -l | tail -n +2 | tr -s ' ' | cut -d ' ' -f 5 | numfmt --to=iec
# Get file permissions
ls -l | tail -n +2 | cut -d ' ' -f 1
# Extract modification dates (fields 6-8 after squeezing spaces)
ls -l | tail -n +2 | tr -s ' ' | cut -d ' ' -f 6-8

9. Combining cut with Other Commands

With grep

# Search then cut
grep "ERROR" app.log | cut -d ' ' -f 1-4
# Cut then search
cut -d ',' -f 2 data.csv | grep "pattern"
# Multiple filters
grep -v "^#" config.conf | cut -d '=' -f 1 | grep -v "^$"

With sort and uniq

# Count occurrences
cut -d ',' -f 3 data.csv | sort | uniq -c | sort -nr
# Unique values
cut -d ':' -f 1 /etc/passwd | sort -u
# Top N values
cut -d ' ' -f 1 access.log | sort | uniq -c | sort -nr | head -10

With awk and sed

# Pre-process with sed before cut
sed 's/  */ /g' file.txt | cut -d ' ' -f 2
# Post-process with awk
cut -d ',' -f 2,4 data.csv | awk -F ',' '{print $2 ":" $1}'
# Complex pipeline
cat data.txt | 
grep -v "^#" | 
cut -d '|' -f 2,5 | 
sed 's/|/,/g' | 
sort -u

With xargs

# Extract and use as arguments
cut -d ':' -f 1 /etc/passwd | head -5 | xargs echo "Users:"
# Delete files listed in a file
cut -d ',' -f 1 files.csv | xargs rm -i
# Process each line
cut -d ',' -f 2 data.csv | xargs -I {} echo "Processing: {}"

10. Advanced Techniques

Multi-Character Delimiters

# cut doesn't support multi-char delimiters directly
# Use sed or awk instead
# File with "||" delimiter
echo "John||30||NYC" | sed 's/||/\t/g' | cut -f 2
# 30
# Using awk for multi-char delimiters
echo "John||30||NYC" | awk -F '\\|\\|' '{print $2}'
# 30
# Complex delimiters with perl
echo "John::30::NYC" | perl -F'::' -lane 'print $F[1]'
# 30

Handling Quoted Fields

# CSV with quoted fields (cut has no notion of quoting)
echo '"John, Doe",30,"New York"' | cut -d ',' -f 1
# "John   (wrong: the comma inside the quotes is treated as a separator)
# GNU awk can match whole fields, quoted or not, with FPAT
echo '"John, Doe",30,"New York"' |
gawk -v FPAT='([^,]*)|("[^"]*")' '{print $1}'
# "John, Doe"
# Better still: use csvkit or another real CSV parser
# csvcut -c 2 file.csv

Variable Width Fields

# Fixed width with cut
cut -c 1-20,30-40 file.txt
# Variable width with awk
awk '{print substr($0,1,20) substr($0,30,10)}' file.txt
# Truncate lines to the terminal width via command substitution
cut -c 1-$(tput cols) /var/log/syslog | head -5

Dynamic Field Selection

# Use variables for field numbers
field=3
cut -d ',' -f $field data.csv
# Calculate field positions
start=5
end=10
cut -c ${start}-${end} file.txt
# Programmatic field selection: paste per-field extracts row by row
# (looping cut and concatenating would stack whole columns instead)
paste -d ',' <(cut -d ',' -f 1 data.csv) \
             <(cut -d ',' -f 3 data.csv) \
             <(cut -d ',' -f 5 data.csv)
# Select based on content
grep -n "pattern" file.txt | cut -d ':' -f 1 | xargs -I {} sed -n '{}p' file.txt

11. Error Handling and Edge Cases

Missing Fields

# Lines with missing fields
cat inconsistent.txt
# John,30,NYC
# Alice,25
# Bob
# -s option suppresses lines without delimiter
cut -d ',' -f 2 -s inconsistent.txt
# 30
# 25
# Without -s, prints entire line if delimiter missing
cut -d ',' -f 2 inconsistent.txt
# 30
# 25
# Bob  (entire line printed)

Empty Fields

# CSV with empty fields
cat empties.csv
# John,,30,NYC
# Alice,25,,LA
# Empty fields are preserved
cut -d ',' -f 2 empties.csv
# (empty line)
# 25
# Count empty fields
cut -d ',' -f 2 empties.csv | grep -c "^$"
# 1

Leading/Trailing Delimiters

# Line starting with delimiter
echo ",30,NYC" | cut -d ',' -f 2
# 30
# Line ending with delimiter
echo "John,30," | cut -d ',' -f 3
# (empty)
# Multiple consecutive delimiters
echo "John,,30,NYC" | cut -d ',' -f 2
# (empty)

12. Performance Considerations

Large File Processing

# For large files, cut is very efficient
time cut -d ',' -f 2 hugefile.csv > output.txt
# Compare with awk
time awk -F ',' '{print $2}' hugefile.csv > output.txt
# cut is generally faster than awk for simple field extraction
# Use LC_ALL=C for ASCII files (faster)
LC_ALL=C cut -d ',' -f 2 hugefile.csv
# Process in chunks for very large files; write one output per chunk and
# combine in order (parallel appends to a single file can interleave lines)
split -l 1000000 hugefile.csv chunk_
for f in chunk_*; do
cut -d ',' -f 2 "$f" > "$f.out" &
done
wait
cat chunk_*.out > output.txt
rm chunk_*

Memory Usage

# cut streams data, minimal memory usage
# Monitor memory
/usr/bin/time -v cut -d ',' -f 2 hugefile.csv > /dev/null
# Piping large data
tar -cf - bigdir/ | cut -c 1-100 | head -5
# cut processes stream without loading entire file

13. Script Examples

CSV Processor

#!/bin/bash
# Process CSV file with headers
process_csv() {
local file="$1"
local field="$2"
# Get header
header=$(head -1 "$file" | cut -d ',' -f "$field")
# Get data
echo "Processing field: $header"
tail -n +2 "$file" | cut -d ',' -f "$field" | sort | uniq -c
}
# Extract specific columns with validation
extract_columns() {
local file="$1"
local columns="$2"
local delimiter="${3:-,}"
# Validate file exists
if [ ! -f "$file" ]; then
echo "Error: File not found" >&2
return 1
fi
# Get column count from header
local num_cols=$(head -1 "$file" | tr -cd "$delimiter" | wc -c)
num_cols=$((num_cols + 1))
# Validate columns
for col in $(echo "$columns" | tr ',' ' '); do
if [ "$col" -gt "$num_cols" ]; then
echo "Error: Column $col exceeds file columns ($num_cols)" >&2
return 1
fi
done
# Extract columns
cut -d "$delimiter" -f "$columns" "$file"
}
# Usage
# extract_columns data.csv 1,3,5

Log Analyzer

#!/bin/bash
# Apache log analyzer
analyze_apache_log() {
local logfile="$1"
echo "=== Apache Log Analysis ==="
# Top IPs
echo -e "\nTop IP addresses:"
cut -d ' ' -f 1 "$logfile" | sort | uniq -c | sort -nr | head -5
# Top pages
echo -e "\nTop requested pages:"
cut -d '"' -f 2 "$logfile" | cut -d ' ' -f 2 | sort | uniq -c | sort -nr | head -5
# HTTP status codes
echo -e "\nStatus codes:"
cut -d '"' -f 3 "$logfile" | cut -d ' ' -f 2 | sort | uniq -c | sort -nr
# Traffic by hour
echo -e "\nTraffic by hour:"
cut -d '[' -f 2 "$logfile" | cut -d ':' -f 2 | sort | uniq -c
}
# Usage
# analyze_apache_log /var/log/apache2/access.log

Data Extractor

#!/bin/bash
# Flexible data extraction tool
extract_data() {
local file="$1"
local format="$2"  # csv, tsv, fixed, custom
local spec="$3"    # field numbers, ranges, etc.
case "$format" in
csv)
cut -d ',' -f "$spec" "$file"
;;
tsv)
cut -f "$spec" "$file"
;;
fixed)
# Convert spec like "1-10,20-30" to cut format
cut -c "$spec" "$file"
;;
custom)
local delim="$4"
cut -d "$delim" -f "$spec" "$file"
;;
*)
echo "Unknown format: $format" >&2
return 1
;;
esac
}
# Process multiple files
extract_from_files() {
local pattern="$1"
local fields="$2"
for file in $pattern; do
if [ -f "$file" ]; then
echo "=== $file ==="
cut -d ',' -f "$fields" "$file" | head -3
fi
done
}
# Usage
# extract_data data.csv csv 1,3,5
# extract_from_files "*.log" "1,2"

14. Common Use Cases Reference

Quick Reference Table

Task                            Command
Get first field from CSV        cut -d ',' -f 1 file.csv
Get username from /etc/passwd   cut -d ':' -f 1 /etc/passwd
Extract IP from log             cut -d ' ' -f 1 access.log
Get first 10 characters         cut -c 1-10 file.txt
Remove first field              cut --complement -d ',' -f 1 file.csv
Get last field                  rev file.txt | cut -d ',' -f 1 | rev
Change delimiter                cut -d ',' -f 1,3 --output-delimiter=':' file.csv
Skip lines without delimiter    cut -d ',' -s -f 2 file.txt
Extract column range            cut -f 2-5 tab_data.txt
Multiple ranges                 cut -c 1-10,20-30 file.txt

Real-World Examples

# System information
# Get list of users with shells
cut -d ':' -f 1,7 /etc/passwd | grep -v "nologin\|false" | cut -d ':' -f 1
# Get running services
systemctl list-units --type=service --all | cut -c 1-50 | grep -v "^$"
# Monitor disk space by partition
df -h | tail -n +2 | cut -c 1-30,40-50
# Network connections (netstat prints a two-line header)
netstat -tulpn | tail -n +3 | cut -c 20-70 | head -10
# Process memory usage
ps aux | tail -n +2 | cut -c 1-20,40-60 | head -5
# Extract email domains
grep -E -o "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" emails.txt | cut -d '@' -f 2 | sort -u

15. Limitations and Alternatives

cut Limitations

# 1. No support for multi-character delimiters
echo "John||30||NYC" | cut -d '||' -f 2  # Doesn't work
# Solution: Use awk
echo "John||30||NYC" | awk -F '\\|\\|' '{print $2}'
# 2. No regex support
echo "John123Doe" | cut -c 1-4  # Can't extract based on pattern
# Solution: Use grep -o
echo "John123Doe" | grep -o '[A-Za-z]\+'
# 3. Can't reorder fields
echo "a,b,c" | cut -d ',' -f 3,1  # Still prints a,c
# Solution: Use awk
echo "a,b,c" | awk -F ',' '{print $3 "," $1}'
# 4. No conditional extraction
echo "John,30,NYC" | cut -d ',' -f 2  # Can't filter based on value
# Solution: Use awk
echo "John,30,NYC" | awk -F ',' '$2 > 25 {print $0}'

Alternative Tools

# awk - Most flexible
awk -F ',' '{print $1, $3}' data.csv
# sed - Good for simple extractions
sed 's/^\([^,]*\),.*/\1/' data.csv
# perl - Full regex power
perl -F',' -lane 'print $F[0]' data.csv
# grep - Pattern-based extraction
grep -o '^[^,]*' data.csv
# tr + cut combination
tr ' ' '\t' < file.txt | cut -f 2
# column - Format text
column -t -s ',' data.csv
# csvkit - For serious CSV work
csvcut -c 1,3 data.csv
csvgrep -c 2 -m "pattern" data.csv

When to Use What

# Use cut when:
# - Simple field extraction from delimited files
# - Fixed-width character extraction
# - Performance is critical
# - Processing very large files
# Use awk when:
# - Need field reordering
# - Complex conditions
# - Calculations on fields
# - Multi-character delimiters
# Use sed when:
# - Pattern-based substitution
# - Line-by-line transformations
# - Simple text manipulation
# Use perl when:
# - Complex regex operations
# - Need maximum flexibility
# - Processing non-standard formats

Conclusion

The cut command is a fundamental tool for text processing in Unix/Linux:

Key Takeaways

  1. Three modes: Field (-f), character (-c), byte (-b) cutting
  2. Default delimiter: TAB for fields
  3. Custom delimiters: Use -d for any single character
  4. Complement: --complement to exclude sections
  5. Output formatting: --output-delimiter to change separator
  6. Skip lines: -s to ignore lines without delimiter
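These takeaways compose in a single pipeline. A small sketch on invented TAB-delimited sample data (GNU cut assumed for --complement and --output-delimiter):

```shell
# Three TAB-delimited records; the last line is malformed (no TAB at all)
data=$(printf 'alice\t30\tNYC\nbob\t25\tLA\nbadline')

# Default TAB delimiter; -s drops "badline", and --output-delimiter
# rewrites the separator on the way out
printf '%s\n' "$data" | cut -s -f 1,3 --output-delimiter=','
# alice,NYC
# bob,LA

# --complement reaches the same result by excluding field 2 instead
printf '%s\n' "$data" | cut -s --complement -f 2 --output-delimiter=','
# alice,NYC
# bob,LA
```

Both pipelines answer "name and city for every well-formed record", one by selecting fields and one by excluding them.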

Best Practices

Scenario                Recommendation
CSV files               Use -d ',' (beware quoted fields containing commas)
TSV files               Use the default TAB delimiter
Fixed-width             Use -c with character ranges
Large files             cut is usually the fastest option
Complex parsing         Consider awk instead
Multi-char delimiters   Use awk or perl

Quick Reference Card

# Field extraction
cut -d',' -f1 file.csv        # First field
cut -d',' -f1,3 file.csv      # First and third
cut -d',' -f2-5 file.csv      # Fields 2 through 5
cut -d',' -f-3 file.csv       # First three fields
cut -d',' -f3- file.csv       # From field 3 to end
# Character extraction
cut -c1-10 file.txt           # First 10 chars
cut -c5,10,15 file.txt        # Specific positions
cut -c1-10,20-30 file.txt     # Multiple ranges
# Output control
cut -d',' -f1,3 --output-delimiter=':' file.csv
cut -d',' --complement -f2 file.csv
# Skip lines
cut -d',' -s -f2 file.csv     # Skip lines without comma

The cut command is essential for quick data extraction and processing. Master it for efficient command-line text manipulation!
