Shell Script Hack #5: Master sed One-Liners for Instant Text Processing

You’ve written a script that needs to extract specific data from text files. You reach for Python or Perl, but wait – sed can do it in one line, instantly, without external dependencies. Let me show you the power of sed one-liners.

The Problem

You need to process a 2GB log file to extract error messages. Loading it into a text editor crashes your system. Writing a Python script feels like overkill. You need something fast and simple.

# Traditional approach - slow and memory intensive
python -c "
with open('huge.log') as f:
    for line in f:
        if 'ERROR' in line:
            print(line)
"

The Hack: sed One-Liners

sed (Stream EDitor) processes text line by line, using minimal memory regardless of file size:

sed -n '/ERROR/p' huge.log

That’s it! One line, instant results, handles gigabytes effortlessly.

Essential sed One-Liners

Find and Replace

# Replace first occurrence per line
sed 's/old/new/' file.txt

# Replace all occurrences (g flag)
sed 's/old/new/g' file.txt

# Replace in-place (modify the file)
sed -i 's/old/new/g' file.txt

# Create backup before modifying
sed -i.bak 's/old/new/g' file.txt

Delete Lines

# Delete blank lines
sed '/^$/d' file.txt

# Delete lines containing pattern
sed '/DEBUG/d' app.log

# Delete lines NOT containing pattern
sed '/ERROR/!d' app.log

# Delete first line
sed '1d' file.txt

# Delete last line
sed '$d' file.txt

# Delete lines 5-10
sed '5,10d' file.txt

Extract Specific Lines

# Print only lines containing pattern
sed -n '/ERROR/p' log.txt

# Print lines 10-20
sed -n '10,20p' file.txt

# Print every 5th line (GNU sed step addresses)
sed -n '1~5p' file.txt

# Print first 100 lines (sed '100q' quits early instead of scanning the rest)
sed -n '1,100p' file.txt

# Print last 50 lines (tail -n 50 is simpler, but sed can do it)
sed -e ':a' -e '$q;N;51,$D;ba' file.txt

Advanced sed Techniques

Multiple Operations

# Chain operations with -e
sed -e 's/foo/bar/g' -e 's/baz/qux/g' file.txt

# Or use semicolons
sed 's/foo/bar/g; s/baz/qux/g' file.txt
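
When a chain grows past two or three commands, it can be moved into a script file and run with -f. A minimal sketch (cleanup.sed and sample.txt are made-up names for demonstration):

```shell
# Put the chained commands in a script file, one per line
cat > cleanup.sed <<'EOF'
s/foo/bar/g
s/baz/qux/g
/^#/d
EOF

# Apply the whole script in a single pass
printf 'foo and baz\n# a comment line\n' > sample.txt
sed -f cleanup.sed sample.txt
# -> bar and qux   (the comment line is deleted)
```

This keeps long pipelines readable and lets you version-control the sed logic alongside your shell scripts.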

Using Regular Expressions

# Remove all numbers
sed 's/[0-9]//g' file.txt

# Extract email addresses (crude - the greedy .* can swallow the local part;
# grep -oE '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' is more robust)
sed -n 's/.*\([a-zA-Z0-9._%+-]*@[a-zA-Z0-9.-]*\.[a-zA-Z]\{2,\}\).*/\1/p' file.txt

# Replace multiple spaces with single space
sed 's/  */ /g' file.txt

# Remove leading whitespace (\t in brackets is a GNU extension; use a
# literal tab character in POSIX sed)
sed 's/^[ \t]*//' file.txt

# Remove trailing whitespace
sed 's/[ \t]*$//' file.txt

Insert and Append

# Insert line before match
sed '/pattern/i\New line above' file.txt

# Append line after match
sed '/pattern/a\New line below' file.txt

# Insert at line 1
sed '1i\#!/bin/bash' script.sh

# Append at end of file
sed '$a\THE END' file.txt

Real-World Use Cases

Log File Analysis

# Extract all error timestamps
sed -n 's/.*\[\([0-9-]* [0-9:]*\)\].*ERROR.*/\1/p' app.log

# Count errors by hour
sed -n 's/.*\[\([0-9-]* [0-9]*\):[0-9:]*\].*ERROR.*/\1/p' app.log | sort | uniq -c

# Remove sensitive data
sed 's/password=[^&]*/password=REDACTED/g' access.log

Configuration File Editing

# Change database host
sed -i 's/host=localhost/host=prod-db.example.com/' config.ini

# Uncomment a line
sed -i 's/^#\(max_connections\)/\1/' postgresql.conf

# Comment out a line
sed -i 's/^\(debug_mode\)/#\1/' app.conf

# Update port number
sed -i 's/port=[0-9]*/port=8080/' server.conf

CSV/TSV Processing

# Change delimiter from comma to tab
sed 's/,/\t/g' data.csv

# Extract specific column (3rd column)
sed 's/[^,]*,[^,]*,\([^,]*\).*/\1/' data.csv

# Remove header row
sed '1d' data.csv

# Add quotes around fields
sed 's/\([^,]*\)/"\1"/g' data.csv

HTML/XML Manipulation

# Remove all HTML tags
sed 's/<[^>]*>//g' page.html

# Extract URLs from HTML
sed -n 's/.*href="\([^"]*\)".*/\1/p' page.html

# Change XML attribute
sed 's/version="1.0"/version="2.0"/g' data.xml

Performance Comparison

Processing a 1GB log file to extract errors (illustrative figures; your hardware and tool versions will vary):

  • Python script: 45 seconds, 2GB RAM
  • grep: 12 seconds, 4MB RAM
  • sed: 8 seconds, 2MB RAM

Both sed and grep stream the file in near-constant memory, so either handles files far larger than RAM. For pure matching with no transformation, grep is often the fastest option; sed wins when you transform as you go.
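
Figures like these are easy to reproduce on your own data. A minimal timing harness, assuming GNU sed for the 1~100 step address (huge.log here is a generated throwaway file, not a real log):

```shell
# Build a throwaway 100,000-line log where every 100th line is an error
seq 1 100000 | sed '1~100s/^/ERROR /' > huge.log

# Time each approach; write to /dev/null to measure processing only
time grep 'ERROR' huge.log > /dev/null
time sed -n '/ERROR/p' huge.log > /dev/null
```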

Combining sed with Other Tools

With find

# Replace text in all .conf files
find /etc -name "*.conf" -exec sed -i 's/old/new/g' {} \;

# Or with xargs (use -print0/-0 so filenames with spaces survive)
find /etc -name "*.conf" -print0 | xargs -0 sed -i 's/old/new/g'

With pipes

# Clean up ps output
ps aux | sed '1d' | sed 's/  */ /g'

# Extract specific fields from command output
docker ps | sed '1d' | sed 's/\s\+/ /g' | cut -d' ' -f1,2

With curl

# Download and process on the fly
curl -s https://example.com/data.txt | sed 's/foo/bar/g' > processed.txt

Advanced Patterns

Address Ranges

# Delete from pattern to end
sed '/START_DELETE/,$d' file.txt

# Process between patterns
sed -n '/BEGIN/,/END/p' file.txt

# Replace only between line 10 and 20
sed '10,20s/old/new/g' file.txt
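
A related range trick: GNU sed accepts 0,/pattern/ as a range that closes at the first match, which gives you a change-only-the-first-occurrence-in-the-file operation (a GNU extension; POSIX sed rejects the 0 address):

```shell
printf 'old\nold\nold\n' > demo.txt

# The range 0,/old/ ends at the first matching line, so only it is changed
sed '0,/old/s/old/new/' demo.txt
# -> new, then old, old
```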

Backreferences

# Swap first two words
sed 's/\([^ ]*\) \([^ ]*\)/\2 \1/' file.txt

# Add parentheses around numbers
sed 's/\([0-9]\+\)/(\1)/g' file.txt

# Duplicate every line (p prints, then auto-print prints again)
sed 'p' file.txt

Conditional Operations

# Replace only if line contains pattern
sed '/pattern/s/old/new/g' file.txt

# Replace except in lines with pattern
sed '/pattern/!s/old/new/g' file.txt

Common Pitfalls and Solutions

Special Characters

# Awkward - every / in the replacement must be escaped
sed 's/path/\/usr\/local\/bin/' file.txt

# Right - use different delimiter
sed 's|path|/usr/local/bin|' file.txt
sed 's:path:/usr/local/bin:' file.txt

In-Place Editing on macOS

# Linux
sed -i 's/old/new/' file.txt

# macOS/BSD (-i requires an argument; pass '' for no backup)
sed -i '' 's/old/new/' file.txt
sed -i.bak 's/old/new/' file.txt

Greedy vs Non-Greedy

# Greedy (.* matches from the first < to the last >, removing everything)
echo "<b>keep</b>" | sed 's/<.*>//'
# Output: (empty)

# Non-greedy workaround (match anything except >, so each tag ends early)
echo "<b>keep</b>" | sed 's/<[^>]*>//g'
# Output: keep

Debugging sed Commands

# Test without modifying file
sed 's/old/new/g' file.txt | less

# Show line numbers
sed -n '/pattern/=' file.txt

# Show pattern matches with line numbers
sed -n '/pattern/{=;p;}' file.txt
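
When a pattern refuses to match, invisible characters are the usual suspect. The POSIX l command prints the pattern space unambiguously: tabs appear as \t and each line ends with $, so stray trailing whitespace becomes visible:

```shell
# Reveal the tab and trailing spaces that break naive regexes
printf 'name\tvalue  \n' | sed -n 'l'
# -> name\tvalue  $
```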

Pro Tips

  • Test first: Always test without -i before modifying files
  • Use different delimiters: When paths are involved, use | or :
  • Combine with grep: Filter first, then sed for better performance
  • Keep it simple: For complex operations, consider awk or Python
  • Backup important files: Use -i.bak for safety
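
The "combine with grep" tip is worth seeing in action: grep discards non-matching lines cheaply, so sed's capture-group regex only runs on the survivors. A sketch reusing the timestamp pattern from earlier (app.log is generated here purely for demonstration):

```shell
# A two-line sample log
printf '[2024-01-01 10:00:00] ERROR disk full\n[2024-01-01 10:00:01] INFO ok\n' > app.log

# grep filters first; sed extracts the timestamp from ERROR lines only
grep 'ERROR' app.log | sed -n 's/.*\[\([0-9-]* [0-9:]*\)\].*/\1/p'
# -> 2024-01-01 10:00:00
```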

Complete Example: Log Anonymizer

#!/bin/bash

# Anonymize log files for sharing
anonymize_log() {
    local input=$1
    local output=$2
    
    sed -e 's/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/XXX.XXX.XXX.XXX/g' \
        -e 's/password=[^&]*/password=REDACTED/g' \
        -e 's/token=[^&]*/token=REDACTED/g' \
        -e 's/[a-zA-Z0-9._%+-]\+@[a-zA-Z0-9.-]\+\.[a-zA-Z]\{2,\}/user@example.com/g' \
        -e 's/"username":"[^"]*"/"username":"anonymous"/g' \
        "$input" > "$output"
    
    echo "Anonymized log created: $output"
}

anonymize_log production.log safe-to-share.log

When to Use sed vs Other Tools

Use sed when:

  • You need line-by-line text transformation
  • Working with large files
  • Simple find-and-replace operations
  • Deleting or extracting specific lines
  • In-place file editing

Use awk when:

  • Processing columnar data
  • Need arithmetic operations
  • Complex field manipulation

Use grep when:

  • Just searching, not transforming
  • Need regex matching only
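
To make the choice concrete, here is the same job, pulling the status code out of ERROR lines, in all three tools (status.log is a made-up sample):

```shell
printf 'ERROR 500 server\nINFO 200 ok\nERROR 404 missing\n' > status.log

grep 'ERROR' status.log                          # search only: whole lines
sed -n 's/^ERROR \([0-9]*\).*/\1/p' status.log   # transform: extract the code
awk '/ERROR/ {print $2}' status.log              # columnar data: simplest here
```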

Conclusion

sed one-liners are the Swiss Army knife of text processing. They’re fast, memory-efficient, and available on every Unix-like system. Master a handful of sed patterns, and you’ll handle 90% of your text processing needs without reaching for heavier tools.

Start simple, build your sed toolkit gradually, and soon you’ll be processing gigabytes of data with elegant one-liners!
