Text Processing Power Tools
One of the greatest strengths of Linux is its rich collection of text-processing utilities. Combined with pipes, they let you search, filter, transform, and analyze text data quickly, often in a single command line.
grep: Search for Patterns
grep (Global Regular Expression Print) searches for patterns in files or input and prints matching lines.
# Basic search
grep "error" logfile.txt
# Case-insensitive
grep -i "error" logfile.txt
# Recursive search in directories
grep -r "TODO" ./src/
# Invert match (lines that do NOT match)
grep -v "debug" logfile.txt
# Count matching lines (not total occurrences)
grep -c "error" logfile.txt
# Show line numbers
grep -n "error" logfile.txt
# Use extended regex (alternation)
grep -E "error|warning|fatal" logfile.txt
Use grep -E for extended regular expressions, which support +, ?, |, and () without needing to escape them. egrep is an older, deprecated name for the same command.
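One detail worth seeing in action: grep -c reports the number of matching lines, not the number of matches. The sketch below is illustrative only. The log contents are made up, and -o (print each match on its own line) is not in POSIX, although both GNU and BSD grep support it.

```shell
# Hypothetical sample log, written to a temp file so the example is self-contained
log=$(mktemp)
printf '%s\n' \
  'error: disk full, error: retrying' \
  'warning: low memory' \
  'Error: timeout' \
  'debug: ok' > "$log"

# -c counts matching LINES: only line 1 contains lowercase "error"
matching_lines=$(grep -c "error" "$log")

# -o prints each match on its own line, so wc -l counts every occurrence
# (tr strips the padding some wc implementations add)
occurrences=$(grep -o "error" "$log" | wc -l | tr -d ' ')

# Adding -i makes the match case-insensitive, so "Error" on line 3 counts too
matching_lines_ci=$(grep -ci "error" "$log")

# -E allows alternation with | without backslashes
alert_lines=$(grep -cE "error|warning|fatal" "$log")

echo "$matching_lines $occurrences $matching_lines_ci $alert_lines"   # 1 2 2 2
rm -f "$log"
```

If you need the total number of occurrences rather than lines, the grep -o ... | wc -l form is the one to reach for.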
sed: Stream Editor
sed performs text transformations on an input stream. The most common use is find-and-replace.
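To make find-and-replace concrete, here is a minimal sketch that runs sed on a few lines of made-up input instead of a real file. The variable names and sample text are assumptions for illustration. Chaining commands with -e is standard sed behavior.

```shell
# Hypothetical input, fed to sed through a pipe
input='old old old
keep this
drop this line
old again'

# Without the g flag, only the first "old" on each line is replaced
first_only=$(printf '%s\n' "$input" | sed 's/old/new/')

# With g, every "old" on each line is replaced
all=$(printf '%s\n' "$input" | sed 's/old/new/g')

# Multiple commands can be chained with -e:
# delete lines matching "drop", then substitute globally
cleaned=$(printf '%s\n' "$input" | sed -e '/drop/d' -e 's/old/new/g')

# -n suppresses the default output; '2,3p' prints only lines 2 through 3
middle=$(printf '%s\n' "$input" | sed -n '2,3p')

printf '%s\n' "$cleaned"
# new new new
# keep this
# new again
```

Because sed reads a stream, testing a command on piped input like this is a safe way to check it before pointing it at a real file.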
# Substitute first occurrence on each line
sed 's/old/new/' file.txt
# Substitute ALL occurrences on each line (global)
sed 's/old/new/g' file.txt
# Edit the file in place (GNU sed; BSD/macOS sed requires -i '')
sed -i 's/old/new/g' file.txt
# Delete lines matching a pattern
sed '/pattern/d' file.txt
# Delete line 5
sed '5d' file.txt
# Insert text before line 3 (this one-line form is a GNU extension)
sed '3i\New line of text' file.txt
# Print only lines 10-20
sed -n '10,20p' file.txt
awk: Pattern Processing Language
awk is a full-fledged text-processing language. It excels at working with columnar data.
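As a rough illustration of column handling, the sketch below runs awk on a small, made-up table of grades (name, subject, score) and on one passwd-style line. The data and variable names are assumptions, not part of any real file.

```shell
# Hypothetical grades: name, subject, score (whitespace-separated)
grades='alice math 95
bob math 82
carol math 91'

# Condition before the action: only rows whose 3rd column exceeds 90 are printed
passed=$(printf '%s\n' "$grades" | awk '$3 > 90 {print $1, "passed with", $3}')

# Accumulate column 3 across all rows, then report in the END block
total=$(printf '%s\n' "$grades" | awk '{sum += $3} END {print "Total:", sum}')

# NR holds the number of records read, so sum / NR is the mean
average=$(printf '%s\n' "$grades" | awk '{sum += $3} END {if (NR > 0) printf "%.1f\n", sum / NR}')

# -F: switches the field separator to a colon
login_shell=$(printf 'root:x:0:0:root:/root:/bin/bash\n' | awk -F: '{print $1, $7}')

printf '%s\n' "$passed" "$total" "$average" "$login_shell"
# alice passed with 95
# carol passed with 91
# Total: 268
# 89.3
# root /bin/bash
```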
# Print specific columns (whitespace-separated by default)
awk '{print $1, $3}' data.txt
# Custom delimiter
awk -F: '{print $1, $7}' /etc/passwd
# Pattern matching
awk '/error/ {print $0}' logfile.txt
# BEGIN and END blocks
awk 'BEGIN {print "Name\tScore"} {print $1"\t"$2} END {print "Done"}' scores.txt
# Sum a column
awk '{sum += $2} END {print "Total:", sum}' sales.txt
# Conditional processing
awk '$3 > 90 {print $1, "passed with", $3}' grades.txt
Other Essential Text Tools
# cut: extract columns by delimiter or character position
cut -d: -f1 /etc/passwd # first field, colon-delimited
cut -c1-10 file.txt # first 10 characters of each line
# sort: sort lines
sort file.txt # alphabetical
sort -n numbers.txt # numeric sort
sort -r file.txt # reverse sort
sort -t: -k3,3n /etc/passwd # sort by 3rd field only, numeric
# uniq: remove adjacent duplicates (sort first!)
sort file.txt | uniq # unique lines
sort file.txt | uniq -c # count occurrences
sort file.txt | uniq -d # show one copy of each duplicated line
# wc: line, word, and byte counts
wc file.txt # lines, words, bytes
wc -l file.txt # just line count
# tr: translate/delete characters
echo "hello" | tr 'a-z' 'A-Z' # HELLO
echo "hello world" | tr -s ' ' # squeeze spaces
uniq only removes adjacent duplicates. Always pipe through sort first if your data isn't already sorted.
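The adjacency rule is easy to check on a small sample. The sketch below uses a made-up word list in which no two identical lines sit next to each other, then builds a simple frequency table with the tools from this section. The data and variable names are illustrative assumptions.

```shell
# Hypothetical input: duplicates exist, but none are adjacent
words='apple
banana
apple
cherry
banana
apple'

# Without sort, uniq finds no adjacent duplicates and removes nothing
unsorted_count=$(printf '%s\n' "$words" | uniq | wc -l | tr -d ' ')

# After sort, identical lines are adjacent and uniq collapses them
sorted_count=$(printf '%s\n' "$words" | sort | uniq | wc -l | tr -d ' ')

# Frequency table: count each line, sort by count (descending), keep the top entry,
# then use awk to print "word count" without uniq's leading padding
top=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2, $1}')

echo "$unsorted_count $sorted_count $top"   # 6 3 apple 3
```

The sort | uniq -c | sort -rn chain is a common way to find the most frequent lines in a log or word list.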
Summary
You've explored the big three text-processing tools (grep for searching, sed for substitution, and awk for columnar data) plus supporting utilities like cut, sort, uniq, wc, and tr. These tools, combined with pipes, give you immense power over text data.