In this section, we take a look at text processing with awk and sed. They are useful for advanced command-line users who want to apply regular expressions in Linux to tasks such as transforming and analyzing data efficiently in logs, reports, and more.
Summary
AWK and SED are powerful Linux tools for text manipulation. This chapter covers advanced AWK arrays, SED regex, combining both tools, and real-world applications, empowering beginners to process text like pros.
Learning Objectives: Learn advanced AWK and SED techniques, combine them for complex tasks, and apply best practices for efficient text processing.
Why Use Text Processing with AWK and SED?
AWK and SED streamline data extraction, transformation, and analysis, essential for log parsing, data cleaning, and automation in Linux workflows.
Review of Basic AWK and SED
- AWK: Processes structured data:
  $ awk '{print $1}' file.txt
- SED: Edits text streams:
  $ sed 's/old/new/g' file.txt
Advanced AWK Topics
Using AWK Arrays
Count occurrences in a file:
$ awk '{count[$1]++} END {for (i in count) print i, count[i]}' file.txt
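To see the array idiom in action, here is a minimal sketch against a small sample file (the file path and its contents are illustrative):

```shell
# Create a small sample file (illustrative path).
printf 'alpha\nbeta\nalpha\nalpha\nbeta\n' > /tmp/words.txt

# Tally the first field of each line into the associative array
# 'count', then print each key with its total. Piping through sort
# gives a stable order, since awk's for-in order is unspecified.
awk '{count[$1]++} END {for (i in count) print i, count[i]}' /tmp/words.txt | sort
# alpha 3
# beta 2
```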
Working with Multiple Files
Compare two files:
$ awk 'NR==FNR{a[$1];next} $1 in a' file1.txt file2.txt
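The NR==FNR test is true only while awk reads the first file (FNR resets per file, NR does not), so the first block stores keys and skips ahead, and the second block filters the second file. A small illustrative run (file names and contents are made up):

```shell
# IDs we care about, and a second file to filter against them.
printf '101\n102\n103\n' > /tmp/file1.txt
printf '102 beta\n104 delta\n101 alpha\n' > /tmp/file2.txt

# While reading file1 (NR==FNR), store $1 as an array key and skip.
# For file2, '$1 in a' prints only lines whose first field was seen.
awk 'NR==FNR{a[$1];next} $1 in a' /tmp/file1.txt /tmp/file2.txt
# 102 beta
# 101 alpha
```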
Advanced SED Topics
SED Regular Expressions
Replace digits with asterisks:
$ sed 's/[0-9]/*/g' file.txt
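For example, masking every digit in a line (the sample text is illustrative):

```shell
# The character class [0-9] matches one digit; the g flag
# replaces every match on the line, not just the first.
printf 'order 42 shipped, item 7\n' | sed 's/[0-9]/*/g'
# order ** shipped, item *
```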
Working with Multiple Lines
Delete blank lines:
$ sed '/^$/d' file.txt
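The address /^$/ matches lines where start-of-line is immediately followed by end-of-line, and d deletes them. A quick illustrative check:

```shell
# Blank lines (including consecutive ones) are removed;
# non-empty lines pass through untouched.
printf 'first\n\nsecond\n\n\nthird\n' | sed '/^$/d'
# first
# second
# third
```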
Using AWK and SED Together
Extract and format data:
$ awk '{print $1,$2}' file.txt | sed 's/ /,/g' > output.csv
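A small end-to-end sketch of this pipeline, with an illustrative input file:

```shell
# Sample whitespace-separated input (illustrative data).
printf 'ada lovelace 1815\nalan turing 1912\n' > /tmp/people.txt

# awk keeps the first two fields; sed turns the remaining
# space separator into a comma for CSV output.
awk '{print $1,$2}' /tmp/people.txt | sed 's/ /,/g' > /tmp/output.csv
cat /tmp/output.csv
# ada,lovelace
# alan,turing
```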
Real-World Examples
Data Processing
Convert log to CSV:
$ awk '{print $1","$3}' access.log | sed 's/ /,/g' > report.csv
Log File Analysis
Count unique IPs in logs:
$ awk '{print $1}' access.log | sort | uniq -c | awk '{print $2,$1}'
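Here is the same pipeline run on a tiny fake access log (the log contents are made up; real access logs put the client IP in the first field):

```shell
# Minimal fake access log: IP is the first field.
printf '10.0.0.1 GET /\n10.0.0.2 GET /a\n10.0.0.1 GET /b\n' > /tmp/access.log

# Extract the IP column, count duplicates with sort | uniq -c,
# then swap the columns so each line reads "IP count".
awk '{print $1}' /tmp/access.log | sort | uniq -c | awk '{print $2,$1}'
# 10.0.0.1 2
# 10.0.0.2 1
```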
Best Practices and Tips
- Use single quotes to avoid shell interpolation.
- Test commands on sample data first.
- Optimize AWK with `next` to skip unnecessary processing.
- Comment complex SED scripts: `# Replace digits.`
Practical Examples
Filter and format logs:
$ awk '$2=="ERROR" {print $1,$3}' error.log | sed 's/ /: /g'
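For instance, with an illustrative three-column log (timestamp, level, message):

```shell
# Fake log: timestamp, level, message (illustrative data).
printf '12:00 ERROR disk\n12:01 INFO ok\n12:02 ERROR net\n' > /tmp/error.log

# awk keeps only ERROR lines (timestamp and message);
# sed then rewrites the space separator as ": ".
awk '$2=="ERROR" {print $1,$3}' /tmp/error.log | sed 's/ /: /g'
# 12:00: disk
# 12:02: net
```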
Extract specific columns:
$ awk -F',' '{print $2}' data.csv | sed 's/^ *//g'
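The sed step matters when the CSV uses ", " separators, since awk's second field then starts with a space. An illustrative run:

```shell
# CSV with a space after each comma (illustrative data).
printf 'id, name\n1, alice\n2, bob\n' > /tmp/data.csv

# -F',' splits on commas; print the second field, then strip
# the leading space left over from the ", " separator.
awk -F',' '{print $2}' /tmp/data.csv | sed 's/^ *//g'
# name
# alice
# bob
```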
Practice Time!
Test your skills:
1. Count word frequencies with AWK.
2. Replace text patterns with SED.
3. Combine AWK and SED to process a log file.
4. Analyze a CSV file for specific data.
Try This: Run `awk '{print $1}' file.txt | sed 's/a/b/g'` and share your success on X with #LinuxCommandLine!
AWK and SED Command Reference
| Command | Description |
|---|---|
| `awk '{print $1}'` | Prints first column. |
| `sed 's/old/new/g'` | Replaces text globally. |
| `awk -F','` | Sets field separator. |
| `sed '/^$/d'` | Deletes empty lines. |
Practice Time!
Test your skills:
- Find email addresses with `grep`.
- Replace “foo” with “bar” using `sed`.
- Print second CSV column with `awk`.
- Search recursively with `ripgrep`.

Try This: Run `grep -E "error|warning" logfile.txt` and share results on X with #LinuxCommandLine!
Glossary of Commands, Tools, and Shortcuts
Reference: For detailed documentation, visit Linux Manpages. For package installation, search on Debian APT.
| Command/Tool | Description |
|---|---|
| grep | Searches text for regex patterns. |
| egrep | Extended grep for advanced regex. |
| sed | Stream editor for text transformations. |
| awk | Processes structured text with regex. |
| ripgrep | Faster alternative to grep. |
| sd | Modern search-and-replace tool. |
| jq | Processes JSON data. |
That’s it for this chapter on regular expressions in Linux! You’ve now learned how to use regular expressions with grep, sed, and awk to search, filter, and transform text. In the next chapter, we’ll dive into text processing, using tools like cut, sort, uniq, and wc to manipulate text files. Until then, practice using regex to become more comfortable with its powerful capabilities.
Previous: Chapter 13 | Next: Chapter 15