How to use the awk command in Linux? (Pattern Scanning Examples)
awk is a powerful text-processing tool in Linux that scans lines for patterns and performs actions when matches occur. It’s especially useful for extracting, transforming, and summarizing structured data like logs, CSVs, or command output.
Basic syntax and field addressing
awk reads input line by line and splits each line into fields (by default, separated by whitespace). Fields are referenced as $1, $2, etc., and the whole line is $0.
- Print the first field of every line: awk '{print $1}' file.txt
- Print lines where the third field equals "ERROR": awk '$3 == "ERROR" {print}' log.txt
- Print lines with more than 5 fields (for comma-separated data, set the field separator with -F): awk -F',' 'NF > 5' data.csv
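The field-addressing rules above can be tried on a throwaway file. The following is a minimal sketch, assuming a hypothetical two-line log whose third field is a severity level; the sample data and variable names are illustrative only.

```shell
# Hypothetical sample data; the contents are illustrative only.
tmp=$(mktemp)
printf '%s\n' \
  '2024-01-01 10:00:01 ERROR disk full' \
  '2024-01-01 10:00:02 INFO started' > "$tmp"

# $1 is the first whitespace-separated field of each line.
first_fields=$(awk '{print $1}' "$tmp")

# A pattern with no action prints every matching line in full.
error_lines=$(awk '$3 == "ERROR"' "$tmp")

# NF holds the number of fields on the current line.
field_counts=$(awk '{print NF}' "$tmp")

# -F changes the field separator, which comma-separated input needs.
csv_second=$(printf 'a,b,c\n' | awk -F',' '{print $2}')

printf '%s\n' "$first_fields" "$error_lines" "$field_counts" "$csv_second"
rm -f "$tmp"
```

Running each command on a few known lines like this makes it easy to confirm which field number holds the value you expect before pointing awk at a real file.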
Matching patterns with regular expressions
You can use slashes (/.../) to define regex patterns. A bare /regex/ is matched against the entire line ($0); to test a single field instead, use the match operator, as in $3 ~ /regex/.
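- Find lines containing an IPv4-like address (four dot-separated groups of 1 to 3 digits; the pattern does not check that each group is at most 255, and some older awk implementations do not support {1,3} intervals): awk '/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/' access.log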
- Match lines starting with "WARN" and print the timestamp (field 4): awk '/^WARN/ {print $4}' app.log
- Case-insensitive match for "fail": awk 'tolower($0) ~ /fail/' errors.log
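To see the difference between whole-line and per-field matching, here is a small sketch. It assumes a hypothetical log format in which field 1 is the level, field 2 the component, and field 4 the timestamp; none of this comes from a real application.

```shell
# Hypothetical log lines: level, component, topic, timestamp, message.
log=$(mktemp)
printf '%s\n' \
  'WARN db pool 12:00:01 slow query' \
  'INFO auth login 12:00:02 Failed password' \
  'ERROR db conn 12:00:03 timeout' > "$log"

# Bare /regex/ tests the whole line ($0).
warn_times=$(awk '/^WARN/ {print $4}' "$log")

# Lowercasing $0 first gives a portable case-insensitive match.
fail_lines=$(awk 'tolower($0) ~ /fail/' "$log")

# The ~ operator restricts the match to one field.
db_levels=$(awk '$2 ~ /^db$/ {print $1}' "$log")

# !~ selects lines where the field does NOT match.
other_levels=$(awk '$1 !~ /^(WARN|ERROR)$/ {print $1}' "$log")

printf '%s\n' "$warn_times" "$fail_lines" "$db_levels" "$other_levels"
rm -f "$log"
```

Anchoring with ^ and $ inside a field match, as in $2 ~ /^db$/, avoids accidental hits on longer values that merely contain the text.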
Using BEGIN and END blocks for setup and summary
BEGIN runs before any input is read, and END runs after all input is processed, which makes them ideal for headers, counters, or totals.
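- Print a header, then count how often each first-field value occurs (assuming the first field holds the username; the output order of for-in is unspecified): awk 'BEGIN {print "USER\tCOUNT"} {users[$1]++} END {for (u in users) print u "\t" users[u]}' auth.log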
- Sum the second column and show the average (the n > 0 check avoids a division-by-zero error on empty input): awk '{sum += $2; n++} END {if (n > 0) print "Total:", sum, "Avg:", sum/n}' sales.txt
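The BEGIN, main, and END phases can be seen together in one short run. This is a sketch under assumed data: a hypothetical two-column file of names and amounts, with made-up variable names.

```shell
# Hypothetical sales data: name and amount per line.
sales=$(mktemp)
printf '%s\n' 'alice 10' 'bob 20' 'alice 30' > "$sales"

# BEGIN prints a header once, the main block runs per line,
# and END prints the summary after the last line.
report=$(awk '
  BEGIN { print "NAME AMOUNT" }
  { print $1, $2; sum += $2; n++ }
  END { if (n > 0) printf "Total: %d Avg: %.1f\n", sum, sum / n }
' "$sales")

# Associative arrays accumulate per-key counts; for-in order is
# unspecified, so the result is piped through sort for stable output.
counts=$(awk '
  { seen[$1]++ }
  END { for (k in seen) print k "\t" seen[k] }
' "$sales" | sort)

# With no input lines, n stays unset and the guard skips the division.
empty=$(awk '{ n++ } END { if (n > 0) print sum / n; else print "no data" }' /dev/null)

printf '%s\n' "$report" "$counts" "$empty"
rm -f "$sales"
```

Using printf in the END block gives control over number formatting, which plain print leaves to awk's default output format.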
Practical log analysis example
Suppose you have an Apache-style log and want to list the top 5 client IPs by request count:
- Extract IPs (first field), count occurrences, sort descending, limit to 5: awk '{print $1}' access.log | sort | uniq -c | sort -nr | head -5
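- Do the counting in awk so that sort only sees one line per unique IP (usually faster on large logs; sort and head are still needed for ranking): awk '{ips[$1]++} END {for (ip in ips) print ips[ip], ip}' access.log | sort -nr | head -5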
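Both pipelines can be checked against each other on a small sample. The sketch below assumes a hypothetical six-line access log with made-up addresses and limits the output to the top 2 so the result is easy to verify by eye.

```shell
# Hypothetical Apache-style access log; addresses are illustrative only.
access=$(mktemp)
printf '%s\n' \
  '10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 512' \
  '10.0.0.2 - - [01/Jan/2024:10:00:01 +0000] "GET /a HTTP/1.1" 200 128' \
  '10.0.0.1 - - [01/Jan/2024:10:00:02 +0000] "GET /b HTTP/1.1" 404 64' \
  '10.0.0.3 - - [01/Jan/2024:10:00:03 +0000] "GET / HTTP/1.1" 200 512' \
  '10.0.0.1 - - [01/Jan/2024:10:00:04 +0000] "GET /c HTTP/1.1" 200 256' \
  '10.0.0.2 - - [01/Jan/2024:10:00:05 +0000] "GET / HTTP/1.1" 200 512' > "$access"

# Variant 1: extract the IP, then let sort and uniq do the counting.
# uniq -c pads its counts with spaces, so awk normalizes the layout.
top_uniq=$(awk '{print $1}' "$access" | sort | uniq -c | sort -nr |
  head -2 | awk '{print $1, $2}')

# Variant 2: count inside awk, so sort only ranks the unique IPs.
top_awk=$(awk '{ips[$1]++} END {for (ip in ips) print ips[ip], ip}' "$access" |
  sort -nr | head -2)

printf '%s\n' "$top_uniq" "$top_awk"
rm -f "$access"
```

On this sample both variants report 10.0.0.1 with 3 requests and 10.0.0.2 with 2. When several IPs share the same count, their relative order depends on how sort breaks ties, so the two variants may list tied entries differently.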
At its core, an awk program is just pattern plus action. Keep it simple, test on small samples, and remember that awk shines when you combine filtering, field extraction, and lightweight computation in one pass.