1. Regular expressions

1. Types of matching characters

[a-z]: Lowercase letters
[A-Z]: Uppercase letters
[a-zA-Z]: Lowercase or uppercase letters ([a-Z] is not a valid range in most locales)
[0-9]: Digits
[a-zA-Z0-9]: A single character that is a letter or a digit
.: Any single character (including a space), except a newline
[0-9a-fA-F]: A hexadecimal digit
abc|def: abc or def
a(bc|de)f: abcf or adef
\<: Start of a word. A word is a continuous run of letters, digits, or underscores, usually delimited by spaces or special characters
\>: End of a word
[^expression]: Any character not in the set. For example, [^a-z] matches every character except lowercase letters
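As a minimal sketch of the character types above, the following runs grep -E (equivalent to egrep) over a few made-up sample lines. The sample text and the variable names are assumptions chosen for illustration only.

```shell
# Use the C locale so ranges such as [a-z] cover exactly the ASCII letters
export LC_ALL=C

# Hypothetical sample input, one entry per line
sample='abc
ABC
a1f
xyz def'

# Lines that contain at least one digit
digits=$(printf '%s\n' "$sample" | grep -E '[0-9]')

# Lines that contain either "abc" or "def"
alt=$(printf '%s\n' "$sample" | grep -E 'abc|def')

# Lines that contain at least one character that is not a lowercase letter
non_lower=$(printf '%s\n' "$sample" | grep -E '[^a-z]')

printf 'digits:\n%s\nalt:\n%s\nnon_lower:\n%s\n' "$digits" "$alt" "$non_lower"
```

Only "abc" is missing from the last result, because every character in it is a lowercase letter.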

2. Symbols that control the number of matches

Each of these symbols must be placed directly after one of the expressions from point 1 above

Expression*: 0 or more repetitions
Expression+: 1 or more repetitions
Expression?: 0 or 1 repetition
Expression{n}: Exactly n repetitions
Expression{n,m}: From n to m repetitions
Expression{n,}: At least n repetitions

[Example] [a-z]* matches 0 or more lowercase letters
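The repetition symbols can be tried out with a short sketch like the one below. It uses grep -E with made-up input strings; the variable names are assumptions for illustration.

```shell
export LC_ALL=C

# -o prints only the matched text, one match per line.
# "a{2,3}" needs 2 to 3 consecutive "a" characters, so the single "a" is skipped.
two_to_three=$(printf 'a aa aaaa\n' | grep -oE 'a{2,3}')

# "u?" means the "u" may appear 0 or 1 times, so both spellings match.
# -c counts the matching lines.
spellings=$(printf 'color\ncolour\ncolr\n' | grep -cE 'colou?r')

# "[0-9]+" matches runs of 1 or more digits
numbers=$(printf 'id=42, port=8080\n' | grep -oE '[0-9]+')

printf '%s\n---\n%s\n---\n%s\n' "$two_to_three" "$spellings" "$numbers"
```

Matching is greedy: from "aaaa" the pattern a{2,3} takes "aaa", and the single "a" left over is too short to match again.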

3. Anchoring the match to the beginning or end of the line

^Expression: Matches at the beginning of the line
Expression$: Matches at the end of the line
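A small sketch of the two anchors, again using grep -E on hypothetical passwd-style lines (the sample data is an assumption, not real system content):

```shell
# Hypothetical sample input
sample='root:x:0:0
chroot:x:1:1
user:/bin/bash'

# "^root" only matches when "root" is at the start of the line,
# so "chroot" is not selected
starts=$(printf '%s\n' "$sample" | grep -E '^root')

# "bash$" only matches when "bash" is at the end of the line
ends=$(printf '%s\n' "$sample" | grep -E 'bash$')

printf '%s\n%s\n' "$starts" "$ends"
```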

2. Three major Linux text processing tools

1. egrep filtering tool

An extended version of grep that supports extended regular expressions (equivalent to grep -E)

Syntax:

egrep -option 'regular expression' filename

Options:

-n: Display line numbers
-o: Display only the matching part of each line
-q: Quiet mode with no output. Check $? to see whether the command succeeded, that is, whether the desired content was found
-l: If a file contains a match, print only its file name; otherwise print nothing. Usually combined with -r, for example grep -rl 'root' /etc
-A n: Print each matching line together with the n lines after it
-B n: Print each matching line together with the n lines before it
-C n: Print each matching line together with the n lines before and after it
--color: Highlight the matching text
-c: Print the number of matching lines
-i: Ignore case
-v: Invert the match and print the lines that do not match
-w: Match whole words only
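The options above can be compared side by side with a sketch like the following. It writes a few made-up lines to a temporary file and uses grep -E, which is equivalent to egrep; the file contents and variable names are assumptions for illustration.

```shell
# Hypothetical sample file
tmp=$(mktemp)
printf 'root:x:0\nRoot user\nbin:x:1\nchroot:x:2\n' > "$tmp"

with_numbers=$(grep -En 'root' "$tmp")     # matching lines prefixed with line numbers
count=$(grep -Ec 'root' "$tmp")            # number of matching lines
count_i=$(grep -Eci 'root' "$tmp")         # same, ignoring case
whole_word=$(grep -Ew 'root' "$tmp")       # "root" as a whole word, so not "chroot"
inverted=$(grep -Ev 'root' "$tmp")         # lines that do not match
only_match=$(grep -Eo '[0-9]+' "$tmp")     # only the matched text
after=$(grep -E -A 1 '^Root' "$tmp")       # matching line plus 1 line after it

# -q prints nothing; the exit status tells whether a match was found
if grep -Eq 'bin' "$tmp"; then found=yes; else found=no; fi

rm -f "$tmp"
printf '%s\n' "$with_numbers" "$count" "$count_i" "$whole_word" "$found"
```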

2. sed stream editor

Syntax:

Syntax 1: sed -option 'positioning + command' filename

Options:

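-n: Quiet mode. Suppresses the automatic printing of every line, so only lines printed explicitly (for example with p) are output
-e: Adds an editing command. Repeat -e to apply several edits in one invocation
-i: Modifies the file in place instead of writing the result to standard output
-r: Extended mode. Enables extended regular expressions
-f: Reads the editing commands from the specified script file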

Positioning:

① Numeric positioning (input line number positioning)

1: A single line (line 1)
1,3: The range from line 1 to line 3
2,+4: Line 2 and the 4 lines after it
4,~3: From line 4 to the next line whose number is a multiple of 3
2~3: Every third line, starting from line 2
$: The last line
1!: All lines except line 1

[Example] sed -n '1p' /etc/passwd prints only the first line of /etc/passwd
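The numeric positioning forms can be checked against numbered input, as in the sketch below. It assumes GNU sed, since the forms with + and ~ are GNU extensions, and it uses seq to generate six sample lines; the variable names are illustrative.

```shell
# Six input lines containing the numbers 1 to 6
lines=$(seq 1 6)

range=$(printf '%s\n' "$lines" | sed -n '1,3p')         # lines 1 to 3
plus=$(printf '%s\n' "$lines" | sed -n '2,+2p')         # line 2 and the 2 lines after it
to_multiple=$(printf '%s\n' "$lines" | sed -n '4,~3p')  # line 4 up to line 6 (next multiple of 3)
step=$(printf '%s\n' "$lines" | sed -n '2~3p')          # lines 2 and 5
last=$(printf '%s\n' "$lines" | sed -n '$p')            # last line
not_first=$(printf '%s\n' "$lines" | sed -n '1!p')      # every line except line 1

printf '%s\n' "$step"
```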

② Regular expression positioning

Regular expressions must be wrapped in slashes: /regex/
Extended regular expressions require the -r option; without it, the extended metacharacters must be escaped
Substitutions can use sub-patterns: each part of the regular expression wrapped in parentheses () is a sub-pattern, and \1 and \2 in the replacement refer to the first and second sub-patterns

[Example] sed -r 's/(.)(.)/\2\1/' file1 swaps the first two characters of each line

*Global flag: append g to replace every match in a line instead of only the first one
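A brief sketch of regular expression positioning, sub-patterns, and the g flag follows. The input strings and variable names are made up for illustration, and -r assumes GNU sed.

```shell
# Sub-patterns: swap the first two characters of the line
swapped=$(printf 'ab cd\n' | sed -r 's/(.)(.)/\2\1/')

# Without g only the first match in the line is replaced
first_only=$(printf 'a-b-c\n' | sed 's/-/+/')

# With g every match in the line is replaced
all=$(printf 'a-b-c\n' | sed 's/-/+/g')

# Regular expression positioning: substitute only on lines that start with "#"
addressed=$(printf '#x=1\ny=1\n' | sed '/^#/s/1/2/')

printf '%s\n' "$swapped" "$first_only" "$all" "$addressed"
```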

Command:

a: Append. Adds the given text on a new line after the current line
c: Change. Replaces the selected lines with the given text
d: Delete. Removes the selected lines
i: Insert. Adds the given text on a new line before the current line
p: Print. Prints the selected lines
s: Substitute. Replaces text directly and is usually paired with a regular expression, for example 1,20s/old/new/g

*Special notes on the s command:

Use {command1; command2; command3} to apply multiple commands to the same lines

s command syntax: sed -r 's/regular expression/replacement/g' filename (the trailing g is the optional global flag)
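The editing commands can be compared on a three-line input, as sketched below. The one-line forms of a, i, and c shown here assume GNU sed; the sample text and variable names are illustrative.

```shell
# Hypothetical three-line input
text='one
two
three'

appended=$(printf '%s\n' "$text" | sed '1a appended')   # new line after line 1
inserted=$(printf '%s\n' "$text" | sed '1i inserted')   # new line before line 1
changed=$(printf '%s\n' "$text" | sed '2c changed')     # line 2 replaced
deleted=$(printf '%s\n' "$text" | sed '2d')             # line 2 removed

# Several commands on the same line, grouped with { } and separated by ";"
grouped=$(printf '%s\n' "$text" | sed -n '2{s/two/TWO/;p;}')

# Several independent edits, each given with its own -e
multi=$(printf '%s\n' "$text" | sed -e 's/one/1/' -e 's/two/2/')

printf '%s\n' "$grouped"
```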

3. awk text analysis tool

An awk program is composed of commands, regular expressions (which must be wrapped in //), and comparison and relational operations

Use the -F option to define the field separator

$1, $2, $3, and so on refer, in order, to the fields of each line as split by the separator. The NF variable holds the number of fields in the current record.

Syntax

awk -option 'condition {command variable1, variable2, variable3}' filename

Option

-F Defines the field separator; the default separator is a run of consecutive spaces or tabs
-v Defines a variable and assigns it a value. This is also one way to pass in shell variables.
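As a minimal sketch of -F, -v, and field references, the following splits one made-up passwd-style line. The sample line and the variable names (n, passwd_line) are assumptions for illustration.

```shell
# Hypothetical passwd-style record
passwd_line='root:x:0:0:root:/root:/bin/bash'

# -F: splits on ":"; $1 is the first field and $NF is the last one
name_and_shell=$(printf '%s\n' "$passwd_line" | awk -F: '{print $1, $NF}')

# NF is the number of fields in the current record
field_count=$(printf '%s\n' "$passwd_line" | awk -F: '{print NF}')

# -v passes a value into awk; here it selects which field to print
third=$(printf '%s\n' "$passwd_line" | awk -F: -v n=3 '{print $n}')

printf '%s\n%s\n%s\n' "$name_and_shell" "$field_count" "$third"
```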

AWK variables

NR The number of records read so far (counted across all input files)
FNR The number of records read so far in the current file only
FS The field separator; defaults to consecutive spaces or tabs. Several different characters can be used as separators, for example -F'[:/]'
OFS The output field separator; defaults to a single space

[OFS example]

# awk -F: 'BEGIN {OFS="====="} {print $1,$2}' /etc/passwd
root=====x

NF The number of fields in the current record
ORS The output record separator; defaults to a newline

[ORS example]

# awk -F: 'BEGIN {ORS="====="} {print $1,$2}' /etc/passwd
root x=====bin x=====

FILENAME Current file name

[Example 1] Using AWK variables
# awk '{print NR,FNR,$1}' file1 file2
1 1 aaaaa
2 2 bbbbb
3 3 ccccc
4 1 dddddd
5 2 eeeeee
6 3 ffffff

[Example 2] How to reference shell variables

# a=root
# awk -v var="$a" -F: '$1 == var {print $0}' /etc/passwd
Alternatively, close the single-quoted awk program around the shell variable so that the shell expands it:
# awk -F: '$1 == "'"$a"'" {print $0}' /etc/passwd
# a=NF
# awk -F: '{print $'"$a"'}' /etc/passwd

Logical operations (fields can be referenced directly in operations)

= += -= /= *=: Assignment
&& || !: Logical AND, logical OR, logical NOT
~ !~: Matches or does not match a regular expression; the regular expression must be wrapped in slashes, as in /regex/
< <= > >= != ==: Relational comparison; when comparing against a string, the string must be enclosed in double quotes
$: A field reference must be prefixed with $, while a variable is referenced by its name alone
+ - * / % ++ --: Arithmetic operators
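The operators above can be combined as in the following sketch. The data is a made-up passwd-style list of name, password placeholder, and numeric ID; the variable names are assumptions for illustration.

```shell
# Hypothetical records: name:x:id
data='root:x:0
bin:x:1
daemon:x:2
user:x:1000'

# Relational comparison on a numeric field
system=$(printf '%s\n' "$data" | awk -F: '$3 < 1000 {print $1}')

# Regular expression match combined with a logical AND
filtered=$(printf '%s\n' "$data" | awk -F: '$1 ~ /^(root|bin)$/ && $3 >= 1 {print $1}')

# String comparison: the literal must be in double quotes
root_id=$(printf '%s\n' "$data" | awk -F: '$1 == "root" {print $3}')

# Compound assignment: add up the third field across all records
total=$(printf '%s\n' "$data" | awk -F: '{sum += $3} END {print sum}')

printf '%s\n' "$system" "$filtered" "$root_id" "$total"
```

Note that sum is referenced by its bare name, while fields always carry the $ prefix.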