Regular Expressions and Three Linux Text Processing Tools
I. Regular Expressions
1. Types of matching characters
- [a-z]: a lowercase letter
- [A-Z]: an uppercase letter
- [a-zA-Z]: a lowercase or uppercase letter
- [0-9]: a digit
- [a-zA-Z0-9]: a letter or a digit
- .: any single character, including a space
- [0-9a-fA-F]: a hexadecimal digit
- abc|def: abc or def
- a(bc|de)f: abcf or adef
- \<: start of a word. A word is a run of consecutive letters, digits, or underscores, usually delimited by spaces or special characters.
- \>: end of a word
- [^expression]: any character not in the set. For example, [^a-z] matches every character except lowercase letters.
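The character types above can be tried out with a short sketch like the following. The sample text and variable names are made up for illustration, and GNU grep is assumed because \< and \> are not part of POSIX.

```shell
# Hypothetical sample input, one item per line.
sample=$(printf '%s\n' 'abc' 'ABC' 'a1f' 'xyz 42' 'adef' 'cat concat')

# [0-9]: lines that contain at least one digit.
digits=$(printf '%s\n' "$sample" | grep -E '[0-9]')

# a(bc|de)f: lines that contain abcf or adef.
alt=$(printf '%s\n' "$sample" | grep -E 'a(bc|de)f')

# \<cat\>: "cat" only as a whole word; -o prints just the matched text.
words=$(printf '%s\n' "$sample" | grep -oE '\<cat\>')

# [^a-z]: the run of characters that are not lowercase letters.
nonlower=$(printf 'xyz42\n' | grep -oE '[^a-z]+')

printf '%s\n' "$digits" "$alt" "$words" "$nonlower"
```

Note that "concat" contributes nothing to the word match, because its "cat" does not start at a word boundary.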
2. Use the following symbols to control how many times an expression matches.
The expression being counted is written immediately to the left of the symbol.
- expression*: 0 or more occurrences
- expression+: 1 or more occurrences
- expression?: 0 or 1 occurrence
- expression{n}: exactly n occurrences
- expression{n,m}: n to m occurrences
- expression{n,}: at least n occurrences
[Example] [a-z]* matches 0 or more lowercase letters.
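As a rough illustration of the quantifiers, the sketch below runs each one against the same made-up string. It assumes GNU grep; the helper join_lines is hypothetical and only joins the matches onto one line for display.

```shell
# Hypothetical test string: "a" followed by zero to four "b" characters.
text='a ab abb abbb abbbb'

# Join the one-match-per-line output of grep -o into a single line.
join_lines() { paste -sd' ' -; }

star=$(printf '%s\n' "$text" | grep -oE 'ab*' | join_lines)               # b zero or more times
plus=$(printf '%s\n' "$text" | grep -oE 'ab+' | join_lines)               # b one or more times
optional=$(printf '%s\n' "$text" | grep -oE '\<ab?\>' | join_lines)       # b zero or one time
exact=$(printf '%s\n' "$text" | grep -oE '\<ab{2}\>' | join_lines)        # b exactly twice
range=$(printf '%s\n' "$text" | grep -oE '\<ab{2,3}\>' | join_lines)      # b two or three times
atleast=$(printf '%s\n' "$text" | grep -oE '\<ab{3,}\>' | join_lines)     # b at least three times

printf '%s\n' "$star" "$plus" "$optional" "$exact" "$range" "$atleast"
```

The word anchors \< and \> are added to the counted forms so that, for example, ab{2} does not also match the first three characters of abbb.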
3. Anchor the match to the beginning or end of a line
- ^expression: the expression must match at the start of the line
- expression$: the expression must match at the end of the line
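A minimal sketch of the two anchors, using three made-up lines that all contain "root" in different positions:

```shell
# Hypothetical sample lines.
lines=$(printf '%s\n' 'root:x:0' 'chroot tool' 'admin root')

# ^root: only the line that starts with "root".
starts=$(printf '%s\n' "$lines" | grep -E '^root')

# root$: only the line that ends with "root".
ends=$(printf '%s\n' "$lines" | grep -E 'root$')

# No anchor: every line containing "root" anywhere (-c counts them).
anywhere=$(printf '%s\n' "$lines" | grep -cE 'root')

printf '%s\n' "$starts" "$ends" "$anywhere"
```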
II. Three Linux Text Processing Tools
1. egrep filtering tool
egrep is grep with extended regular expressions enabled; it is equivalent to grep -E.
Syntax:
egrep [options] 'regular expression' file
Options:
- -n: displays the line number of each match.
- -o: displays only the matching text.
- -q: quiet mode, no output. Check $? to determine whether the command succeeded, that is, whether the desired content was found.
- -l: prints only the names of files that contain a match; files without a match are not listed. Usually combined with -r, for example: grep -rl 'root' /etc
- -A n: prints each matching line together with the n lines after it.
- -B n: prints each matching line together with the n lines before it.
- -C n: prints each matching line together with the n lines before and after it.
- --color: highlights the matched text.
- -c: prints the number of matching lines.
- -i: ignores case.
- -v: inverts the match and prints the lines that do not match.
- -w: matches whole words only.
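The options above can be exercised with a sketch like this one. It uses grep -E, the equivalent spelling of egrep, and a made-up sample file in a temporary directory; the file name users.txt and its contents are assumptions for illustration only.

```shell
# Hypothetical sample file in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' 'root:x:0:0' 'bin:x:1:1' 'Root user' 'chroot jail' > "$tmp/users.txt"

numbered=$(grep -nE '^root' "$tmp/users.txt")                   # -n: prefix the line number
only=$(grep -oE '[0-9]+' "$tmp/users.txt" | paste -sd' ' -)     # -o: matched text only
count=$(grep -ciE 'root' "$tmp/users.txt")                      # -c with -i: count, ignoring case
inverted=$(grep -vE 'root' "$tmp/users.txt" | paste -sd'|' -)   # -v: lines without a match
wordonly=$(grep -wE 'root' "$tmp/users.txt")                    # -w: whole word only
after=$(grep -A 1 -E '^bin' "$tmp/users.txt" | paste -sd'|' -)  # -A 1: match plus next line
files=$(grep -rlE 'jail' "$tmp")                                # -rl: names of matching files

# -q: no output, the exit status alone says whether a match exists.
if grep -qE '^bin' "$tmp/users.txt"; then found=yes; else found=no; fi

printf '%s\n' "$numbered" "$only" "$count" "$inverted" "$wordonly" "$after" "$found"
expected_file="$tmp/users.txt"
rm -rf "$tmp"
```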
2. sed stream editor
Syntax:
sed [options] 'address command' file
Options:
- -n: quiet mode; suppresses the automatic printing of every line, so only lines printed explicitly (for example with p) appear.
- -e: adds an editing command; repeat -e to apply several edits in one run.
- -i: modifies the file in place instead of writing the result to standard output.
- -r: extended mode; enables extended regular expressions.
- -f: reads the editing commands from the named script file.
Addressing:
① Numeric addressing (select lines by line number)
- 1: a single line
- 1,3: the range from the first line to the third line
- 2,+4: line 2 and the 4 lines that follow it
- 4,~3: from line 4 through the next line whose number is a multiple of 3
- 2~3: every third line, starting from the second line
- $: the last line
- 1!: every line except the first
[Example] sed -n '1p' /etc/passwd
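To make the numeric addresses concrete, the sketch below applies each form to the numbers 1 to 10, so the printed values are the selected line numbers. It assumes GNU sed, since the +N, ~N, and first~step forms are GNU extensions; join_lines is a hypothetical helper used only for display.

```shell
# Ten input lines whose content equals their line number.
nums=$(seq 1 10)

# Join sed's output onto one line.
join_lines() { paste -sd' ' -; }

single=$(printf '%s\n' "$nums" | sed -n '1p' | join_lines)      # line 1
range=$(printf '%s\n' "$nums" | sed -n '1,3p' | join_lines)     # lines 1 to 3
plus=$(printf '%s\n' "$nums" | sed -n '2,+4p' | join_lines)     # line 2 and the next 4
multiple=$(printf '%s\n' "$nums" | sed -n '4,~3p' | join_lines) # line 4 up to a multiple of 3
step=$(printf '%s\n' "$nums" | sed -n '2~3p' | join_lines)      # every 3rd line from line 2
last=$(printf '%s\n' "$nums" | sed -n '$p' | join_lines)        # last line
notfirst=$(printf '%s\n' "$nums" | sed -n '1!p' | join_lines)   # all but line 1

printf '%s\n' "$single" "$range" "$plus" "$multiple" "$step" "$last" "$notfirst"
```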
② Regular expression addressing
- The regular expression must be wrapped in slashes: /regex/
- To use extended regular expression syntax, pass the -r option or escape the metacharacters.
- A substitution can use subpatterns, that is, groups in parentheses (). \1 and \2 refer to the first and second subpattern.
[Example] sed -r 's/(.)(.)/\2\1/' file1 swaps the first and second characters of each line.
* Global flag: append g to replace every match on a line rather than only the first.
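The points above can be sketched as follows, with made-up input strings. GNU sed is assumed for the -r option.

```shell
# /regex/ as an address: print only the lines that match.
matched=$(printf '%s\n' 'root:x:0' 'bin:x:1' | sed -n '/^root/p')

# Subpatterns with -r: swap the first two characters of the line.
swapped=$(printf 'ab cd\n' | sed -r 's/(.)(.)/\2\1/')

# The same substitution without -r needs escaped parentheses.
escaped=$(printf 'ab cd\n' | sed 's/\(.\)\(.\)/\2\1/')

# Without g only the first match on the line is replaced; with g, all of them.
first_only=$(printf 'old old old\n' | sed 's/old/new/')
global=$(printf 'old old old\n' | sed 's/old/new/g')

printf '%s\n' "$matched" "$swapped" "$escaped" "$first_only" "$global"
```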
Commands:
- a: append; adds text as a new line after the current line
- c: change; replaces the current line with the given text
- d: delete
- i: insert. i can be followed by text, which is added as a new line before the current line.
- p: print
- s: substitute. It is usually combined with a regular expression, for example 1,20s/old/new/g
* Special notes on the s command:
Use {command1; command2; command3} to apply multiple commands to the same address.
s command syntax: sed -r 'address s/regular expression/replacement/g' file
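A small sketch of each command on a made-up three-line input. It assumes GNU sed, which accepts the one-line forms of a, i, and c used here; the output lines are joined with "|" only to make the results easy to read.

```shell
# Hypothetical three-line input.
input=$(printf '%s\n' 'one' 'two' 'three')

# Join output lines with "|" for display.
join_lines() { paste -sd'|' -; }

appended=$(printf '%s\n' "$input" | sed '2a after two' | join_lines)   # a: new line after line 2
inserted=$(printf '%s\n' "$input" | sed '2i before two' | join_lines)  # i: new line before line 2
changed=$(printf '%s\n' "$input" | sed '2c TWO' | join_lines)          # c: replace line 2
deleted=$(printf '%s\n' "$input" | sed '2d' | join_lines)              # d: delete line 2
ranged=$(printf '%s\n' "$input" | sed '1,2s/o/0/g' | join_lines)       # s on lines 1 to 2 only

# {cmd1; cmd2}: substitute and then print, both applied to line 2.
grouped=$(printf '%s\n' "$input" | sed -n '2{s/two/2/;p;}' | join_lines)

printf '%s\n' "$appended" "$inserted" "$changed" "$deleted" "$ranged" "$grouped"
```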
3. awk text analysis tool
An awk program combines commands, regular expressions (which must be enclosed in //), and comparison and relational operations.
Use the -F option to define the field separator.
Each line of the input is split at the separator into columns, referenced in order as $1, $2, $3, and so on. The NF variable holds the number of fields in the current record.
Syntax:
awk [options] 'condition {command arg1, arg2, arg3}' file
Options:
- -F: defines the field separator. The default separator is a run of consecutive spaces or tabs.
- -v: defines a variable and assigns it a value. It can also bring in the value of a shell variable (see Example 2 below).
awk variables
- NR: number of the current record, counted across all input files
- FNR: number of the current record within the current file only
- FS: input field separator; the default is a run of consecutive spaces or tabs. Several characters can act as separators, for example -F'[:/]'
- OFS: output field separator; the default is a space.
[OFS example]
# awk -F: 'BEGIN {OFS="===="} {print $1, $2}' /etc/passwd
root====x
- NF: number of fields in the record currently being read
- ORS: output record separator; the default is a line feed.
[ORS example]
# awk -F: 'BEGIN {ORS="===="} {print $1, $2}' /etc/passwd
root x====bin x====
- FILENAME: name of the current input file
[Example 1] Using the awk variables
# awk '{print NR, FNR, $1}' file1 file2
1 1 aaaaa
2 2 bbbbb
3 3 ccccc
4 1 dddddd
5 2 eeeeee
6 3 ffffff
#
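The built-in variables can be reproduced without touching /etc/passwd using a sketch like the one below. The two files and their contents are made up for illustration.

```shell
# Two hypothetical input files in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' 'aaaaa 1' 'bbbbb 2' > "$tmp/file1"
printf '%s\n' 'ccccc 3' > "$tmp/file2"

# NR keeps counting across files, FNR restarts for each file.
counters=$(cd "$tmp" && awk '{print NR, FNR, NF, FILENAME}' file1 file2 | paste -sd'|' -)

# OFS replaces the space that print puts between comma-separated arguments.
ofs=$(printf 'root:x:0\n' | awk -F: 'BEGIN {OFS="===="} {print $1, $2}')

# ORS replaces the newline that print puts after each record.
ors=$(printf '%s\n' 'root:x' 'bin:x' | awk -F: 'BEGIN {ORS="===="} {print $1, $2}')

# A bracket expression as separator: split on ":" or "/".
multi=$(printf 'a:b/c\n' | awk -F'[:/]' '{print $3, NF}')

printf '%s\n' "$counters" "$ofs" "$ors" "$multi"
rm -rf "$tmp"
```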
[Example 2] Referencing shell variables
# a=root
# awk -v var="$a" -F: '$1 == var {print $0}' /etc/passwd
Alternatively, close the single quotes around the shell variable so that the shell expands it inside the awk program:
# awk -F: '$1 == "'$a'" {print $0}' /etc/passwd
# a=NF
# awk -F: '{print $'$a'}' /etc/passwd
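Both ways of passing a shell variable can be compared side by side in a sketch such as this one. The variable names and the two sample records are assumptions; note the comparison uses ==, since a single = would assign to $1 instead of testing it.

```shell
# Hypothetical records standing in for /etc/passwd.
data=$(printf '%s\n' 'root:x:0' 'bin:x:1')
user=root

# Method 1: -v copies the shell value into the awk variable "var".
byvar=$(printf '%s\n' "$data" | awk -F: -v var="$user" '$1 == var {print $0}')

# Method 2: leave the single quotes so the shell expands $user in place.
byquote=$(printf '%s\n' "$data" | awk -F: '$1 == "'"$user"'" {print $0}')

# The same trick can splice in a variable name: this becomes {print $NF}.
field=NF
lastfield=$(printf '%s\n' "$data" | awk -F: '{print $'"$field"'}' | paste -sd' ' -)

printf '%s\n' "$byvar" "$byquote" "$lastfield"
```

The -v form is usually the safer choice, because the shell value is never parsed as awk code.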
Operators (fields can be referenced directly in an operation)
- = += -= /= *=: assignment
- && || !: logical AND, logical OR, logical NOT
- ~ !~: matches or does not match a regular expression. The regular expression must be enclosed in slashes: /regex/
- < <= > >= != ==: relational. When comparing against a string, put the string in double quotation marks.
- $: a field reference requires $; a variable is referenced by its name alone.
- + - * / % ++ --: arithmetic operators
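The operators above might be combined as in the following sketch, which runs on three made-up passwd-style records (name, password placeholder, UID, shell).

```shell
# Hypothetical records: name:password:uid:shell
data=$(printf '%s\n' 'root:x:0:/bin/bash' 'bin:x:1:/sbin/nologin' 'alice:x:1000:/bin/bash')

# ~ and !~: field 4 does or does not end in "bash".
match=$(printf '%s\n' "$data" | awk -F: '$4 ~ /bash$/ {print $1}' | paste -sd' ' -)
nomatch=$(printf '%s\n' "$data" | awk -F: '$4 !~ /bash$/ {print $1}')

# Relational operators joined with && and ||; strings go in double quotes.
both=$(printf '%s\n' "$data" | awk -F: '$3 >= 1 && $3 < 1000 {print $1}')
either=$(printf '%s\n' "$data" | awk -F: '$1 == "root" || $3 > 999 {print $1}' | paste -sd' ' -)

# += accumulates field 3; the variable "total" is used without a $.
total=$(printf '%s\n' "$data" | awk -F: '{total += $3} END {print total}')

# % and ++: count the records whose UID is even.
even=$(printf '%s\n' "$data" | awk -F: '$3 % 2 == 0 {n++} END {print n}')

printf '%s\n' "$match" "$nomatch" "$both" "$either" "$total" "$even"
```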
Escape sequences
- \\: a literal backslash
- \$: a literal $
- \t: tab
- \b: backspace
- \r: carriage return
- \n: line feed
- \c: cancels the trailing line feed (a convention of echo; awk itself does not define \c)
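A brief sketch of the escape sequences inside awk strings and regular expressions. The sample strings are made up; printf is used where no trailing newline is wanted.

```shell
# \t inside an awk string produces a tab between the two letters.
tabbed=$(awk 'BEGIN {printf "a\tb\n"}')

# \\ inside an awk string produces one literal backslash.
backslash=$(awk 'BEGIN {print "C:\\temp"}')

# \$ inside a regular expression matches a literal dollar sign.
dollar=$(printf 'cost $5\n' | awk '/\$5/ {print "matched"}')

# \n inside a string splits the output into two lines; NR counts them.
linecount=$(awk 'BEGIN {printf "one\ntwo\n"}' | awk 'END {print NR}')

printf '%s\n' "$tabbed" "$backslash" "$dollar" "$linecount"
```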
Corrections are welcome if you find an error. For more details, refer to:
http://www.cnblogs.com/linhaifeng/p/6596660.html#_label3