Regular expressions and three major Linux text processing tools

Source: Internet
Author: User

First, regular expressions

1. Types of matching characters
    • [a-z]: a lowercase letter
    • [A-Z]: an uppercase letter
    • [a-zA-Z]: a lowercase or uppercase letter
    • [0-9]: a digit
    • [a-zA-Z0-9]: a letter or a digit
    • . : any single character, including a space
    • [0-9a-fA-F]: a hexadecimal digit
    • abc|def: abc or def
    • a(bc|de)f: abcf or adef
    • \<: start of a word. Words are usually separated by spaces or special characters, and a run of consecutive word characters is treated as one word.
    • \>: end of a word
    • [^expression]: any character not in the set; for example, [^a-z] matches every character except a lowercase letter.
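
The character types above can be tried out with grep -E, which accepts the same extended syntax as egrep. This is only a minimal sketch: the sample strings are invented for illustration, LC_ALL=C is an assumption made so that ranges such as [a-z] cover ASCII letters only, and \< and \> are assumed to be available (they are in GNU grep).

```shell
export LC_ALL=C   # assumption: fix the locale so ranges like [a-z] are ASCII-only

# -o prints each match on its own line, which makes the matched text visible.
digits=$(printf 'room 42, floor 7\n' | grep -E -o '[0-9]')
non_lower=$(printf 'a1b2\n' | grep -E -o '[^a-z]')
either=$(printf 'abcf adef axf\n' | grep -E -o 'a(bc|de)f')

# \< and \> anchor the pattern to word boundaries (sample words are made up).
word_start=$(printf '%s\n' cat concat catalog | grep -E '\<cat')
whole_word=$(printf '%s\n' cat concat catalog | grep -E '\<cat\>')

printf '%s\n' "$digits" "$non_lower" "$either" "$word_start" "$whole_word"
```

Here "concat" is never selected because its "cat" does not begin a word, and only the bare "cat" satisfies both boundaries.
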
2. Symbols that control the number of matches

Each of these symbols follows, on its left, one of the expressions from item 1 above.

    • expression*: 0 or more occurrences of the expression
    • expression+: 1 or more occurrences
    • expression?: 0 or 1 occurrence
    • expression{n}: exactly n occurrences
    • expression{n,m}: from n to m occurrences
    • expression{n,}: at least n occurrences

"Example" [a-z]* matches 0 or more lowercase letters

3. Anchoring the match at the start or end of a line
    • ^expression: matches at the start of the line
    • expression$: matches at the end of the line
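
The quantifiers and the line anchors are easiest to see together, because anchoring a pattern to the whole line leaves the quantifier as the only thing that decides a match. The following is a small sketch using grep -E; the sample lines and the helper name count are made up for illustration.

```shell
export LC_ALL=C   # assumption: ASCII-only ranges

# Made-up sample lines: an "a", some number of "b", then a "c".
samples=$(printf '%s\n' ac abc abbc abbbc)

# Hypothetical helper: count the sample lines matched by one pattern.
count() { printf '%s\n' "$samples" | grep -E -c "$1"; }

star=$(count '^ab*c$')         # 0 or more b
plus=$(count '^ab+c$')         # 1 or more b
optional=$(count '^ab?c$')     # 0 or 1 b
exact=$(count '^ab{2}c$')      # exactly 2 b
range=$(count '^ab{1,2}c$')    # 1 to 2 b
at_least=$(count '^ab{2,}c$')  # 2 or more b

# ^ keeps only lines that start with the text, $ only lines that end with it.
at_start=$(printf '%s\n' 'root:x:0' 'chroot:x:1' 'bin:root' | grep -E '^root')
at_end=$(printf '%s\n' 'root:x:0' 'chroot:x:1' 'bin:root' | grep -E 'root$')

echo "$star $plus $optional $exact $range $at_least"   # prints: 4 3 2 1 2 2
echo "$at_start / $at_end"                             # prints: root:x:0 / bin:root
```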

Second, the three major Linux text processing tools

1. egrep filtering tool

egrep is the extended version of grep; it accepts extended regular expressions.

Syntax:

egrep [options] 'regular expression' filename

Options:
    • -n: show line numbers
    • -o: show only the matching text
    • -q: quiet mode, no output; check $? to see whether the command succeeded, that is, whether anything matched
    • -l: if a file contains a match, print only its file name; files without a match are not listed. Usually combined with -r, e.g. grep -rl 'root' /etc
    • -A n: print each matching line together with the n lines after it
    • -B n: print each matching line together with the n lines before it
    • -C n: print each matching line together with the n lines before and after it
    • --color: highlight the matched text
    • -c: print the number of matching lines
    • -i: ignore case
    • -v: invert the match, printing the lines that do not match
    • -w: match whole words only
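
A few of the options above, sketched against a throwaway file. The file contents are invented, the temporary file comes from mktemp, and grep -E is used in place of egrep on the assumption that both accept the same options.

```shell
export LC_ALL=C
f=$(mktemp)   # hypothetical sample file, removed at the end
printf '%s\n' 'root:x:0:0' 'bin:x:1:1' 'Root user' 'chroot jail' > "$f"

numbered=$(grep -E -n 'root' "$f")      # matching lines prefixed with line numbers
count=$(grep -E -c 'root' "$f")         # number of matching lines
nocase=$(grep -E -c -i 'root' "$f")     # same count, ignoring case
words=$(grep -E -w 'root' "$f")         # "root" as a whole word only
inverted=$(grep -E -v -i 'root' "$f")   # lines that do not match
after=$(grep -E -A 1 'bin' "$f")        # the match plus 1 following line
listed=$(grep -E -l 'root' "$f")        # only the name of the matching file

# -q prints nothing; the exit status alone says whether a match was found.
if grep -E -q 'bin' "$f"; then found=yes; else found=no; fi

rm -f "$f"
echo "$count $nocase $found"   # prints: 2 3 yes
```
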
2. sed stream editor

Syntax 1: sed [options] 'positioning + command' filename

Options:
    • -n: silent mode; suppresses the default output
    • -e: adds an editing command; repeat -e to apply several edits in one run
    • -i: modifies the file in place instead of writing to the output
    • -r: extended mode; extended regular expressions can be used
    • -f: reads the editing commands from the specified script file
Positioning:

① Numeric positioning (by input line number)

    • 1: a single line
    • 1,3: a range, from the first line to the third
    • 2,+4: line 2 and the 4 lines after it
    • 4,~3: from line 4 to the next line whose number is a multiple of 3
    • 2~3: starting at line 2, every third line
    • $: the last line
    • 1!: every line except the first

"Example" sed -n '1p' /etc/passwd

② Regular expression positioning

    • The regular expression must be enclosed in /.../
    • Extended regular expressions require the -r option; without it, the extended metacharacters must be escaped
    • In a substitution, sub-patterns written in parentheses () can be referenced in the replacement as \1, \2, and so on

"Example" sed -r 's/(.)(.)/\2\1/' file1 swaps the first and second matched parts

* Greedy option: append g to replace every occurrence on a line instead of only the first
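
Both kinds of positioning can be checked on generated input. This is a sketch that assumes GNU sed, since forms such as 2,+4, 4,~3, 2~3 and \| are GNU extensions; the helper name pick and the sample lines are hypothetical.

```shell
# Hypothetical helper: run one sed script over the numbers 1..10 and join the
# printed lines with spaces so the selection is easy to read.
pick() { seq 1 10 | sed -n "$1" | paste -s -d ' ' -; }

single=$(pick '1p')           # line 1 only
range=$(pick '1,3p')          # lines 1 to 3
plus=$(pick '2,+4p')          # line 2 and the 4 lines after it
to_multiple=$(pick '4,~3p')   # line 4 up to the next multiple of 3
step=$(pick '2~3p')           # every 3rd line, starting at line 2
last=$(pick '$p')             # the last line
not_first=$(pick '1!p')       # everything except line 1

# Regular expression positioning: only lines matching /.../ are selected.
users=$(printf '%s\n' 'root:x:0' 'bin:x:1' 'daemon:x:2')
extended=$(printf '%s\n' "$users" | sed -n -r '/^(root|bin):/p')    # with -r
escaped=$(printf '%s\n' "$users" | sed -n '/^\(root\|bin\):/p')     # escaped instead

echo "$to_multiple | $step"   # prints: 4 5 6 | 2 5 8
```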

Commands:
    • a: append; the text that follows is added on a new line after the current line
    • c: change; replaces the selected lines with the text that follows
    • d: delete
    • i: insert; i can be followed by a string, which appears on a new line before the current line
    • p: print
    • s: substitute; performs the replacement directly and is usually paired with regular expressions, e.g. 1,20s/old/new/g
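
Each command can be tried on a three-line input. The sketch below assumes GNU sed, which accepts the one-line forms of a, i and c used here; the input lines and the text NEW are made up.

```shell
input=$(printf '%s\n' one two three)   # made-up three-line input

appended=$(printf '%s\n' "$input" | sed '2a NEW')          # NEW after line 2
inserted=$(printf '%s\n' "$input" | sed '2i NEW')          # NEW before line 2
changed=$(printf '%s\n' "$input" | sed '2c NEW')           # line 2 replaced by NEW
deleted=$(printf '%s\n' "$input" | sed '2d')               # line 2 removed
printed=$(printf '%s\n' "$input" | sed -n '2p')            # only line 2 printed
replaced=$(printf '%s\n' "$input" | sed '1,2s/o/0/g')      # o -> 0 on lines 1 and 2

printf '%s\n' "$replaced"
```

With -n, p prints only the selected line; without -n, sed would also print every line once by default.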

* Notes on the s command:

Use {command 1; command 2; command 3} to apply several commands to the same positioning

s command syntax: sed -r 's/regular expression/replacement/g' filename (the trailing g is the optional greedy option)
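
The pieces of the s command can be compared side by side. This is a sketch assuming GNU sed; the input strings are invented for illustration.

```shell
# Without g only the first occurrence on the line is replaced.
first_only=$(printf 'a-b-c\n' | sed 's/-/+/')
every=$(printf 'a-b-c\n' | sed 's/-/+/g')

# Sub-patterns in () are referenced as \1 and \2 in the replacement.
swapped=$(printf 'abcd\n' | sed -r 's/(.)(.)/\2\1/')
swapped_all=$(printf 'abcd\n' | sed -r 's/(.)(.)/\2\1/g')
no_r=$(printf 'abcd\n' | sed 's/\(.\)\(.\)/\2\1/')   # same as swapped, escaped instead of -r

# Braces group several commands under one positioning expression.
grouped=$(printf '%s\n' 'root:x:0' 'bin:x:1' | sed -n '/^root/{s/:/ /g; s/root/ROOT/; p; }')

# Repeating -e applies several edits in order.
chained=$(printf 'a\n' | sed -e 's/a/b/' -e 's/b/c/')

echo "$first_only $every $swapped $swapped_all"   # prints: a+b-c a+b+c bacd badc
```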

3. awk text analysis tool

An awk program is composed of commands, regular expressions (enclosed in /.../), comparisons, and relational operations.

The field separator is defined with the -F option.

$1, $2, $3, ... refer, in order, to the fields of each line as split by the separator; the NF variable holds the number of fields in the current record.

Syntax

awk [options] 'condition {command variable1, variable2, variable3}' filename

Options
    • -F: defines the field separator; the default separator is a run of spaces or tabs
    • -v: defines a variable and assigns it a value; it can also be used to pass in a shell variable
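
Field splitting and the two options can be illustrated on a single passwd-style record. The record and the variable name uid are made up for this sketch; no real /etc/passwd is read.

```shell
record='root:x:0:0:root:/root:/bin/bash'          # made-up passwd-style line
other='bin:x:1:1:bin:/bin:/sbin/nologin'

first_two=$(printf '%s\n' "$record" | awk -F: '{print $1, $2}')   # fields 1 and 2
field_count=$(printf '%s\n' "$record" | awk -F: '{print NF}')     # number of fields
last_field=$(printf '%s\n' "$record" | awk -F: '{print $NF}')     # the last field

# -v defines an awk variable before the program runs (uid is a hypothetical name).
matched=$(printf '%s\n' "$record" "$other" | awk -F: -v uid=1 '$3 == uid {print $1}')

# With no -F, any run of spaces or tabs separates fields.
default_split=$(printf '  a \t b   c\n' | awk '{print NF}')

echo "$first_two | $field_count | $last_field"   # prints: root x | 7 | /bin/bash
```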

awk variables

    • NR: number of the current record, counted across all input files
    • FNR: number of the current record within the current file only
    • FS: field separator; defaults to a run of spaces or tabs. Several different characters can act as separators, e.g. -F'[:/]'
    • OFS: output field separator; a space by default

"OFS example"

# awk -F: 'OFS="=====" {print $1,$2}' /etc/passwd
root=====x

    • NF: number of fields in the current record
    • ORS: output record separator; a line break by default

"ORS example"

# awk -F: 'ORS="=====" {print $1,$2}' /etc/passwd
root x=====bin x=====

    • FILENAME: name of the current file

"Example 1" Using the awk variables
# awk '{print NR,FNR,$1}' file1 file2
1 1 aaaaa
2 2 bbbbb
3 3 ccccc
4 1 dddddd
5 2 eeeeee
6 3 ffffff
#

"Example 2" How to reference a shell variable

# a=root
# awk -v var="$a" -F: '$1 == var {print $0}' /etc/passwd
Or split the quoting of the command so that the shell variable is exposed to the shell:
# awk -F: '$1 == "'$a'" {print $0}' /etc/passwd
# a=NF
# awk -F: '{print $'$a'}' /etc/passwd

Logical operations (fields can be referenced directly in operations)
    • = += -= /= *=: assignment
    • && || !: logical AND, logical OR, logical NOT
    • ~ !~: matches or does not match a regular expression; the regular expression must be enclosed in /.../
    • < <= > >= != ==: relational; string operands must be enclosed in double quotes
    • $: a field reference needs the $ prefix, while a variable is referenced directly by its name
    • + - * / % ++ --: arithmetic operators
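
The operators above combine as in the following sketch. The name and age data is invented for illustration, and sum and n are arbitrary variable names.

```shell
data=$(printf '%s\n' 'alice 30' 'bob 17' 'carol 45')   # made-up name/age pairs

# Relational operator on a field.
adults=$(printf '%s\n' "$data" | awk '$2 >= 18 {print $1}')

# Regular expression match combined with a string comparison via ||.
a_or_bob=$(printf '%s\n' "$data" | awk '$1 ~ /^a/ || $1 == "bob" {print $1}')

# Negated match: names that contain no letter a.
no_a=$(printf '%s\n' "$data" | awk '$1 !~ /a/ {print $1}')

# Assignment and arithmetic operators; variables are used without a $ prefix.
totals=$(printf '%s\n' "$data" | awk '{sum += $2; n++} END {print sum, n, sum % n}')

echo "$totals"   # prints: 92 3 2
```
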
Escape sequences
    • \\: the backslash itself
    • \$: a literal $
    • \t: tab
    • \b: backspace
    • \r: carriage return
    • \n: line break
    • \c: cancel the line break
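
Two of these sequences inside awk string literals, as a brief sketch. The strings are made up, and only \t, \n and \\ are shown.

```shell
# \t and \n inside an awk string become a tab and a line break.
tabbed=$(awk 'BEGIN { printf "name\tshell\n" }')

# \\ inside an awk string yields a single backslash.
backslash=$(awk 'BEGIN { print "C:\\temp" }')

printf '%s\n' "$tabbed" "$backslash"
```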

 

For more detailed information, refer to:

http://www.cnblogs.com/linhaifeng/p/6596660.html#_label3
