Shell text filtering (regular expressions)
When extracting or filtering text from a file or from command output, you can use a regular expression (RE): a pattern made up of ordinary characters and special characters (metacharacters).
^  Matches only at the beginning of a line
$  Matches only at the end of a line
*  A single character followed by * matches zero or more occurrences of that character
[]  Matches any one of the characters inside the brackets, which can list single characters or a range. Use - to give a range inside []; for example, use [1-5] instead of [12345].
\  Removes the special meaning of a metacharacter. Some metacharacters also have a special meaning in the shell; \ makes them literal.
.  Matches any single character
pattern\{n\}  Matches pattern repeated exactly n times
pattern\{n,\}  Same as above, but at least n times
pattern\{n,m\}  Same as above, but between n and m times
Use the period "." to match a single character
1) .: matches any single character, such as a letter or a digit.
2) Example: ..xc.. matches dexc1t and 23xcdf; .w..w..w. matches rwxrw-rw-
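As a minimal sketch of the period in practice, the following assumes grep is available and uses made-up sample strings. The pattern ..xc.. asks for any two characters, then xc, then any two characters.

```shell
# Sample input, one string per line (hypothetical data for illustration).
sample=$(printf '%s\n' 'dexc1t' '23xcdf' 'abcdef')

# Each "." stands for exactly one character of any kind.
dot_matches=$(printf '%s\n' "$sample" | grep '..xc..')
printf '%s\n' "$dot_matches"
```

The line abcdef is dropped because it contains no xc with two characters on each side.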
Use ^ to match strings or character sequences at the beginning of a line
1) ^: matches characters or words only at the beginning of a line.
2) Example: ^.01 matches 0011cx4 and c01sdf; ^d matches drwxr-xr-x, drw-r--r--, and so on.
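A short sketch of the ^ anchor, assuming grep and an invented listing in the style of ls -l output. The file names and sizes are placeholders, not real output.

```shell
# Hypothetical "ls -l"-style lines: a directory, a regular file, a symlink.
listing=$(printf '%s\n' \
  'drwxr-xr-x 2 user group 4096 docs' \
  '-rw-r--r-- 1 user group  120 notes.txt' \
  'lrwxrwxrwx 1 user group    4 link -> docs')

# ^d keeps only lines whose first character is d (directories).
dirs=$(printf '%s\n' "$listing" | grep '^d')
printf '%s\n' "$dirs"

# ^.01 keeps lines where 01 follows one arbitrary leading character.
ids=$(printf '%s\n' '0011cx4' 'c01sdf' 'x0011' | grep '^.01')
printf '%s\n' "$ids"
```

Without the ^, the pattern d would also match the other two listing lines, since both contain a d somewhere.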
Use $ to match strings or characters at the end of a line
1) $: matches strings or characters at the end of a line; put the $ symbol after the text to match.
2) Example: trouble$ matches all lines ending with the word trouble
^$ matches all empty lines
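The two anchors can be tried together. This is a sketch assuming grep and a few made-up lines, one of which is empty.

```shell
# Four lines of hypothetical text; the third one is empty.
text=$(printf '%s\n' 'no trouble' 'trouble ahead' '' 'double trouble')

# trouble$ keeps only lines that end with "trouble".
ending=$(printf '%s\n' "$text" | grep 'trouble$')
printf '%s\n' "$ending"

# ^$ matches a line with nothing between its start and its end.
empty_count=$(printf '%s\n' "$text" | grep -c '^$')

# -v inverts the match, so this drops the empty lines.
non_empty=$(printf '%s\n' "$text" | grep -v '^$')
printf 'empty lines: %s\n' "$empty_count"
```

A common use of the last form is stripping blank lines from a file before further processing.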
Use * to match a single character, or a repeated run of it, in a string (different from "*" in file name expansion)
1) *: a single character followed by * matches zero or more occurrences of that character.
2) Example: compu*t matches the character u zero or more times, so it matches computer, computing, compuuute, and so on.
10133* matches 101333, 10133, 1013444, and so on.
3) Because * also accepts zero occurrences, a regular expression that uses it sometimes matches more than expected.
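The unexpected results usually come from * accepting zero occurrences. A small sketch, assuming grep and invented words:

```shell
# Hypothetical word list, one word per line.
words=$(printf '%s\n' 'computer' 'compuuute' 'compt' 'compact')

# compu*t = "comp", then zero or more "u", then "t".
star_matches=$(printf '%s\n' "$words" | grep 'compu*t')
printf '%s\n' "$star_matches"

# Repeating the character before the star requires at least one "u".
one_or_more=$(printf '%s\n' "$words" | grep 'compuu*t')
printf '%s\n' "$one_or_more"
```

The first pattern also accepts compt, which has no u at all; writing the character twice (uu*) is one way to demand at least one occurrence.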
Use \ to remove the special meaning of a character
1) \: removes the special meaning of a metacharacter. Some metacharacters also have a special meaning in the shell; \ makes them literal.
2) Example: to match all file names ending with *.pas in a regular expression: \*\.pas$
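A sketch of escaping in practice, assuming grep and made-up file names. The single quotes are a choice made here so that the shell passes the backslashes and the * to grep unchanged.

```shell
# Hypothetical file names, one per line.
files=$(printf '%s\n' 'report.pas' 'backup*.pas' 'notes_pas' 'x*.pas.bak')

# \* is a literal asterisk and \. is a literal period.
literal=$(printf '%s\n' "$files" | grep '\*\.pas$')
printf '%s\n' "$literal"
```

Only backup*.pas survives: report.pas has no asterisk, notes_pas has no period, and x*.pas.bak does not end in .pas.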
Use [] to match a single character in a range or set
1) []: matches any one of the characters inside the brackets, which can list single characters or a range. Use "-" to give a range inside the brackets, for example [1-5] instead of [12345]. Do not separate the characters with commas: a comma inside the brackets is itself treated as one of the characters to match.
2) When "^" immediately follows "[", the set is negated: the expression matches any character that is not listed in the brackets.
3) Example: [0-9] matches any digit; [a-z] matches any lowercase letter; [0-9a-zA-Z] matches any letter or digit;
[Cc]omputer matches Computer and computer; [^a-zA-Z] matches any non-letter character
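Bracket sets and their negation can be sketched as follows, assuming grep and invented input. LC_ALL=C is set here as a precaution, since ranges such as a-z can behave differently in other locales.

```shell
# Hypothetical input lines.
lines=$(printf '%s\n' 'Computer' 'computer' 'xomputer' '12345' 'abc9')

# [Cc] accepts either an uppercase or a lowercase c in first position.
cc_matches=$(printf '%s\n' "$lines" | LC_ALL=C grep '[Cc]omputer')
printf '%s\n' "$cc_matches"

# [^a-zA-Z] matches one character that is NOT a letter, so this keeps
# lines containing at least one non-letter character.
nonletter_matches=$(printf '%s\n' "$lines" | LC_ALL=C grep '[^a-zA-Z]')
printf '%s\n' "$nonletter_matches"
```

Note that a negated set still has to match a character: [^a-zA-Z] selects lines that contain a non-letter, not lines that lack letters.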
Use pattern\{\} to match a given number of occurrences of a pattern
1) pattern\{n\}: the pattern appears exactly n times.
2) pattern\{n,\}: the pattern appears at least n times.
3) pattern\{,m\}: the pattern appears at most m times (accepted by GNU grep as an extension; POSIX requires a lower bound).
4) pattern\{n,m\}: the pattern appears between n and m times.
5) Example: A\{2\}B matches AAB.
A\{2,\}B can match AAB or AAAAAB, but cannot match AB.
A\{2,4\}B can match AAB, AAAB, and AAAAB, but cannot match AB; it also cannot match AAAAAB as a whole string, although it does match the AAAAB inside it.
[0-9]\{4\}CX[0-9]\{4\} matches four digits, followed by CX, followed by four digits.
6) The actual syntax is {n}, {n,}, {,m}, and {n,m}; in basic regular expressions the escape character "\" is simply applied to "{" and "}".
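The interval forms can be compared side by side. This sketch assumes grep and uses its -x option, which requires the pattern to match the whole line; without -x, a pattern such as A\{2,4\}B would also match inside a longer run of A characters.

```shell
# Hypothetical test strings with 1, 2, 3, and 5 leading A characters.
strings=$(printf '%s\n' 'AB' 'AAB' 'AAAB' 'AAAAAB')

# Exactly two A, then B.
exact=$(printf '%s\n' "$strings" | grep -x 'A\{2\}B')

# At least two A, then B.
at_least=$(printf '%s\n' "$strings" | grep -x 'A\{2,\}B')

# Between two and four A, then B.
between=$(printf '%s\n' "$strings" | grep -x 'A\{2,4\}B')

printf 'exact:    %s\n' "$(printf '%s' "$exact" | tr '\n' ' ')"
printf 'at least: %s\n' "$(printf '%s' "$at_least" | tr '\n' ' ')"
printf 'between:  %s\n' "$(printf '%s' "$between" | tr '\n' ' ')"
```

With grep -E (extended regular expressions) the same intervals are written without the backslashes, for example A{2,4}B.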
Examples of Regular Expressions
[Ss]igna[lL]  Matches the words signal, signaL, Signal, and SignaL
[Ss]igna[lL]\.  Same as above, but followed by a period
^USER$  Lines that contain only the word USER
\.  Lines that contain a period
^d..x..x..x  Directories with execute permission for the user, the group, and other users
^[^l]  Excludes symbolic links from a directory listing (that is, lines that do not start with "l")
[yYnN]  Uppercase or lowercase y or n
^.*$  Matches the whole line, whatever it contains
^......$  Lines that contain exactly 6 characters
[a-zA-Z]  Any single letter
[a-z][a-z]*  At least one lowercase letter
[^0-9$]  Any character that is neither a digit nor a dollar sign (inside brackets, $ needs no backslash)
[123]  One of the digits 1, 2, or 3
\^q  The literal string ^q (write ^\^q to require it at the start of a line)
^.$  Lines with only one character
^\.[0-9][0-9]  Lines starting with a period followed by two digits
[0-9]\{2\}-[0-9]\{2\}-[0-9]\{4\}  Date format dd-mm-yyyy
[0-9]\{3\}\.[0-9]\{3\}\.[0-9]\{3\}\.[0-9]\{3\}  IP-address-like format nnn.nnn.nnn.nnn
.*  Matches any number of characters
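As a closing sketch, the date and IP-address patterns from the list can be run against a few invented records. It assumes grep, and uses -x so that each pattern must cover the whole line.

```shell
# Hypothetical records, one per line.
records=$(printf '%s\n' '12-05-2021' '2021-05-12' '192.168.001.010' '192.168.1.10')

# Two digits, dash, two digits, dash, four digits.
dates=$(printf '%s\n' "$records" | grep -x '[0-9]\{2\}-[0-9]\{2\}-[0-9]\{4\}')
printf '%s\n' "$dates"

# Four groups of exactly three digits separated by literal periods.
ips=$(printf '%s\n' "$records" | grep -x '[0-9]\{3\}\.[0-9]\{3\}\.[0-9]\{3\}\.[0-9]\{3\}')
printf '%s\n' "$ips"
```

Both patterns check only the shape of the text. A string such as 99-99-9999 passes the date pattern, and an address written without zero padding, such as 192.168.1.10, does not pass the IP pattern.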