--"Bird's private Cuisine"
A regular expression is a method of processing strings, and it processes them one line at a time;
With the help of a few special symbols, regular expressions let users easily search for, delete, or replace specific strings;
As long as a tool supports regular expressions, that tool can be used for regular-expression string processing;
Regular expressions and wildcards are completely different things! A wildcard is a feature of the bash interface, whereas a regular expression is a notation for string processing!
grep is one of the most commonly used tools that support regular expressions. Its most important function is to match string data and then print the lines that meet the user's needs.
When grep searches for a string in the data, it works with the whole line as its unit! That is, if a file has 10 lines and two of them contain the string you are searching for, those two lines are displayed on the screen and the others are discarded!
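The whole-line behavior described above can be seen with a minimal sketch. The sample file and its contents are made up for illustration; they are not the regular_express.txt used elsewhere in these notes.

```shell
# Hypothetical sample data: five lines, two of which contain "the".
sample=$(mktemp)
printf '%s\n' 'Open Source is good.' 'I like the taste.' \
    'Football game.' 'What is the plan?' 'Done!' > "$sample"

# grep prints every matching line in full, not just the matched word;
# the three lines without "the" are simply not shown.
matches=$(grep 'the' "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

Only the two complete lines containing "the" should appear.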
grep [-acinv] [-A n] [-B n] [--color=auto] 'search string' filename
Options and Parameters:
-a: search a binary file as if it were a text file
-c: count the number of lines that contain the 'search string'
-i: ignore case, so uppercase and lowercase are treated as the same
-n: also print the line number
-v: reverse selection, i.e. show the lines that do not contain the 'search string'
--color=auto: highlight the matched keyword in color
-A: may be followed by a number n; meaning "after", it lists the matching line and also the n lines after it
-B: may be followed by a number n; meaning "before", it lists the matching line and also the n lines before it
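As a rough sketch of how these options behave, the commands below run each one against a small throwaway file. The file contents and variable names are assumptions made up for this example.

```shell
# Hypothetical four-line sample file.
sample=$(mktemp)
printf '%s\n' 'The first line' 'a second line' 'nothing here' \
    'the last LINE' > "$sample"

count=$(grep -c 'line' "$sample")        # number of lines containing "line"
count_i=$(grep -ci 'line' "$sample")     # same, but ignoring case
numbered=$(grep -n 'second' "$sample")   # prefix the match with its line number
inverted=$(grep -vi 'line' "$sample")    # lines that do NOT contain "line"
after=$(grep -A1 'second' "$sample")     # the match plus 1 line after it
before=$(grep -B1 'second' "$sample")    # the match plus 1 line before it

printf 'count=%s count_i=%s\n' "$count" "$count_i"
printf '%s\n' "$numbered" "$inverted" "$after" "$before"

rm -f "$sample"
```

Note that -c reports matching lines, not individual occurrences, so a line containing the string twice still counts once.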
Summary of basic regular expression characters
RE character |
Meaning and example |
^word |
Meaning: the string to be searched (word) is at the beginning of the line! Example: search for lines that begin with # and print their line numbers
grep -n '^#' regular_express.txt
|
word$ |
Meaning: the string to be searched (word) is at the end of the line! Example: print the lines that end with !, along with their line numbers
grep -n '!$' regular_express.txt
|
. |
Meaning: stands for exactly one arbitrary character, which must be present! Example: the matched string can be (eve) (eae) (eee) (e e), but not just (ee)! That is, there must be exactly one character between the two e's, and a space also counts as a character!
grep -n 'e.e' regular_express.txt
|
\ |
Meaning: the escape character, which removes the special meaning of a special symbol! Example: search for lines containing the single quote '!
grep -n \' regular_express.txt
|
* |
Meaning: repeat the previous RE character zero or more times. Example: find strings containing (es) (ess) (esss) and so on. Note that because * can mean zero repetitions, es also matches. In addition, because * repeats the "previous RE character", it must be immediately preceded by an RE character! For example, any run of arbitrary characters is ".*"!
grep -n 'ess*' regular_express.txt
|
[list] |
Meaning: a character set, which lists the characters you want to match! Example: search for lines containing (gl) or (gd). Pay special attention: [ ] stands for exactly one character to be matched. For example, "a[afl]y" matches aay, afy, or aly, because [afl] means a or f or l!
grep -n 'g[ld]' regular_express.txt
|
[n1-n2] |
Meaning: a character set given as a range of the characters you want to match! Example: a line containing any digit is matched with [0-9]. Pay special attention to the minus sign - inside the character set [ ]: it is special and stands for all the consecutive characters between the two endpoints! This ordering depends on the ASCII encoding, so your locale must be configured correctly (in bash, check that the LANG and LANGUAGE variables are correct!). For example, all uppercase letters are [A-Z]
grep -n '[A-Z]' regular_express.txt
|
[^list] |
Meaning: a negated character set, which lists the characters or ranges that must not appear! Example: the matched string can be (oog) (ood) but not (oot). The ^ inside [ ] means "reverse selection". For example, if I do not want uppercase letters, I write [^A-Z]. However, note that if you search with grep -n '[^A-Z]' regular_express.txt you will find that every line in the file is listed. Why? Because [^A-Z] means "a character that is not uppercase", and every line contains at least one such character; for example, the "Open Source" in the first line has p, e, n, o, and so on.
grep -n 'oo[^t]' regular_express.txt
|
\{n,m\} |
Meaning: the previous RE character repeated n to m times in a row. \{n\} means exactly n consecutive repetitions of the previous RE character, and \{n,\} means n or more consecutive repetitions! Example: strings with 2 to 3 o's between g and g, i.e. (goog) (gooog)
grep -n 'go\{2,3\}g' regular_express.txt
Note: { and } also have special meaning in the shell, and in basic regular expressions the interval is written with the escape character \ in front of each brace, as \{ and \}. |
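To try the characters in the table without regular_express.txt at hand, here is a minimal sketch against a made-up file. The file contents and the helper lines_matching are hypothetical names invented for this example, and LC_ALL=C is set on the assumption that we want ranges tied to ASCII order.

```shell
# Hypothetical sample file; the contents are made up for illustration.
sample=$(mktemp)
printf '%s\n' '# a comment' 'Wow!' 'eve met e e' 'yes' 'glad' \
    'food' 'foot' 'goog' 'gooog' 'goooog' '42' > "$sample"

# Hypothetical helper: print the numbers of the lines matching pattern $1.
lines_matching() {
    LC_ALL=C grep -n "$1" "$sample" | cut -d: -f1 | paste -s -d ' ' -
}

r_start=$(lines_matching '^#')              # lines beginning with #
r_end=$(lines_matching '!$')                # lines ending with !
r_dot=$(lines_matching 'e.e')               # e, any one character, e
r_star=$(lines_matching 'ess*')             # es, ess, esss, ...
r_set=$(lines_matching 'g[ld]')             # gl or gd
r_range=$(lines_matching '[0-9]')           # any digit
r_neg=$(lines_matching 'oo[^t]')            # oo not followed by t
r_interval=$(lines_matching 'go\{2,3\}g')   # g, 2 to 3 o's, g

printf '%s\n' "^#: $r_start" "!\$: $r_end" "e.e: $r_dot" "ess*: $r_star" \
    "g[ld]: $r_set" "[0-9]: $r_range" "oo[^t]: $r_neg" \
    "go\\{2,3\\}g: $r_interval"

rm -f "$sample"
```

In this sample, foot is skipped by 'oo[^t]' and goooog (four o's) is skipped by 'go\{2,3\}g', which are the two cases most easily misread.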
Again, the special characters of regular expressions are not the same as the wildcards typed on the command line. For example, as a wildcard * means "zero to infinitely many characters", but in a regular expression * means "repeat the previous RE character zero or more times". The meanings are different, so do not confuse them!
For example, ls does not support regular expressions. With wildcards, "ls -l *" lists files with any filename, and "ls -l a*" lists files whose names start with a. To find the files whose names start with a by using a regular expression, it must be written as follows (paired with a tool that supports regular expressions):
ls | grep -n '^a.*'
.* represents 0 or more arbitrary characters
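The difference between the two meanings of * can be checked with a small sketch. The directory and the three filenames below are hypothetical, created only for this comparison.

```shell
# Hypothetical scratch directory with three made-up files.
workdir=$(mktemp -d)
touch "$workdir/apple.txt" "$workdir/avocado" "$workdir/banana"

# Wildcard: the shell expands a* to the names starting with a.
glob_result=$(cd "$workdir" && ls a*)

# Regular expression: ^a anchors at the start, .* allows anything after it.
regex_result=$(ls "$workdir" | grep '^a.*')

# Pitfall: as a regular expression, a* means "zero or more a",
# which matches every line, including banana.
star_result=$(ls "$workdir" | grep 'a*')

printf '%s\n' "$glob_result" '--' "$regex_result" '--' "$star_result"

rm -rf "$workdir"
```

The first two results should agree, while the third lists all three files, which is why a wildcard pattern cannot simply be pasted into grep.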
In addition, the ^ symbol means different things inside and outside the character set symbol (the brackets [ ])! Inside [ ] it means "reverse selection"; outside [ ] it anchors the match at the beginning of the line!
For example: grep -n '^[^a-zA-Z]' matches the lines whose first character is not an English letter
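A short sketch of the two roles of ^, using a made-up four-line file. The sample lines and variable names are assumptions for illustration, and LC_ALL=C is used so that a-z and A-Z follow ASCII order.

```shell
# Hypothetical sample lines.
sample=$(mktemp)
printf '%s\n' 'apple' '1 pear' 'Zebra' '#tag' > "$sample"

# Outside [ ]: ^ anchors at the start of the line.
starts_with_letter=$(LC_ALL=C grep '^[a-zA-Z]' "$sample")

# First ^ anchors; the ^ inside [ ] negates the set.
starts_with_other=$(LC_ALL=C grep '^[^a-zA-Z]' "$sample")

printf '%s\n' "$starts_with_letter" '--' "$starts_with_other"

rm -f "$sample"
```

The two patterns split the file cleanly here because none of the sample lines is empty; an empty line has no first character and would match neither pattern.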
In addition, to avoid problems matching English letters and digits caused by differing encodings, there are some special symbols we need to understand! Their meanings are as follows:
Special symbols |
Meaning |
[:alnum:] |
Represents English uppercase and lowercase letters and digits, i.e. 0-9, A-Z, a-z |
[:alpha:] |
Represents any English uppercase or lowercase letter, i.e. A-Z, a-z |
[:blank:] |
Represents both the space key and the [Tab] key |
[:cntrl:] |
Represents the control keys on the keyboard, including CR, LF, Tab, Del, and so on |
[:digit:] |
Represents digits, i.e. 0-9 |
[:graph:] |
All other keys except the blank characters (space key and [Tab] key) |
[:lower:] |
Represents lowercase letters, i.e. a-z |
[:print:] |
Represents any character that can be printed |
[:punct:] |
Represents punctuation symbols, i.e. " ' ? ! ; : # $ ... |
[:upper:] |
Represents uppercase letters, i.e. A-Z |
[:space:] |
Any character that produces whitespace, including the space key, [Tab], CR, and so on |
[:xdigit:] |
Represents hexadecimal digits, so it includes the digits 0-9 and the letters A-F, a-f |
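One detail worth showing: with grep these symbols are written inside a character set, so the digit class appears as [[:digit:]]. The sketch below assumes a made-up four-line file; the sample lines and variable names are hypothetical.

```shell
# Hypothetical sample: lowercase, uppercase, digits, and a line with a Tab.
sample=$(mktemp)
printf 'hello\nWORLD\n2024\ntab\there\n' > "$sample"

has_upper=$(grep '[[:upper:]]' "$sample")       # contains an uppercase letter
has_digit=$(grep '[[:digit:]]' "$sample")       # contains a digit
all_lower=$(grep '^[[:lower:]]*$' "$sample")    # only lowercase letters
has_blank=$(grep '[[:blank:]]' "$sample")       # contains a space or Tab
all_alnum=$(grep '^[[:alnum:]]*$' "$sample")    # only letters and digits

printf '%s\n' "$has_upper" "$has_digit" "$all_lower" "$has_blank" "$all_alnum"

rm -f "$sample"
```

Because the classes are defined by the locale rather than by an encoding range, they avoid the [A-Z] ordering problem mentioned earlier.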
"Shell" basic regular notation and grep usage