grep and Regular Expressions
Character class
Searching with a character class: to find either "test" or "tast", notice that the two words share the form t?st and differ only in the second character. The search can then be written as:
# grep -n 't[ae]st' regular_express.txt
8:I can't finish the test.
9:Oh! The soup taste good.
No matter how many characters appear inside [], the bracket expression stands for exactly one character. The pattern above therefore matches only the two strings "tast" and "test".
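The one-character rule can be checked with a minimal sketch. The input lines below are made up for illustration and are not part of regular_express.txt.

```shell
# Hypothetical sample input (an assumption, not the tutorial's file).
sample=$(printf '%s\n' 'a test' 'tast' 'taest' 'toast')

# [ae] stands for exactly one character, so 'taest' (two characters
# between t and st) and 'toast' do not match.
matched=$(printf '%s\n' "$sample" | grep -n 't[ae]st')
printf '%s\n' "$matched"
```

Only the first two lines are printed, each prefixed with its line number by -n.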
Negated character class [^]: to find lines that contain oo where that oo is not preceded by g, use the following:
# grep -n '[^g]oo' regular_express.txt
2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!
Lines 2 and 3 are no surprise, because foo and Foo are both acceptable.
Line 18 does contain the goo of google, but it also contains the too of tools later in the line. The unwanted string (goo) is present, but so is a wanted one (too), and that is enough for the line to match.
Line 19 matches for a similar reason: in goooooogle the character in front of an oo can itself be an o, as in go(ooo)oogle.
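The behaviour described above can be reproduced with a small sketch on made-up input. It also shows that [^g] still requires a character to be present in front of oo.

```shell
# Hypothetical sample input (an assumption, not the tutorial's file).
sample=$(printf '%s\n' 'goo' 'foo' 'tool is good' 'oo')

# 'goo' fails: the only oo is preceded by g.
# 'tool is good' matches through 'too', even though it also contains 'goo'.
# 'oo' fails: [^g] must match a real character, and nothing precedes oo.
matched=$(printf '%s\n' "$sample" | grep -n '[^g]oo')
printf '%s\n' "$matched"
```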
Ranges in a character class: suppose I do not want any lowercase letter in front of oo. I could write [^abcd....z]oo, but that is inconvenient. Because the lowercase letters are consecutive in ASCII, the class can be shortened as follows:
# grep -n '[^a-z]oo' regular_express.txt
3:Football game is not use feet only.
In other words, when the characters in a set are contiguous, such as uppercase letters, lowercase letters, or digits, they can be written as [A-Z], [a-z], [0-9], and so on. And if the string we want may contain both digits and letters? Write the ranges together: [a-zA-Z0-9].
To get the lines that contain digits:
# grep -n '[0-9]' regular_express.txt
5:However, this dress is about $3183 dollars.
15:You are the best is mean you are the no. 1.
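A short sketch of both range searches on made-up input follows. Setting LC_ALL=C is an assumption added here so that a-z covers only the ASCII lowercase letters; in other locales a range may be interpreted differently.

```shell
# Hypothetical sample input (an assumption, not the tutorial's file).
sample=$(printf '%s\n' 'Football' 'google' 'room 101' '3oo')

# oo preceded by something that is not a lowercase letter.
no_lower=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '[^a-z]oo')

# Lines that contain at least one digit.
with_digit=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '[0-9]')

printf '%s\n' "$no_lower" "$with_digit"
```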
Line-start and line-end anchors ^ $
Line-start anchor: what if I want only the lines that have "the" at the very beginning? This is where the anchor characters come in:
# grep -n '^the' regular_express.txt
12:the symbol '*' is represented as start.
Now only line 12 is left, because it is the only line that begins with "the". To list the lines that begin with a lowercase letter instead:
# grep -n '^[a-z]' regular_express.txt
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
12:the symbol '*' is represented as start.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
To list the lines that do not begin with an English letter:
# grep -n '^[^a-zA-Z]' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
21:# I am VBird
Note that ^ means different things inside and outside a bracket expression []. As the first character inside [] it negates the set; outside [] it anchors the match to the beginning of the line.
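The two roles of ^ can be seen side by side in a minimal sketch on made-up input. LC_ALL=C is an assumption added so that the letter ranges cover ASCII letters only.

```shell
# Hypothetical sample input (an assumption, not the tutorial's file).
sample=$(printf '%s\n' 'the start' 'The start' '# comment' 'at the end')

# ^ outside []: "the" must sit at the beginning of the line.
starts_the=$(printf '%s\n' "$sample" | grep -n '^the')

# First ^ anchors to line start; the ^ inside [] negates the set.
non_letter=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '^[^a-zA-Z]')

printf '%s\n' "$starts_the" "$non_letter"
```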
To find the lines that end with a period (.):
# grep -n '\.$' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
20:go! go! Let's go.
Note that because the period has a special meaning of its own (described below), it must be preceded by the escape character (\) to be matched literally.
To find blank lines:
# grep -n '^$' regular_express.txt
22:
Because the pattern is just the line start followed immediately by the line end (^$), it matches only blank lines.
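The end-of-line anchor, the escaped period, and the blank-line pattern can be tried together in a small sketch. The input is made up for illustration.

```shell
# Hypothetical sample input; the third line is empty.
sample=$(printf '%s\n' 'ends with dot.' 'no dot' '' 'dot. in middle')

# \. is a literal period, and $ anchors it to the end of the line.
dot_end=$(printf '%s\n' "$sample" | grep -n '\.$')

# ^$ matches lines with nothing between start and end.
blank=$(printf '%s\n' "$sample" | grep -n '^$')

# Without the backslash, . matches any character, so every
# non-empty line is counted.
unescaped=$(printf '%s\n' "$sample" | grep -c '.$')

printf '%s\n' "$dot_end" "$blank" "$unescaped"
```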
Any single character . and repetition *
The meanings of these two symbols in regular expressions are as follows:
. (period): matches exactly one arbitrary character.
* (asterisk): repeats the preceding character zero or more times, so it is always used in combination with what comes before it.
Suppose I need the strings of the form g??d, that is, four characters that begin with g and end with d. I can do this:
# grep -n 'g..d' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good.
16:The world <Happy> is the same with "glad".
Because exactly two characters are required between g and d, the god on line 13 and the gd on line 14 are not listed.
What if I want the lines that contain oo, ooo, oooo, and so on, that is, at least two consecutive o characters?
Because * means "repeat the preceding RE character zero or more times", o* stands for "an empty string, or one or more o characters". As a result, grep -n 'o*' regular_express.txt prints every line of the file.
For a string of at least two o characters we need ooo*, i.e.:
# grep -n 'ooo*' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!
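The difference between o* and ooo* is easy to verify with a minimal sketch on made-up input.

```shell
# Hypothetical sample input (an assumption, not the tutorial's file).
sample=$(printf '%s\n' 'xyz' 'go' 'good' 'gooood')

# o* can match the empty string, so every line counts as a match.
all=$(printf '%s\n' "$sample" | grep -c 'o*')

# ooo* needs two literal o characters followed by zero or more o.
two_plus=$(printf '%s\n' "$sample" | grep -n 'ooo*')

printf '%s\n' "$all" "$two_plus"
```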
What if the string must begin and end with g, with at least one o between the two g characters, as in gog, goog, gooog, and so on?
# grep -n 'goo*g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!
To find a string that begins with g and ends with g, with any characters (or none) in between:
# grep -n 'g.*g' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
14:The gd software is a library for drafting programs.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
Because the pattern only requires a g at the start and a g at the end, any characters in between are accepted, so lines 1, 14, and 20 match as well. The RE .* for "any run of characters" is used very often.
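A small sketch on made-up input contrasts the two patterns: goo*g allows only o between the two g characters, while g.*g allows anything, including nothing.

```shell
# Hypothetical sample input (an assumption, not the tutorial's file).
sample=$(printf '%s\n' 'gg' 'gog' 'google' 'go! go!' 'dog')

# At least one o, and nothing but o, between the two g characters.
only_o=$(printf '%s\n' "$sample" | grep -n 'goo*g')

# Any run of characters (possibly empty) between the two g characters.
anything=$(printf '%s\n' "$sample" | grep -n 'g.*g')

printf '%s\n' "$only_o" "$anything"
```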
What if I want the lines that contain a number of any length? Since only digits are involved, the pattern becomes:
# grep -n '[0-9][0-9]*' regular_express.txt
5:However, this dress is about $3183 dollars.
15:You are the best is mean you are the no. 1.
Limiting the repeat count of an RE character {}
We can combine . or any other RE character with * to match zero or more repetitions. But what if I want to limit the number of repetitions to a given range?
For example, how do I find a run of two to five consecutive o characters? This is where the interval operator {} comes in. In grep's default (basic) regular expressions the braces must be written as \{ and \} to act as an interval, because an unescaped { is taken literally; with grep -E they are written without the backslashes. To find two consecutive o characters:
# grep -n 'o\{2\}' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!
Now suppose we want a g followed by two to five o characters and then another g:
# grep -n 'go\{2,5\}g' regular_express.txt
18:google is the best tools for search keyword.
And what about two or more o characters, as in goooo....g? Besides gooo*g, it can also be written as:
# grep -n 'go\{2,\}g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!
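The interval forms can be compared in a short sketch on made-up input. The last command shows the same bounded interval written for grep -E, where the braces are not escaped.

```shell
# Hypothetical sample input: 1, 2, 5 and 6 o characters between the g's.
sample=$(printf '%s\n' 'gog' 'goog' 'gooooog' 'goooooog')

# Between two and five o characters.
bounded=$(printf '%s\n' "$sample" | grep -n 'go\{2,5\}g')

# Two or more o characters (no upper bound).
open_ended=$(printf '%s\n' "$sample" | grep -n 'go\{2,\}g')

# Same bounded interval in extended syntax.
ere=$(printf '%s\n' "$sample" | grep -E -n 'go{2,5}g')

printf '%s\n' "$bounded" "$open_ended" "$ere"
```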
Extended grep (grep -E or egrep):
The main benefit of extended grep is its additional set of regular expression metacharacters.
Print all lines that contain NW or EA. With plain grep instead of egrep, this pattern returns no results here, because | is then treated as an ordinary character.
# egrep 'NW|EA' testfile
northwest NW Charles Main 3.0 .98 3 34
eastern EA TB Savage 4.4 . 5 20
With standard grep, putting a backslash in front of an extended metacharacter (for example \|) gives it the same special meaning it has under -E. This backslash form is a GNU grep extension.
# grep 'NW\|EA' testfile
northwest NW Charles Main 3.0 .98 3 34
eastern EA TB Savage 4.4 . 5 20
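The three ways of writing the alternation can be compared in a minimal sketch. The input is made up, and the \| form assumes GNU grep, since it is an extension to basic regular expressions.

```shell
# Hypothetical sample input; the last line contains a literal "NW|EA".
sample=$(printf '%s\n' 'northwest NW' 'eastern EA' 'central CT' 'literal NW|EA')

# Extended syntax: | means "or".
ere_count=$(printf '%s\n' "$sample" | grep -E -c 'NW|EA')

# Basic syntax: | is an ordinary character, so only the literal text matches.
bre_count=$(printf '%s\n' "$sample" | grep -c 'NW|EA')

# Basic syntax with the backslash form (assumes GNU grep).
gnu_count=$(printf '%s\n' "$sample" | grep -c 'NW\|EA')

printf '%s\n' "$ere_count" "$bre_count" "$gnu_count"
```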
Search for all lines that contain one or more 3s.
# egrep '3+' testfile
# grep -E '3+' testfile
# grep '3\+' testfile    # these three commands return the same result
northwest NW Charles Main 3.0 .98 3 34
western WE Sharon Gray 5.3 . 5 23
northeast NE AM Main Jr. 5.1 .94 3 13
central CT Ann Stephens 5.7 .94 5 13
Search for all lines that contain a 2, followed by zero or one period, followed by a digit.
# egrep '2\.?[0-9]' testfile
# grep -E '2\.?[0-9]' testfile
# grep '2\.\?[0-9]' testfile    # a 2, then zero or one period, then a digit from 0 to 9
western WE Sharon Gray 5.3 . 5 23
southwest SW Lewis Dalsass 2.7 .8 2 18
eastern EA TB Savage 4.4 . 5 20
Search for lines that contain one or more consecutive occurrences of "no".
# egrep '(no)+' testfile
# grep -E '(no)+' testfile
# grep '\(no\)\+' testfile    # the three commands return the same result
northwest NW Charles Main 3.0 .98 3 34
northeast NE AM Main Jr. 5.1 .94 3 13
north NO Margot Weber 4.5 . 5 9
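The + and ? operators and the grouping parentheses can be tried in a small sketch on made-up input. The backslash forms in the last command assume GNU grep.

```shell
# Hypothetical sample input (an assumption, not the tutorial's testfile).
sample=$(printf '%s\n' 'north NO 4.5' 'nono' 'west 23' 'rate 2.7' 'n o')

# One or more repetitions of the group "no" (case-sensitive).
group_ere=$(printf '%s\n' "$sample" | grep -E -n '(no)+')

# The same search in basic syntax with backslash forms.
group_bre=$(printf '%s\n' "$sample" | grep -n '\(no\)\+')

# A 2, an optional literal period, then one digit.
optional_dot=$(printf '%s\n' "$sample" | grep -E -n '2\.?[0-9]')

printf '%s\n' "$group_ere" "$optional_dot"
```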
Searching without regular expressions
fgrep can search faster than grep, but it is less flexible: it finds only fixed text, not regular expressions.
To find the lines that contain an asterisk in a file or in command output:
# fgrep '*' /etc/profile
for i in /etc/profile.d/*.sh ; do
or
# grep -F '*' /etc/profile
for i in /etc/profile.d/*.sh ; do
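The effect of fixed-string matching can be seen in a minimal sketch. The input is made up; the first line only imitates the /etc/profile line shown above.

```shell
# Hypothetical sample input (an assumption, not a real /etc/profile).
sample=$(printf '%s\n' 'for i in /etc/profile.d/*.sh ; do' 'plain line' 'a.b')

# -F treats the pattern as plain text, so * is just an asterisk.
fixed_star=$(printf '%s\n' "$sample" | grep -F -n '*')

# As a regular expression, . matches any character: every line matches.
regex_dot=$(printf '%s\n' "$sample" | grep -c '.')

# With -F, . is a literal period: only lines containing one match.
fixed_dot=$(printf '%s\n' "$sample" | grep -F -c '.')

printf '%s\n' "$fixed_star" "$regex_dot" "$fixed_dot"
```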
grep search command under Linux (ii)