Linux Command--grep | Regular expressions

Source: Internet
Author: User

I found this write-up very detailed; after reading it, grep and regular expressions made sense right away.


Brief introduction

grep (Global search Regular Expression and Print out the line) is a powerful text search tool that uses regular expressions to search text and prints the lines that match.

The Unix grep family consists of grep, egrep, and fgrep. egrep and fgrep differ only slightly from grep. egrep is extended grep and supports more regular expression (RE) metacharacters. fgrep is fixed grep or fast grep: it treats every character in the pattern literally, so metacharacters lose their special meaning and stand for themselves. Linux uses the GNU version of grep, which is more powerful and offers the behavior of grep, egrep, and fgrep through the -G, -E, and -F command-line options.
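As a small sketch of how the three modes differ, the snippet below runs the same pattern, a|b, through basic, extended, and fixed-string matching. The three sample lines are made up for illustration and are not from the original article.

```shell
# Hypothetical three-line sample (an assumption for illustration).
sample=$(printf 'a\nb\na|b\n')

# Basic RE (the default, same as -G): | is an ordinary character.
basic=$(printf '%s\n' "$sample" | grep -c 'a|b')

# Extended RE (-E, the egrep behavior): | means alternation.
extended=$(printf '%s\n' "$sample" | grep -E -c 'a|b')

# Fixed strings (-F, the fgrep behavior): the pattern is taken literally.
fixed=$(printf '%s\n' "$sample" | grep -F -c 'a|b')

echo "basic=$basic extended=$extended fixed=$fixed"
# prints: basic=1 extended=3 fixed=1
```

Only the extended mode treats | as "or"; the other two look for the literal three-character string.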

grep common usage
# grep [-acinv] [--color=auto] 'search string' filename
Options and parameters:
-a: treat a binary file as text when searching
-c: count the lines that contain 'search string'
-i: ignore case, so uppercase and lowercase are treated as the same
-n: also print the line number
-v: invert the selection, that is, print the lines that do NOT contain 'search string'
--color=auto: highlight the matched keyword in color
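The options above can be tried on a throwaway file. This is only a sketch: the file contents are invented for the example, and the file lives in a temporary location created with mktemp.

```shell
# Hypothetical sample file (contents are an assumption for illustration).
f=$(mktemp)
printf 'Root user\nroot shell\nguest\n' > "$f"

count=$(grep -c 'root' "$f")        # lines containing "root", case-sensitive
count_i=$(grep -i -c 'root' "$f")   # same count, ignoring case
numbered=$(grep -n 'root' "$f")     # matching lines prefixed with line numbers
inverted=$(grep -v -i 'root' "$f")  # lines that do NOT match

echo "$count $count_i"   # prints: 1 2
echo "$numbered"         # prints: 2:root shell
echo "$inverted"         # prints: guest
rm -f "$f"
```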

Extract the lines of /etc/passwd in which root appears:

# grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
or
# cat /etc/passwd | grep root
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

Extract the lines of /etc/passwd that contain root, and show their line numbers within /etc/passwd:

# grep -n root /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
30:operator:x:11:0:operator:/root:/sbin/nologin

grep can highlight the matched keyword in color with --color=auto. This is very handy, but typing --color=auto on every invocation is tedious, so an alias helps. Add this line to ~/.bashrc: alias grep='grep --color=auto', then run source ~/.bashrc for it to take effect immediately. From then on, every grep run highlights matches automatically.

Print the lines of /etc/passwd in which root does not appear:

# grep -v root /etc/passwd

Print the lines of /etc/passwd that contain neither root nor nologin:

# grep -v root /etc/passwd | grep -v nologin
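A minimal sketch of the chained -v filter, using a made-up passwd-style sample instead of the real /etc/passwd (the alice account is hypothetical). It also shows that one grep with two -e patterns gives the same result as the pipeline.

```shell
# Hypothetical passwd-style sample (an assumption for illustration).
sample=$(printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:2:2:daemon:/sbin:/sbin/nologin' \
  'alice:x:1000:1000::/home/alice:/bin/bash')

# Two chained inverted searches: drop root lines, then drop nologin lines.
chained=$(printf '%s\n' "$sample" | grep -v 'root' | grep -v 'nologin')

# One inverted search with two patterns: drop lines matching either one.
single=$(printf '%s\n' "$sample" | grep -v -e 'root' -e 'nologin')

echo "$chained"   # prints: alice:x:1000:1000::/home/alice:/bin/bash
```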

Use dmesg to list kernel messages, then use grep to find the lines containing eth, highlight the keyword in color, and add line numbers:

# dmesg | grep -n --color=auto 'eth'
247:eth0: RealTek RTL8139 at 0xee846000, 00:90:cc:a6:34:84, IRQ 10
248:eth0: Identified 8139 chip type 'RTL-8139C'
294:eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1
305:eth0: no IPv6 routers present
# Besides eth being shown in a special color, each line starts with its line number.


Use dmesg to list kernel messages, then use grep to find the lines containing eth, and also show the two lines before and the three lines after each matching line:

# dmesg | grep -n -A3 -B2 --color=auto 'eth'
245-PCI: setting IRQ 10 as level-triggered
246-ACPI: PCI Interrupt 0000:00:0e.0[A] -> Link [LNKB] ...
247:eth0: RealTek RTL8139 at 0xee846000, 00:90:cc:a6:34:84, IRQ 10
248:eth0: Identified 8139 chip type 'RTL-8139C'
249-input: PC Speaker as /class/input/input2
250-ACPI: PCI Interrupt 0000:00:01.4[B] -> Link [LNKB] ...
251-hdb: ATAPI 48X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache, UDMA(66)
# As shown above, the two lines before keyword line 247 and the three lines after line 248 are also displayed.
# This lets you analyze the data surrounding the keyword.
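The context options can be sketched without dmesg. The eight-line input below is invented for the example; with -n, matching lines use a colon after the line number and context lines use a hyphen.

```shell
# Hypothetical eight-line log (an assumption for illustration).
log=$(printf 'line1\nline2\nline3\neth0: up\nline5\nline6\nline7\nline8\n')

# -B2: two lines before each match; -A3: three lines after each match.
ctx=$(printf '%s\n' "$log" | grep -n -B2 -A3 'eth')
echo "$ctx"
# prints:
# 2-line2
# 3-line3
# 4:eth0: up
# 5-line5
# 6-line6
# 7-line7
```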

Searching a directory recursively by file content

# grep 'energywise' *           # search the files in the current directory for lines containing 'energywise'
# grep -r 'energywise' *        # search the current directory and its subdirectories for lines containing 'energywise'
# grep -l -r 'energywise' *     # search the current directory and its subdirectories, printing only the names of matching files, not the matching lines

These commands are very useful and are a great tool for finding files.
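The recursive search can be tried safely in a scratch directory. This is a sketch only: the directory layout and file names are made up, and it searches . rather than * so that the shell does not expand the file list.

```shell
# Hypothetical directory tree created under a temporary directory.
d=$(mktemp -d)
mkdir -p "$d/sub"
printf 'energywise on\n'  > "$d/top.txt"
printf 'nothing here\n'   > "$d/other.txt"
printf 'energywise off\n' > "$d/sub/deep.txt"

# -r descends into subdirectories; -l prints only the names of matching files.
files=$(cd "$d" && grep -r -l 'energywise' . | sort)
echo "$files"
# prints:
# ./sub/deep.txt
# ./top.txt
rm -rf "$d"
```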

grep and regular Expressions

Character class

Searching with a character class: if I want to find the two words test and taste, I notice that they share the form 't?st'. I can then search like this:

# grep -n 't[ae]st' regular_express.txt
8:I can't finish the test.
9:Oh! The soup taste good.

No matter how many characters are listed inside [], the brackets stand for exactly one character, so the example above matches only the strings "tast" or "test".
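A quick sketch of the one-character rule, using a made-up word list rather than regular_express.txt:

```shell
# Hypothetical word list (an assumption for illustration).
words=$(printf 'test\ntaste\ntoast\ntst\n')

# [ae] stands for exactly one character, either a or e.
matched=$(printf '%s\n' "$words" | grep 't[ae]st')
echo "$matched"
# prints:
# test
# taste
```

toast is rejected because its t is followed by o, and tst is rejected because the bracket must consume one character.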

Negated character class [^]: to find the lines that contain oo where the oo is not preceded by g, do the following:

# grep -n '[^g]oo' regular_express.txt
2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!

Lines 2 and 3 are no surprise, since both foo and Foo are acceptable.

Line 18 contains the goo of google, but the same line also contains the too of tools, so it is listed. In other words, line 18 has something we do not want (goo) but also something we do want (too), so it still satisfies the search.

Line 19 matches for a similar reason: in goooooogle an oo can be preceded by another o, for example go(ooo)oogle, so this line also qualifies.
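To see which part of a line actually satisfied '[^g]oo', the -o option (print only the matched text) can help. A small sketch with invented one-line inputs:

```shell
# The match comes from "tools", not from "google".
hit=$(printf 'google is the best tools\n' | grep -o '[^g]oo')
echo "$hit"      # prints: too

# A line whose only oo is preceded by g does not match at all.
misses=$(printf 'goo\n' | grep -c '[^g]oo')
echo "$misses"   # prints: 0
```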

Ranges in a character class: suppose I do not want a lowercase letter in front of oo. I could write [^abcd....z]oo, but that is inconvenient. Because the lowercase letters are consecutive in ASCII encoding order, it can be simplified to:

# grep -n '[^a-z]oo' regular_express.txt
3:Football game is not use feet only.

That is, when the characters in a set are consecutive, such as uppercase letters, lowercase letters, or digits, you can write [a-z], [A-Z], [0-9], and so on. And if the string we want may contain both digits and letters? Just write the ranges together: [a-zA-Z0-9].

To get the lines that contain digits:

# grep -n '[0-9]' regular_express.txt
5:However, this dress is about $3183 dollars.
15:You are the best is mean you are the no. 1.
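A minimal sketch of ranges on made-up input. The meaning of a range such as [a-z] may depend on the locale, so this example pins LC_ALL=C to get plain ASCII ordering; that setting is an addition for the sketch, not something the original commands use.

```shell
# Hypothetical sample lines (an assumption for illustration).
sample=$(printf 'Football\nfood\n2oo\n')

# oo preceded by anything except a lowercase ASCII letter.
no_lower=$(printf '%s\n' "$sample" | LC_ALL=C grep '[^a-z]oo')
echo "$no_lower"
# prints:
# Football
# 2oo

# Lines containing at least one digit.
with_digit=$(printf '%s\n' "$sample" | LC_ALL=C grep '[0-9]')
echo "$with_digit"   # prints: 2oo
```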

Line-start and line-end anchors ^ $
Line-start anchor: what if I want only the lines where "the" appears at the beginning of the line? This is where anchors come in:

# grep -n '^the' regular_express.txt
12:the symbol '*' is represented as start.

Now only line 12 remains, because only line 12 starts with the. And what if I want the lines that start with a lowercase letter?

# grep -n '^[a-z]' regular_express.txt
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
12:the symbol '*' is represented as start.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.

If I want the lines that do not start with an English letter:

# grep -n '^[^a-zA-Z]' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
21:# I am VBird

The ^ symbol means different things inside and outside the character class brackets []. Inside [] it means negation; outside [] it anchors the match to the beginning of the line.
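The two roles of ^ can be seen side by side in a short sketch. The four sample lines are invented for the example, and LC_ALL=C is an added assumption to keep the letter ranges ASCII-only.

```shell
# Hypothetical sample lines (an assumption for illustration).
sample=$(printf 'the start\nThe start\n# comment\nin the middle\n')

# ^ outside []: "the" must sit at the beginning of the line.
anchored=$(printf '%s\n' "$sample" | grep -c '^the')
echo "$anchored"     # prints: 1

# First ^ anchors to line start; the ^ inside [] negates the class.
non_letter=$(printf '%s\n' "$sample" | LC_ALL=C grep '^[^a-zA-Z]')
echo "$non_letter"   # prints: # comment
```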

To find the lines that end with a period (.):

# grep -n '\.$' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
20:go! go! Let's go.

Note that because the period has another meaning (described below), you must use the escape character (\) to remove its special meaning.

Find the blank lines:

# grep -n '^$' regular_express.txt
22:

Because such a line has only a beginning and an end (^$) with nothing in between, this pattern finds blank lines.
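A short sketch of the end-of-line anchor on invented input, including what happens when the period is left unescaped:

```shell
# Hypothetical four-line sample; the third line is empty.
sample=$(printf 'ends with dot.\nno dot\n\nlast.\n')

dot_lines=$(printf '%s\n' "$sample" | grep -n '\.$')   # literal period at line end
blank=$(printf '%s\n' "$sample" | grep -n '^$')        # empty lines
any_char=$(printf '%s\n' "$sample" | grep -c '.$')     # unescaped: any last character

echo "$dot_lines"
# prints:
# 1:ends with dot.
# 4:last.
echo "$blank"      # prints: 3:
echo "$any_char"   # prints: 3
```

Without the backslash, '.$' matches every non-empty line, which is why the escape matters.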

Any single character . and the repetition character *
The meanings of these two symbols in regular expressions are as follows:

. (period): stands for exactly one arbitrary character;
* (asterisk): repeats the preceding character zero or more times, so it is always used in combination with the character before it.

Suppose I need to find strings of the form g??d, that is, four characters in total, starting with g and ending with d:

# grep -n 'g..d' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good.
16:The world <Happy> is the same with "glad".

Because there must be exactly two characters between g and d, the god on line 13 and the gd on line 14 are not listed.

What if I want to list the lines that contain oo, ooo, oooo, and so on, that is, at least two consecutive o characters?

Because * means "repeat the preceding RE character zero or more times", 'o*' means "an empty string, or one or more o characters", so grep -n 'o*' regular_express.txt prints every line of the file.

For a string of at least two o characters we need ooo*, that is:

# grep -n 'ooo*' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!
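The difference between 'o*' and 'ooo*' can be checked on a few invented words:

```shell
# Hypothetical word list (an assumption for illustration).
sample=$(printf 'gd\ngod\ngood\ngoood\n')

# o* can match the empty string, so every line matches, even "gd".
star_only=$(printf '%s\n' "$sample" | grep -c 'o*')
echo "$star_only"   # prints: 4

# ooo* requires two o characters, followed by zero or more further o.
two_plus=$(printf '%s\n' "$sample" | grep 'ooo*')
echo "$two_plus"
# prints:
# good
# goood
```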

What if I want a string that starts and ends with g, with at least one o between the two g characters, such as gog, goog, gooog, and so on?

# grep -n 'goo*g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!

To find the lines that contain a string starting with g and ending with g, with any characters (or none) in between:

# grep -n 'g.*g' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
14:The gd software is a library for drafting programs.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.

Since the pattern starts with g, ends with g, and accepts any characters in between, lines 1, 14, and 20 are also acceptable. The RE .* meaning "any characters" is very common.
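A small sketch of how far 'g.*g' reaches, on invented input. The -o option prints only the matched text, which shows that the match runs from the first g to the last g on the line.

```shell
# Hypothetical sample lines (an assumption for illustration).
sample=$(printf 'gg\ngog\ng only\ngo! go! Let us go.\n')

# .* may be empty, so "gg" matches; "g only" has no second g and does not.
count=$(printf '%s\n' "$sample" | grep -c 'g.*g')
echo "$count"   # prints: 3

# The match is the longest one available on the line.
span=$(printf 'go! go! Let us go.\n' | grep -o 'g.*g')
echo "$span"    # prints: go! go! Let us g
```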

What if I want the lines that contain a number of any length? Since only digits are involved, it becomes:

# grep -n '[0-9][0-9]*' regular_express.txt
5:However, this dress is about $3183 dollars.
15:You are the best is mean you are the no. 1.

Limiting the repetition range {}
We can combine . and other RE characters with * to allow zero to infinitely many repetitions. But what if I want to limit the number of repetitions to a range?

For example, how do I find a string of two to five consecutive o characters? This is where the range qualifier {} comes in. Because { and } are ordinary characters in basic regular expressions, they must be escaped with \ to act as the qualifier. Suppose I want to find strings with two consecutive o characters:

# grep -n 'o\{2\}' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!

Suppose we want g followed by 2 to 5 o characters and then another g:

# grep -n 'go\{2,5\}g' regular_express.txt
18:google is the best tools for search keyword.

What if I want goooo....g with 2 or more o characters? Besides gooo*g, it can also be written as:

# grep -n 'go\{2,\}g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!
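The interval forms can be compared on a few invented words with one, two, five, and six o characters. The sketch also shows the unescaped brace form accepted by grep -E.

```shell
# Hypothetical words with 1, 2, 5, and 6 o characters between the g's.
sample=$(printf 'gog\ngoog\ngooooog\ngoooooog\n')

bounded=$(printf '%s\n' "$sample" | grep 'go\{2,5\}g')        # 2 to 5 o
open_ended=$(printf '%s\n' "$sample" | grep -c 'go\{2,\}g')   # 2 or more o
ere=$(printf '%s\n' "$sample" | grep -E 'go{2,5}g')           # same range, extended syntax

echo "$bounded"
# prints:
# goog
# gooooog
echo "$open_ended"   # prints: 3
```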

Extended grep (grep -E or egrep):
The main benefit of extended grep is its additional set of regular expression metacharacters.

Print all lines that contain NW or EA. With plain grep instead of egrep, the unescaped | is an ordinary character and no result is found.

# egrep 'NW|EA' testfile
northwest       NW      Charles Main        3.0     .98     3       34
eastern         EA      TB Savage           4.4     .       5       20

In standard (GNU) grep, an extended metacharacter gains its special meaning when preceded by \, which has the same effect as enabling the extended option -E.

# grep 'NW\|EA' testfile
northwest       NW      Charles Main        3.0     .98     3       34
eastern         EA      TB Savage           4.4     .       5       20

Search for all lines that contain one or more 3s.

# egrep '3+' testfile
# grep -E '3+' testfile
# grep '3\+' testfile        # these 3 commands return the same result
northwest       NW      Charles Main          3.0     .98     3       34
western         WE      Sharon Gray           5.3     .       5       23
northeast       NE      AM Main Jr.           5.1     .94     3       13
central         CT      Ann Stephens          5.7     .94     5       13

Search for all lines that contain a 2, followed by zero or one period, followed by a digit.

# egrep '2\.?[0-9]' testfile
# grep -E '2\.?[0-9]' testfile
# grep '2\.\?[0-9]' testfile
# The pattern is a 2, followed by zero or one period, followed by a digit between 0 and 9.
western         WE      Sharon Gray           5.3     .       5       23
southwest       SW      Lewis Dalsass         2.7     .8      2       18
eastern         EA      TB Savage             4.4     .       5       20
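A minimal sketch of the optional period on invented input. The \? form in basic syntax is assumed here to be the GNU grep extension; other grep implementations may not accept it.

```shell
# Hypothetical sample values (an assumption for illustration).
sample=$(printf '2.7\n23\n2x\n5.3\n')

# Extended syntax: ? makes the escaped period optional.
ere=$(printf '%s\n' "$sample" | grep -E '2\.?[0-9]')

# Basic syntax with the GNU \? extension.
bre=$(printf '%s\n' "$sample" | grep '2\.\?[0-9]')

echo "$ere"
# prints:
# 2.7
# 23
```

2x is rejected because a digit must follow the 2 or the period, and 5.3 contains no 2.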

Search for lines that contain one or more consecutive occurrences of no.

# egrep '(no)+' testfile
# grep -E '(no)+' testfile
# grep '\(no\)\+' testfile   # the 3 commands return the same result
northwest       NW      Charles Main        3.0     .98     3       34
northeast       NE      AM Main Jr.         5.1     .94     3       13
north           NO      Margot Weber        4.5     .       5       9
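Grouping can be sketched on a few invented words. As with \?, the \+ form in basic syntax is assumed to be the GNU grep extension.

```shell
# Hypothetical word list (an assumption for illustration).
sample=$(printf 'north\nnono\nsouth\n')

ere=$(printf '%s\n' "$sample" | grep -E '(no)+')    # extended syntax
bre=$(printf '%s\n' "$sample" | grep '\(no\)\+')    # basic syntax, GNU \+ extension

echo "$ere"
# prints:
# north
# nono

# With -o, the whole run of repeated "no" is reported as one match.
run=$(printf 'nonono\n' | grep -E -o '(no)+')
echo "$run"   # prints: nonono
```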

Searching without regular expressions

fgrep searches faster than grep but is less flexible: it can only find fixed text, not regular expressions.

To find the lines that contain an asterisk character in a file or in command output:

# fgrep '*' /etc/profile
for i in /etc/profile.d/*.sh ; do
or
# grep -F '*' /etc/profile
for i in /etc/profile.d/*.sh ; do
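The same search can be sketched without reading /etc/profile. The two sample lines are invented; the snippet compares the fixed-string search with an escaped asterisk in a regular expression.

```shell
# Hypothetical two-line sample (an assumption for illustration).
sample=$(printf 'for i in /etc/profile.d/*.sh ; do\necho done\n')

# -F: the pattern is a literal string, so * needs no escaping.
literal=$(printf '%s\n' "$sample" | grep -F '*')

# Regular expression: \* removes the special meaning of the asterisk.
escaped=$(printf '%s\n' "$sample" | grep '\*')

echo "$literal"   # prints: for i in /etc/profile.d/*.sh ; do
```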
