Linux grep Command in Detail


Last Update:2018-07-27 Source: Internet

Author: User



Brief Introduction

grep (Global search Regular Expression (RE) and Print out the line, i.e. search globally for a regular expression and print the matching lines) is a powerful text search tool that uses regular expressions to search text and prints the lines that match.

The Unix grep family includes grep, egrep, and fgrep. The egrep and fgrep commands differ only slightly from grep. egrep is an extension of grep that supports more regular expression metacharacters. fgrep stands for fixed grep or fast grep: it treats every character in the pattern literally, so regular expression metacharacters lose their special meaning. Linux uses the GNU version of grep, which is more powerful and provides the egrep and fgrep functionality through the -E and -F command-line options (-G selects the default basic regular expressions).
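As a minimal sketch of how the three matching modes differ, the commands below run the same pattern with -F, with -E, and with the default mode. The sample file and its contents are made up for illustration and are not part of the original article.

```shell
# Hypothetical sample data, created only for this illustration.
sample=$(mktemp)
printf 'a+b\naab\nab\n' > "$sample"

# -F: fixed string, every character in the pattern is literal.
fixed=$(grep -F 'a+b' "$sample")
# -E: extended regex, '+' means "one or more of the preceding character".
extended=$(grep -E 'a+b' "$sample")
# Default (same as -G): basic regex, where an unescaped '+' is an ordinary character.
basic=$(grep 'a+b' "$sample")

printf 'fixed:\n%s\nextended:\n%s\nbasic:\n%s\n' "$fixed" "$extended" "$basic"
rm -f "$sample"
```

With this data, -F and the default mode both select only the literal line a+b, while -E selects aab and ab.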

grep common usage

[root@www ~]# grep [-acinv] [--color=auto] 'search string' filename
Options and parameters:
-a: search a binary file as if it were a text file
-c: count the number of lines that contain the 'search string'
-i: ignore case differences, so uppercase and lowercase are treated as the same
-n: also output the line number
-v: invert the selection, i.e. show the lines that do NOT contain the 'search string'
--color=auto: highlight the matched keyword in color
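The options above can be tried on a small throwaway file. This is only a sketch; the file contents are invented for illustration.

```shell
# Hypothetical sample data, created only for this illustration.
sample=$(mktemp)
printf 'Root user\nroot shell\nguest\nroot and root\n' > "$sample"

# -c counts matching lines, not individual occurrences.
count=$(grep -c 'root' "$sample")
# -i makes the match case-insensitive, so "Root" counts too.
count_ic=$(grep -c -i 'root' "$sample")
# -n prefixes each matching line with its line number.
numbered=$(grep -n 'guest' "$sample")
# -v keeps only the lines that do NOT match.
inverted=$(grep -v -i 'root' "$sample")

echo "count=$count count_ic=$count_ic numbered=$numbered inverted=$inverted"
rm -f "$sample"
```

Note that the last sample line contains root twice but is counted once, because -c reports lines.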

Print the lines of /etc/passwd that contain root:

# grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
or
# cat /etc/passwd | grep root
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

Print the lines of /etc/passwd that contain root, together with their line numbers in /etc/passwd:

# grep -n root /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
30:operator:x:11:0:operator:/root:/sbin/nologin

grep can highlight the matched keyword in color with --color=auto, which is a very handy feature. Typing --color=auto every time is tedious, though, so this is a good place for an alias. Add the line alias grep='grep --color=auto' to ~/.bashrc, then run source ~/.bashrc to make it take effect immediately. After that, every grep you run highlights its matches automatically.

Print the lines of /etc/passwd that do NOT contain root:

# grep -v root /etc/passwd
# (prints every line of /etc/passwd except the lines that contain root)

Print the lines of /etc/passwd that contain neither root nor nologin:

# grep -v root /etc/passwd | grep -v nologin
# (prints only the lines that contain neither root nor nologin)
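To see what chained -v filters do without depending on a particular system's /etc/passwd, here is a sketch that uses a made-up passwd-style file. The daemon and alice entries are hypothetical.

```shell
# Hypothetical passwd-style data, created only for this illustration.
passwd_sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'operator:x:11:0:operator:/root:/sbin/nologin' \
  'daemon:x:2:2:daemon:/sbin:/sbin/nologin' \
  'alice:x:1000:1000::/home/alice:/bin/bash' > "$passwd_sample"

# Drop every line that mentions root.
no_root=$(grep -v root "$passwd_sample")
# Drop lines that mention root, then drop lines that mention nologin.
neither=$(grep -v root "$passwd_sample" | grep -v nologin)

printf 'without root:\n%s\nwithout root or nologin:\n%s\n' "$no_root" "$neither"
rm -f "$passwd_sample"
```

Each stage of the pipe only sees what the previous stage let through, so the second grep -v narrows the result further.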

Use dmesg to list the kernel messages, then use grep to find the lines containing eth, highlighting the keyword and showing the line numbers:

[root@www ~]# dmesg | grep -n --color=auto 'eth'
247:eth0: RealTek RTL8139 at 0xee846000, 00:90:cc:a6:34:84, IRQ 10
248:eth0: Identified 8139 chip type 'RTL-8139C'
294:eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1
305:eth0: no IPv6 routers present
# Besides eth being shown in a special color, each line is prefixed with its line number.

Use dmesg to list the kernel messages, then use grep to find the lines containing eth, and also show the two lines before and the three lines after each matching line:

[root@www ~]# dmesg | grep -n -A3 -B2 --color=auto 'eth'
245-PCI: setting IRQ 10 as level-triggered
246-ACPI: PCI Interrupt [A] -> Link [LNKB] ...
247:eth0: RealTek RTL8139 at 0xee846000, 00:90:cc:a6:34:84, IRQ 10
248:eth0: Identified 8139 chip type 'RTL-8139C'
249-input: PC Speaker as /class/input/input2
250-ACPI: PCI Interrupt 0000:00:01.4[B] -> Link [LNKB] ...
251-hdb: ATAPI 48X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache, UDMA
# As shown above, the two lines before line 247 and the three lines after line 248 are also displayed.
# This lets you capture the data around the keyword for analysis.
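The context options can be reproduced on any machine with a small stand-in log. The sketch below uses an invented eight-line file rather than real dmesg output.

```shell
# Hypothetical log data, created only for this illustration.
log=$(mktemp)
printf 'line1\nline2\nline3\neth0 up\nline5\nline6\nline7\nline8\n' > "$log"

# -B2 shows two lines before each match, -A3 shows three lines after it.
# With -n, matching lines use "N:" and context lines use "N-".
context=$(grep -n -B2 -A3 'eth' "$log")

echo "$context"
rm -f "$log"
```

The match is on line 4, so the output runs from line 2 through line 7, and only line 4 carries the colon separator.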

Recursively find files based on their content

# grep 'energywise' *           # search the files in the current directory for lines containing 'energywise'

# grep -r 'energywise' *        # search the files in the current directory and its subdirectories for 'energywise'

# grep -l -r 'energywise' *     # search the current directory and its subdirectories for 'energywise', but show only the names of matching files, not the matching lines

These commands are very handy and are a sharp tool for finding files.
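A small sketch of the recursive search, using a temporary directory tree so that it does not depend on the reader's files. The file names are hypothetical, and the search starts from . instead of * so that the result does not depend on shell globbing.

```shell
# Hypothetical directory tree, created only for this illustration.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'energywise on\n' > "$dir/top.conf"
printf 'nothing here\n' > "$dir/other.conf"
printf 'energywise off\n' > "$dir/sub/deep.conf"

# -r descends into subdirectories; -l prints only the names of matching files.
# The output is sorted so the order does not depend on directory traversal.
matches=$(cd "$dir" && grep -r -l 'energywise' . | LC_ALL=C sort)

echo "$matches"
rm -rf "$dir"
```

Only the two files that contain the keyword are listed; other.conf is left out.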

grep and Regular Expressions

Character classes

Searching with a character class: if I want to search for the two words test or taste, I can see that they share the form 't?st', so I can search like this:

[root@www ~]# grep -n 't[ae]st' regular_express.txt
8:I can't finish the test.
9:Oh! The soup taste good.

No matter how many characters appear inside [], the bracket expression stands for exactly one character, so the example above matches either of the two strings "tast" or "test".

Negated character class [^]: to search for lines that contain oo, but not oo preceded by g, do the following:

[root@www ~]# grep -n '[^g]oo' regular_express.txt
2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!

Lines 2 and 3 raise no questions, because both foo and Foo are acceptable.

Line 18, however, clearly contains the goo of google. Do not forget that the same line also contains the too of tools, so it is listed as well. In other words, line 18 contains something we do not want (goo), but because it also contains something we do want (too), it still satisfies the search.

Line 19 is listed for a similar reason: the oo in goooooogle may be preceded by an o rather than a g, for example go(ooo)oogle, so this line also qualifies.

Character ranges: now suppose I do not want a lowercase letter before oo. I could write [^abcd....z]oo, but that is inconvenient. Because lowercase letters are consecutive in ASCII order, we can simplify it as follows:

[root@www ~]# grep -n '[^a-z]oo' regular_express.txt
3:Football game is not use feet only.

That is, when the characters in a set are contiguous, such as uppercase letters, lowercase letters, or digits, they can be written as [a-z], [A-Z], or [0-9]. If the string we want may contain both digits and English letters, write them all together: [a-zA-Z0-9].

To get the lines that contain a digit, do this:

[root@www ~]# grep -n '[0-9]' regular_express.txt
5:However, this dress is about $3183 dollars.
15:You are the best is mean you are the No. 1.
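The bracket expressions discussed so far can be checked against a tiny file. This is a sketch with invented sample lines; LC_ALL=C is set as an assumption so that ranges such as [a-z] follow plain ASCII order regardless of the system locale.

```shell
# Hypothetical sample data, created only for this illustration.
sample=$(mktemp)
printf 'Football\nfood\ngoogle tools\ngoo\nprice 42\n' > "$sample"

# oo preceded by any character other than g.
not_g=$(LC_ALL=C grep -n '[^g]oo' "$sample")
# oo preceded by any character that is not a lowercase letter.
not_lower=$(LC_ALL=C grep -n '[^a-z]oo' "$sample")
# Any line containing at least one digit.
digits=$(LC_ALL=C grep -n '[0-9]' "$sample")

printf '[^g]oo:\n%s\n[^a-z]oo:\n%s\n[0-9]:\n%s\n' "$not_g" "$not_lower" "$digits"
rm -f "$sample"
```

The line "google tools" is selected by '[^g]oo' because of too, while the bare line "goo" is not, which mirrors the discussion of line 18 above.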

Line start and line end characters: ^ and $
Line start: if I want to list only the lines where "the" appears at the very beginning of the line, I have to use an anchor character. We can do this:

[root@www ~]# grep -n '^the' regular_express.txt
12:the symbol '*' is represented as start.

Now only line 12 is left, because it is the only line that begins with "the". If I want to list the lines that begin with a lowercase letter, I can do this:

[root@www ~]# grep -n '^[a-z]' regular_express.txt
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
12:the symbol '*' is represented as start.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.

If I want the lines that do not start with an English letter, I can do this:

[root@www ~]# grep -n '^[^a-zA-Z]' regular_express.txt
1:'Open Source' is a good mechanism to develop programs.
21:# I am VBird

The ^ symbol means different things inside and outside a bracket expression []. Inside [] it means "negated selection"; outside [] it anchors the match at the beginning of a line.

To find the lines that end with a period (.):

[root@www ~]# grep -n '\.$' regular_express.txt
1:'Open Source' is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
15:You are the best is mean you are the No. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
20:go! go! Let's go.

Note in particular that because the period has a special meaning in regular expressions (described below), you must use the escape character (\) to remove that special meaning.

Find blank lines:

[root@www ~]# grep -n '^$' regular_express.txt
22:

Because the pattern contains only the line start and the line end (^$), with nothing in between, it finds the blank lines.
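The anchors can be tried together on a short file. The sketch below uses invented lines; as before, LC_ALL=C is an assumption made so that the [a-zA-Z] range follows ASCII order.

```shell
# Hypothetical sample data, created only for this illustration (line 3 is blank).
sample=$(mktemp)
printf 'the start\nThe end.\n\n# note\nno period\n' > "$sample"

# ^ outside brackets anchors the match to the start of the line.
starts_the=$(grep -n '^the' "$sample")
# \. is a literal period, and $ anchors it to the end of the line.
ends_dot=$(grep -n '\.$' "$sample")
# Start immediately followed by end: a blank line.
blank=$(grep -n '^$' "$sample")
# ^ inside brackets negates the set: lines whose first character is not a letter.
non_letter=$(LC_ALL=C grep -n '^[^a-zA-Z]' "$sample")

echo "starts_the=$starts_the ends_dot=$ends_dot blank=$blank non_letter=$non_letter"
rm -f "$sample"
```

The blank line is not selected by '^[^a-zA-Z]', because a bracket expression always has to match one actual character.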

Any single character (.) and repetition (*)
The meanings of these two symbols in regular expressions are as follows:

. (period): means "there must be exactly one arbitrary character";
* (asterisk): means "repeat the preceding character zero or more times"; it is always used in combination with the character before it.

Suppose I need to find strings of the form g??d, that is, four characters in total, starting with g and ending with d. I can do this:

[root@www ~]# grep -n 'g..d' regular_express.txt
1:'Open Source' is a good mechanism to develop programs.
9:Oh! The soup taste good.
16:The world <Happy> is the same with "glad".

Because there must be exactly two characters between g and d, the god on line 13 and the gd on line 14 are not listed.
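A quick way to confirm that each period consumes exactly one character is to run the pattern over a few candidate words. The word list below is made up for illustration.

```shell
# Hypothetical sample data, created only for this illustration.
sample=$(mktemp)
printf 'good\nglad\ngod\ngd\ngooood\n' > "$sample"

# g, then exactly two arbitrary characters, then d.
four_chars=$(grep -n 'g..d' "$sample")

echo "$four_chars"
rm -f "$sample"
```

Only good and glad are selected; god and gd are too short, and gooood has four characters between g and d.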

What if I want to list the lines that contain oo, ooo, oooo, and so on, that is, at least two consecutive o characters?

Because * means "repeat the preceding RE character zero or more times", "o*" means "an empty string, or one or more o characters". An empty string matches everywhere, so "grep -n 'o*' regular_express.txt" prints every line of the file on the terminal.

When we need a string with at least two o characters, we need ooo*, that is:

[root@www ~]# grep -n 'ooo*' regular_express.txt
1:'Open Source' is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!

What if I want strings that start and end with g, with at least one o between the two g characters, i.e. gog, goog, gooog, and so on?
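The article breaks off before answering this question. Following the same reasoning as ooo* above, one pattern that fits the description is 'goo*g' (a g, one required o, zero or more further o characters, then a g). The sketch below checks it, along with 'o*' and 'ooo*', against an invented word list; it is an illustration of that reasoning, not the original author's answer.

```shell
# Hypothetical sample data, created only for this illustration.
sample=$(mktemp)
printf 'gg\ngog\ngoog\ngooog\ngod\n' > "$sample"

# 'o*' also matches the empty string, so every line is selected.
any_o_count=$(grep -c 'o*' "$sample")
# Two required o characters, then zero or more further ones.
two_or_more=$(grep -n 'ooo*' "$sample")
# g, at least one o, then g.
g_o_g=$(grep -n 'goo*g' "$sample")

printf 'o* lines: %s\nooo*:\n%s\ngoo*g:\n%s\n' "$any_o_count" "$two_or_more" "$g_o_g"
rm -f "$sample"
```

Note that gg is rejected by 'goo*g' because the first o is mandatory, whereas the pattern 'go*g' would accept it.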

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
