Detailed usage of grep: grep and Regular Expression

Source: Internet
Author: User

 

The first thing to remember is that regular expressions and wildcards are not the same thing!
A regular expression is only a notation. Any tool that supports this notation can process strings described by a regular expression. Vim, grep, awk, and sed all support regular expressions, and much of their power comes from that support. At a previous company, which ran a web service (nginx) that relied heavily on regular expressions, I spent some time studying them. I would like to share the following notes:

1. Basic Regular Expression

The grep tool was introduced previously.
grep [-acinv] 'search string' filename
-a: treat a binary file as text when searching
-c: print the number of matching lines
-i: ignore case
-n: also print the line number
-v: invert the selection, that is, print the lines that do not contain the search string
The search string can be a regular expression!

1 Search for a specific string
Search for lines containing "the" and print the line numbers.
$ grep -n 'the' regular_express.txt
Search for lines that do not contain "the" and print the line numbers.
$ grep -nv 'the' regular_express.txt
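
The options listed above can be tried on a tiny file. The following is a minimal sketch, assuming a throwaway sample file whose three lines are made up for illustration (it is not regular_express.txt); the variable names are likewise only illustrative.

```shell
# Hypothetical sample file; its contents are invented for this sketch.
sample=$(mktemp)
printf '%s\n' 'The quick fox' 'over the fence' 'nothing here' > "$sample"

# -n: prefix each matching line with its line number
with_the=$(grep -n 'the' "$sample")
# -v: select the lines that do NOT contain the search string
without_the=$(grep -vn 'the' "$sample")
# -i: ignore case, so "The" matches as well
any_case=$(grep -in 'the' "$sample")
# -c: print only the number of matching lines
count=$(grep -c 'the' "$sample")

printf '%s\n' "$with_the" '---' "$without_the" '---' "$any_case" '---' "$count"
rm -f "$sample"
```

Without -i, the first line is not selected, because "The" and "the" differ in case.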

2 Use [] to search for character sets
[] matches a single character from the set. For example, [ade] matches a, d, or e.
woody@xiaoc:~/tmp$ grep -n 't[ae]st' regular_express.txt
8:I can't finish the test.
9:Oh! The soup taste good!

Use ^ as the first character inside [] to match any character that is not in the set.
For example, to find lines in which oo is not preceded by g, use '[^g]oo' as the search string.
woody@xiaoc:~/tmp$ grep -n '[^g]oo' regular_express.txt
2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!
Line 18 is still listed because "tools" contains oo preceded by t, and line 19 because in "goooooogle" an oo is preceded by another o.

[] can also contain a range. For example, [a-z] matches lowercase letters, [0-9] matches the digits 0 to 9, and [A-Z] matches uppercase letters. [a-zA-Z0-9] matches all digits and English letters. Of course, you can also use ^ to exclude characters.
Search for lines containing digits
woody@xiaoc:~/tmp$ grep -n '[0-9]' regular_express.txt
5:However, this dress is about $3183 dollars.
15:You are the best is menu you are the no.1.
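
The three forms of [] described above (a set, a negated set, and a range) can be compared side by side. This is a small sketch, assuming a made-up word list rather than regular_express.txt; LC_ALL=C is set as a precaution so that ranges such as [0-9] are interpreted byte by byte.

```shell
export LC_ALL=C
# Hypothetical sample data, one entry per line.
sample=$(mktemp)
printf '%s\n' 'test' 'taste' 'toast' 'google' 'food' 'room 42' > "$sample"

# [ae] matches exactly one character: a or e
set_match=$(grep -n 't[ae]st' "$sample")
# [^g]oo matches "oo" preceded by any character other than g
not_g=$(grep -n '[^g]oo' "$sample")
# [0-9] matches any single digit
digits=$(grep -n '[0-9]' "$sample")

printf '%s\n' "$set_match" '---' "$not_g" '---' "$digits"
rm -f "$sample"
```

Here "google" is not selected by '[^g]oo', because its only oo is preceded by g.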

The beginning and end of a line: ^ and $. ^ marks the beginning of a line, and $ marks the end of a line (each is a position, not a character). '^$' matches empty lines, because such a line has only a beginning and an end.

Here ^ is different from the ^ used inside []. It means that the string after ^ must appear at the beginning of a line.
For example, search for lines starting with "the"
woody@xiaoc:~/tmp$ grep -n '^the' regular_express.txt
12:the symbol '*' is represented as star.

Search for lines starting with a lowercase letter
woody@xiaoc:~/tmp$ grep -n '^[a-z]' regular_express.txt
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
12:the symbol '*' is represented as star.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! Go! Let's go.

Search for lines that do not start with an English letter
woody@xiaoc:~/tmp$ grep -n '^[^a-zA-Z]' regular_express.txt
1:"Open Source" is a good mechanic to develop programs.
21:# I am VBird
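
The two roles of ^ (line anchor outside [], negation inside []) are easy to confuse, so here is a short sketch that uses both. The sample lines are invented for illustration, and LC_ALL=C is set as a precaution so that [a-z] covers only lowercase ASCII letters.

```shell
export LC_ALL=C
# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' 'the start' 'not the start' '# a comment' 'Upper case' > "$sample"

# ^ outside []: the line must begin with "the"
starts_the=$(grep -n '^the' "$sample")
# ^ outside [] followed by a range: the line begins with a lowercase letter
starts_lower=$(grep -n '^[a-z]' "$sample")
# first ^ anchors the line, second ^ negates the set: the line begins with a non-letter
starts_nonletter=$(grep -n '^[^a-zA-Z]' "$sample")

printf '%s\n' "$starts_the" '---' "$starts_lower" '---' "$starts_nonletter"
rm -f "$sample"
```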

$ marks the end of a line. For example, '\.$' matches a line that ends with a period.
Search for lines ending with a period. Because . is a special character in regular expressions, it is escaped with \.
woody@xiaoc:~/tmp$ grep -n '\.$' regular_express.txt
1:"Open Source" is a good mechanic to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $3183 dollars.
6:GNU is free air not free beer.
.....

Note that in text files created on MS Windows, each line break carries an extra ^M character (a carriage return). The last character of every line is then a hidden ^M, so a pattern such as '\.$' will not match.
Pay special attention to this when processing such files!
You can delete the ^M characters with cat dos_file | tr -d '\r' > unix_file, because ^M = \r.
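
The effect of the hidden ^M can be checked directly. The sketch below assumes a Unix-like system, where grep does not strip carriage returns by itself; the file names dos_file and unix_file follow the text, while the two sample lines are made up.

```shell
dos_file=$(mktemp)
unix_file=$(mktemp)
# Two lines with DOS-style line endings (\r\n); only the first ends with a period.
printf 'ends with a period.\r\nno period here\r\n' > "$dos_file"

# The last character before the newline is \r, so '\.$' finds nothing.
# "|| true" keeps the script going, since grep exits with status 1 on no match.
before=$(grep -c '\.$' "$dos_file" || true)

# Delete every carriage return, then search again.
tr -d '\r' < "$dos_file" > "$unix_file"
after=$(grep -c '\.$' "$unix_file" || true)

printf 'matches before: %s, after: %s\n' "$before" "$after"
rm -f "$dos_file" "$unix_file"
```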

'^$' therefore matches only empty lines, where the beginning of the line is immediately followed by the end.
Search for empty lines
woody@xiaoc:~/tmp$ grep -n '^$' regular_express.txt
22:
23:

Search for non-empty lines
woody@xiaoc:~/tmp$ grep -vn '^$' regular_express.txt
1:"Open Source" is a good mechanic to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
..........

Any character . and repeated characters *

In bash, * is a wildcard that stands for any string of zero or more characters. In a regular expression, however, * means that the preceding character is repeated zero or more times, which is a different meaning.
For example, oo* means that the first o must be present and the second o may appear zero, one, or many times, so it matches one or more o characters.

A dot . stands for exactly one arbitrary character, which must be present. The wildcard pattern g??d can be written as 'g..d'. good, gxxd, gabd, ... all match.

woody@xiaoc:~/tmp$ grep -n 'g..d' regular_express.txt
1:"Open Source" is a good mechanic to develop programs.
9:Oh! The soup taste good!
16:The world is the same with 'glad'.

Search for strings with two or more consecutive o characters
woody@xiaoc:~/tmp$ grep -n 'ooo*' regular_express.txt   # The first two o must be present; the third may be absent or repeated.
1:"Open Source" is a good mechanic to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good!
18:google is the best tools for search keyword.
19:goooooogle yes!

Search for strings that begin and end with g and have at least one o in between, that is, gog, goog, gooog, ...
woody@xiaoc:~/tmp$ grep -n 'goo*g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!

Search for lines containing a string that begins and ends with g.
woody@xiaoc:~/tmp$ grep -n 'g.*g' regular_express.txt   # .* matches zero or more arbitrary characters.
1:"Open Source" is a good mechanic to develop programs.
14:The gd software is a library for drafting programs.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! Go! Let's go.
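
To see how . and * differ, it helps to run the three patterns from this part against the same word list. The following sketch uses an invented list with one word per line; paste is used only to join the matching words onto one line for display.

```shell
# Hypothetical word list, one word per line.
sample=$(mktemp)
printf '%s\n' 'gg' 'gog' 'goog' 'gooog' 'good' 'glad' 'g--g' > "$sample"

# g..d: g, exactly two arbitrary characters, then d
dot=$(grep 'g..d' "$sample" | paste -s -d ' ' -)
# goo*g: g, one o, zero or more further o, then g
star=$(grep 'goo*g' "$sample" | paste -s -d ' ' -)
# g.*g: g, zero or more arbitrary characters, then g
dotstar=$(grep 'g.*g' "$sample" | paste -s -d ' ' -)

printf 'g..d  -> %s\n' "$dot"
printf 'goo*g -> %s\n' "$star"
printf 'g.*g  -> %s\n' "$dotstar"
rm -f "$sample"
```

Note that 'goo*g' rejects gg, because the first o is required, while 'g.*g' accepts it, because .* may match nothing at all.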


Limit the number of consecutive repetitions with {}

* can only express zero or more repetitions. To limit the number of repetitions, use {range}. The range is numeric: 2,5 means 2 to 5, 2 means exactly 2, and 2, means 2 or more.
Note that { and } have a special meaning in the shell, and in a basic regular expression they must be written as \{ and \} to act as a repetition count.

Search for lines containing two consecutive o characters.
woody@xiaoc:~/tmp$ grep -n 'o\{2\}' regular_express.txt
1:"Open Source" is a good mechanic to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good!
18:google is the best tools for search keyword.
19:goooooogle yes!
Line 19 is listed as well, because a longer run of o also contains oo.

Search for g followed by 2 to 5 o characters and then another g.
woody@xiaoc:~/tmp$ grep -n 'go\{2,5\}g' regular_express.txt
18:google is the best tools for search keyword.

Search for lines that contain g followed by two or more o characters and then another g.
woody@xiaoc:~/tmp$ grep -n 'go\{2,\}g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!
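
The three range forms (exactly n, n to m, n or more) can be checked against words whose number of o characters is known. This is a minimal sketch with an invented word list; the words contain 1, 2, 5, and 6 o characters respectively.

```shell
# Hypothetical word list: 1, 2, 5 and 6 o characters between the two g.
sample=$(mktemp)
printf '%s\n' 'gog' 'goog' 'gooooog' 'goooooog' > "$sample"

# \{2\}: exactly two o between the g characters
exact=$(grep 'go\{2\}g' "$sample" | paste -s -d ' ' -)
# \{2,5\}: two to five o
bounded=$(grep 'go\{2,5\}g' "$sample" | paste -s -d ' ' -)
# \{2,\}: two or more o
at_least=$(grep 'go\{2,\}g' "$sample" | paste -s -d ' ' -)
# Without the surrounding g, o\{2\} matches inside any longer run of o as well
unanchored=$(grep -c 'o\{2\}' "$sample")

printf '%s\n' "$exact" "$bounded" "$at_least" "$unanchored"
rm -f "$sample"
```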


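Note: inside [], ^ and - lose their special meaning when they are not in a special position (^ is special only as the first character, - only between two characters), so they can be placed after the other characters in the set.
'[^a-z\.!^ -]' matches a character that is not a lowercase letter, ., !, ^, a space, or -. Note that there is a space inside the brackets. (A backslash inside [] is taken literally, so \ is excluded as well.)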

In addition, negation in shell wildcard patterns is written [!range], whereas in regular expressions it is written [^range].
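
The difference between the two negation notations can be sketched with a shell case statement on one side and grep on the other. The function name classify and the test strings are only illustrative.

```shell
export LC_ALL=C
# Shell pattern: [!0-9] means "one character that is not a digit".
# classify is a hypothetical helper written for this sketch.
classify() {
  case $1 in
    [!0-9]*) echo 'non-digit first' ;;
    *)       echo 'digit first or empty' ;;
  esac
}
glob_abc=$(classify 'abc')
glob_1abc=$(classify '1abc')

# Regular expression: the same idea is written [^0-9].
regex_result=$(printf '%s\n' 'abc' '1abc' | grep '^[^0-9]')

printf '%s\n' "$glob_abc" "$glob_1abc" "$regex_result"
```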

2. Extended Regular Expression

Extended regular expressions add several special symbols to the basic regular expressions,
which makes some operations more convenient.
For example, to remove blank lines and lines that begin with #, we could use:
woody@xiaoc:~/tmp$ grep -v '^$' regular_express.txt | grep -v '^#'
"Open Source" is a good mechanic to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
............

However, this is much easier with egrep and the extra special symbols of extended regular expressions.
Note that grep supports only basic regular expressions by default, while egrep supports extended ones. In fact, egrep is equivalent to grep -E, so grep -E supports extended regular expressions as well.
So:
woody@xiaoc:~/tmp$ egrep -v '^$|^#' regular_express.txt
"Open Source" is a good mechanic to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
....................
Here | expresses an "or" relationship: the pattern matches lines that satisfy ^$ or ^#.
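
That the two-step pipeline and the single extended pattern select the same lines can be confirmed on a small file. The sketch below uses grep -E, which the text describes as equivalent to egrep; the four sample lines are invented.

```shell
# Hypothetical sample file with one empty line and one comment line.
sample=$(mktemp)
printf 'first line\n\n# a comment\nlast line\n' > "$sample"

# Basic regular expressions: two passes, one per condition
two_step=$(grep -v '^$' "$sample" | grep -v '^#')
# Extended regular expressions: one pass, conditions joined by |
one_step=$(grep -E -v '^$|^#' "$sample")

printf '%s\n' "$one_step"
rm -f "$sample"
```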

Here are several extended special symbols:
+ is similar to *, but means one or more repetitions of the preceding character.
? is similar to *, but means zero or one occurrence of the preceding character.
| means "or". For example, 'gd|good|dog' matches lines containing gd, good, or dog.
() combines part of a pattern into a group. For example, to search for glad or good, use 'g(la|oo)d'.
The advantage of () is that repetition symbols such as + and ? can then be applied to the whole group.
For example, to search for strings that begin with A, end with C, and have at least one xyz in between, use 'A(xyz)+C'.
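
Each of the extended symbols above can be tried with grep -E. The following is a small sketch over an invented word list, one word per line; paste only joins the matching words for display.

```shell
# Hypothetical word list.
sample=$(mktemp)
printf '%s\n' 'gd' 'god' 'good' 'glad' 'dog' 'AC' 'AxyzC' 'AxyzxyzC' > "$sample"

# +: one or more o between g and d
plus=$(grep -E 'go+d' "$sample" | paste -s -d ' ' -)
# ?: zero or one o between g and d
optional=$(grep -E 'go?d' "$sample" | paste -s -d ' ' -)
# |: any one of the alternatives
either=$(grep -E 'gd|good|dog' "$sample" | paste -s -d ' ' -)
# (): group the alternatives inside a longer pattern
group=$(grep -E 'g(la|oo)d' "$sample" | paste -s -d ' ' -)
# ()+: repeat the whole group one or more times
repeated=$(grep -E 'A(xyz)+C' "$sample" | paste -s -d ' ' -)

printf '%s\n' "$plus" "$optional" "$either" "$group" "$repeated"
rm -f "$sample"
```

AC is not selected by 'A(xyz)+C', because + requires the group to appear at least once.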
From: http://apps.hi.baidu.com/share/detail/34694903
