[Reprint] Powerful grep usage explained: grep and regular expressions

Source: Internet
Author: User
Tags egrep

The first thing to remember is that regular expressions are not the same as shell wildcards; the two mean different things!
A regular expression is just a notation, and any tool that supports the notation can process regular expression strings. Vim, grep, awk, and sed all support regular expressions, and that support is what makes them so powerful. At my previous company, which ran a web-based service site (Nginx), regular expressions were in heavy demand, so I spent some time studying them and am sharing my notes here:

1 Basic Regular Expressions

The grep tool was introduced previously.
grep -[acinv] 'search string' filename
-a treat a binary file as text when searching
-c count the number of lines that match
-i ignore case
-n output line numbers
-v invert the selection, that is, select the lines that do not contain the search string
The search string can be a regular expression!
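
The options above can be tried on a small input. This is a minimal sketch that assumes a hypothetical three-line sample held in a shell variable (it is not the article's regular_express.txt) and a POSIX-style shell with grep available:

```shell
# Hypothetical three-line sample; not the article's regular_express.txt.
sample='The first line
the second line
a third row'

# -n prefixes each matching line with its line number.
numbered=$(printf '%s\n' "$sample" | grep -n 'the')
# -i ignores case, so "The" on line 1 matches as well.
ignore_case=$(printf '%s\n' "$sample" | grep -in 'the')
# -c prints only the number of matching lines.
count=$(printf '%s\n' "$sample" | grep -c 'line')
# -v selects the lines that do NOT match.
inverted=$(printf '%s\n' "$sample" | grep -v 'line')

printf '%s\n' "$numbered" "$ignore_case" "$count" "$inverted"
```

Single-letter options can be combined, which is why -in here and -nv in the examples that follow work as expected.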

1
Search for lines that contain the, and print line numbers
$ grep -n 'the' regular_express.txt
Search for lines that do not contain the, and print line numbers
$ grep -nv 'the' regular_express.txt

2 Use [] to search for a set of characters
[] matches exactly one of the characters listed inside it; for example, [ade] matches a, d, or e
$ grep -n 't[ae]st' regular_express.txt
8:I can't finish the test.
9:Oh! The soup taste good!

Placing ^ first inside [] negates the set, so it matches any character other than those listed.
For example, to find lines where oo is not preceded by g, use '[^g]oo' as the search string
$ grep -n '[^g]oo' regular_express.txt
2:apple is my favorite food.
3:Football game isn't use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!
Line 18 is still selected because the too in tools matches, and line 19 because an oo there is preceded by another o rather than by g.

[] can also express a range: [a-z] matches lowercase letters, [0-9] matches the digits 0 to 9, and [A-Z] matches capital letters. [a-zA-Z0-9] matches any digit or English letter. A leading ^ can again be used to exclude characters.
Search for lines that contain digits
$ grep -n '[0-9]' regular_express.txt
5:However, this dress was about $3183 dollars.
15:You are the best is mean you are the no. 1.
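
The set, negation, and range forms of [] can be compared side by side. The sketch below assumes a hypothetical four-line sample and sets LC_ALL=C for the grep calls, because in some locales a range such as [a-z] may also match capital letters:

```shell
# Hypothetical sample lines; not the article's regular_express.txt.
sample='google
food
Good
room 42'

# [^g]oo: "oo" preceded by any single character other than lowercase g.
not_after_g=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '[^g]oo')
# [0-9]: lines that contain at least one digit.
has_digit=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '[0-9]')
# [A-Z]: lines that contain at least one capital letter.
has_upper=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '[A-Z]')

printf '%s\n' "$not_after_g" "$has_digit" "$has_upper"
```

Line 1 (google) is not selected by '[^g]oo' because its only oo directly follows g, while Good is selected because a capital G is a different character from g.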

Line-start and line-end anchors ^ $

^ represents the beginning of a line and $ represents the end of a line (a position, not a character), so '^$' matches an empty line, which has only a beginning and an end.

Here ^ has a different meaning from the ^ used inside []: it anchors the string that follows it to the beginning of the line.
For example, search for lines that begin with the
$ grep -n '^the' regular_express.txt
12:the symbol '*' is represented as star.

Search for lines that start with a lowercase letter
$ grep -n '^[a-z]' regular_express.txt
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
12:the symbol '*' is represented as star.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! Go! Let's go.

Search for lines that do not begin with an English letter
$ grep -n '^[^a-zA-Z]' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
21:#I am Vbird

$ anchors the string in front of it to the end of the line; for example, '\.$' matches a . at the end of a line
Search for lines that end with .
$ grep -n '\.$' regular_express.txt    # . is a special regular expression character, so escape it with \
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game isn't use feet only.
4:this dress doesn't fit me.
5:However, this dress was about $3183 dollars.
6:GNU is free air isn't free beer.
.....

Note that text files created on MS systems have a ^M character added at the end of each line, so the last character of the line is the hidden ^M rather than the one you see. Pay special attention to this when processing Windows text files!
You can use cat dos_file | tr -d '\r' > unix_file to remove the ^M characters, since ^M is \r (carriage return).
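
The effect of the hidden ^M on an end-of-line pattern can be seen directly. This sketch assumes a hypothetical two-line text with Windows line endings built in a variable (dos_text is an illustrative name) and a grep that does not strip carriage returns itself, which is the usual behavior on Unix-like systems:

```shell
# Hypothetical two-line text with CRLF line endings.
# Command substitution drops only the final newline, so each line still ends in \r.
dos_text=$(printf 'first line.\r\nsecond line.\r\n')

# With the carriage returns in place, the last character of each line is \r,
# so '\.$' matches nothing. grep exits non-zero on no match, hence "|| true".
before=$(printf '%s\n' "$dos_text" | grep -c '\.$' || true)

# After tr -d '\r' removes the carriage returns, both lines end with a period.
after=$(printf '%s\n' "$dos_text" | tr -d '\r' | grep -c '\.$')

printf 'before: %s, after: %s\n' "$before" "$after"
```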

So '^$' matches only empty lines, where the end of the line comes immediately after the beginning!
Search for empty lines
$ grep -n '^$' regular_express.txt
22:
23:

Search for non-empty lines
$ grep -vn '^$' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game isn't use feet only.
4:this dress doesn't fit me.
..........
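
The anchors covered in this part can be checked together on a small input. This is a sketch over a hypothetical five-line sample (the third line is empty), again using LC_ALL=C where a letter range is involved:

```shell
# Hypothetical sample; line 3 is intentionally empty.
sample='the start
# a comment

ends with a dot.
not at the end. really'

# ^the: "the" must sit at the beginning of the line.
starts_with_the=$(printf '%s\n' "$sample" | grep -n '^the')
# \.$: a literal period must be the last character of the line.
ends_with_dot=$(printf '%s\n' "$sample" | grep -n '\.$')
# ^$: beginning immediately followed by end, that is, an empty line.
empty=$(printf '%s\n' "$sample" | grep -n '^$')
# ^[^a-zA-Z]: the first character exists and is not an English letter.
non_letter_start=$(printf '%s\n' "$sample" | LC_ALL=C grep -n '^[^a-zA-Z]')

printf '%s\n' "$starts_with_the" "$ends_with_dot" "$empty" "$non_letter_start"
```

The empty line is not selected by '^[^a-zA-Z]', because a bracket expression always has to consume one character.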

Any single character . and the repetition character *

In Bash, * is a wildcard that stands for any number of arbitrary characters, but in a regular expression it means something different: * means that the preceding character is repeated zero or more times.
For example, oo* means the first o must be present and the second o may appear zero or more times, so it matches one or more consecutive o characters.

The dot . stands for exactly one arbitrary character, which must be present. The shell pattern g??d can be written as 'g..d', which matches good, gxxd, gabd, and so on.

$ grep -n 'g..d' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good!
16:The World was the same with 'glad'.

Search for strings of two or more o characters
$ grep -n 'ooo*' regular_express.txt    # the first two o's must be present; the third o may appear zero or more times
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game isn't use feet only.
9:Oh! The soup taste good!
18:google is the best tools for search keyword.
19:goooooogle yes!

Search for strings that begin and end with g and have at least one o in between, such as gog, goog, gooog, and so on
$ grep -n 'goo*g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!

Search for lines that contain a string beginning and ending with g
$ grep -n 'g.*g' regular_express.txt    # .* means zero or more arbitrary characters
1:"Open Source" is a good mechanism to develop programs.
14:The GD Software is a library for drafting programs.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! Go! Let's go.
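
The difference between go*g, goo*g, and g.*g is easiest to see on single words. A small sketch, assuming a hypothetical word list with one word per line:

```shell
# Hypothetical word list, one word per line.
words='gg
gog
goooog
g--g
dog'

# goo*g: g, one mandatory o, zero or more further o's, then g.
at_least_one_o=$(printf '%s\n' "$words" | grep 'goo*g')
# go*g: the o may be absent altogether, so "gg" matches too.
zero_or_more_o=$(printf '%s\n' "$words" | grep 'go*g')
# g.*g: any run of characters (possibly empty) between two g's.
anything_between=$(printf '%s\n' "$words" | grep 'g.*g')

printf '%s\n' "$at_least_one_o" "---" "$zero_or_more_o" "---" "$anything_between"
```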


Limit the number of consecutive repetitions with {}

* can only express zero or more repetitions; to restrict how many times a character repeats, use {range}. The range is numeric: 2,5 means 2 to 5 times, 2 means exactly 2 times, and 2, means 2 or more times.
Note that in a basic regular expression the braces must be escaped with \ and written as \{ and \} to act as the repetition operator; the extended syntax described in part 2 uses them unescaped.

Search for lines that contain a string of two o characters.
$ grep -n 'o\{2\}' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game isn't use feet only.
9:Oh! The soup taste good!
18:google is the best tools for search keyword.
19:goooooogle yes!

Search for lines that contain g, followed by 2 to 5 o characters, followed by g.
$ grep -n 'go\{2,5\}g' regular_express.txt
18:google is the best tools for search keyword.


Search for lines that contain g, followed by 2 or more o characters, followed by g:
$ grep -n 'go\{2,\}g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!
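
The three range forms can be compared on words whose number of o characters is known. This sketch assumes a hypothetical word list containing 1, 2, 5, and 6 o's between the two g's; the last command shows the same range written in extended syntax with grep -E, where the braces are not escaped:

```shell
# Hypothetical word list with 1, 2, 5 and 6 o's between the g's.
words='gog
goog
gooooog
goooooog'

# \{2,5\}: between 2 and 5 repetitions of the preceding o.
two_to_five=$(printf '%s\n' "$words" | grep 'go\{2,5\}g')
# \{2,\}: 2 or more repetitions.
two_or_more=$(printf '%s\n' "$words" | grep 'go\{2,\}g')
# \{2\}: exactly 2 repetitions.
exactly_two=$(printf '%s\n' "$words" | grep 'go\{2\}g')
# The same 2-to-5 range in extended syntax: braces are written unescaped.
two_to_five_ere=$(printf '%s\n' "$words" | grep -E 'go{2,5}g')

printf '%s\n' "$two_to_five" "---" "$two_or_more" "---" "$exactly_two"
```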


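Note that ^ has no special meaning inside [] unless it comes first, so it can be placed after other content inside [].
'[^a-z\.! ^-]' matches a character that is not a lowercase letter, not ., not !, not a space, not ^, and not -. Note that there is a space inside the []. (Inside [] the \ is itself a literal character, so this set excludes \ as well.)

In addition, negation in shell wildcards is written [!range], while in regular expressions it is written [^range]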
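
Both notes can be checked with a few one-character inputs. This is a sketch under the assumption of a POSIX-style shell; is_non_digit_glob is a hypothetical helper name used only for illustration:

```shell
# Shell wildcard negation uses [!...]; here it is exercised through a case pattern.
is_non_digit_glob() {
  case $1 in
    [!0-9]) echo yes ;;
    *) echo no ;;
  esac
}
glob_x=$(is_non_digit_glob x)
glob_7=$(is_non_digit_glob 7)

# Regular expression negation uses [^...].
regex_hits=$(printf '%s\n' x 7 | grep '^[^0-9]$')

# Feed the set '[^a-z\.! ^-]' one character per line.
# Only characters outside the listed set are printed.
outside_set=$(printf '%s\n' a . '!' ' ' '^' - '\' A 5 | LC_ALL=C grep '[^a-z\.! ^-]')

printf '%s\n' "$glob_x" "$glob_7" "$regex_hits" "$outside_set"
```

Only A and 5 survive the last filter: every other input character, including the ^ and the - placed at the end of the set, is excluded.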


2 Extended Regular Expressions

Extended regular expressions add several special symbols to the basic regular expressions,
which make certain operations more convenient.
For example, to remove blank lines and lines that begin with #, we would use:
$ grep -v '^$' regular_express.txt | grep -v '^#'
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game isn't use feet only.
this dress doesn't fit me.
............

Using egrep, which supports extended regular expressions, together with the extended special symbol | makes this much more convenient.
Note that grep supports only basic regular expressions by default, while egrep supports the extended ones; egrep is in fact equivalent to grep -E, so grep -E supports extended regular expressions as well.
So:
$ egrep -v '^$|^#' regular_express.txt
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game isn't use feet only.
this dress doesn't fit me.
....................
Here | expresses an "or" relationship: a line is matched if it satisfies ^$ or ^#.
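
The two approaches give the same result, which can be confirmed on a small input. A sketch assuming a hypothetical five-line configuration text, using grep -E as the equivalent of egrep:

```shell
# Hypothetical configuration text with comments and one empty line.
config='# settings

name=demo
# another comment
debug=off'

# Basic syntax: two grep passes, one per pattern.
two_passes=$(printf '%s\n' "$config" | grep -v '^$' | grep -v '^#')
# Extended syntax: one pass, with | joining the two patterns.
one_pass=$(printf '%s\n' "$config" | grep -Ev '^$|^#')

printf '%s\n' "$one_pass"
```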

Several extended special symbols are listed here:
+ works like *, but matches one or more repetitions of the preceding character.
? works like *, but matches zero or one occurrence of the preceding character.
| expresses an "or" relationship; for example, 'gd|good|dog' matches gd, good, or dog
() groups part of the expression into a unit. For example, to search for glad or good you can use 'g(la|oo)d'
The advantage of () is that +, ?, and * can be applied to the whole group.
For example, to search for a string that begins with A, ends with C, and has at least one xyz in between, you can use 'A(xyz)+C'
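
The symbols listed above can each be tried on a short word list. This is a minimal sketch, assuming a hypothetical list with one word per line and grep -E for the extended syntax:

```shell
# Hypothetical word list, one word per line.
words='gd
god
good
glad
AC
AxyzC
AxyzxyzC'

# +: one or more of the preceding character.
one_or_more=$(printf '%s\n' "$words" | grep -E 'go+d')
# ?: zero or one of the preceding character.
zero_or_one=$(printf '%s\n' "$words" | grep -E 'go?d')
# () with |: choose between alternatives inside a larger pattern.
grouped_alt=$(printf '%s\n' "$words" | grep -E 'g(la|oo)d')
# () with +: repeat a whole group, not just a single character.
grouped_repeat=$(printf '%s\n' "$words" | grep -E 'A(xyz)+C')

printf '%s\n' "$one_or_more" "---" "$zero_or_one" "---" "$grouped_alt" "---" "$grouped_repeat"
```

AC is not selected by 'A(xyz)+C' because + requires the group to appear at least once.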
