Detailed grep usage: grep and regular expressions

A regular expression is just a notation; any tool that supports the notation can use it to process strings. Vim, grep, awk, and sed all support regular expressions, and that support is a large part of what makes them powerful. At a company I used to work for, which ran a web-based service site on nginx, regular expressions were in heavy demand, so I spent some time studying them. I am sharing my notes here.

1 Basic Regular Expressions
The grep tool was described previously.
grep [-acinv] 'search string' filename
-a treats a binary file as text when searching
-c counts the number of matching lines
-i ignores case
-n also prints the line number of each match
-v inverts the selection, that is, prints the lines that do not contain the search string
The search string can be a regular expression.
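The options above can be tried on a throwaway file. This is only a sketch: the file name sample.txt, its three lines, and the variable names are made up for illustration and are not part of the original article.

```shell
# Hypothetical sample file created in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'The first line' 'the second line' 'a third line' > "$tmpdir/sample.txt"

count_exact=$(grep -c 'the' "$tmpdir/sample.txt")    # lines containing lowercase "the"
count_nocase=$(grep -ci 'the' "$tmpdir/sample.txt")  # same search, ignoring case
numbered=$(grep -n 'the' "$tmpdir/sample.txt")       # matching lines with line numbers
inverted=$(grep -vi 'the' "$tmpdir/sample.txt")      # lines without "the" in any case

echo "$count_exact $count_nocase"   # 1 2
echo "$numbered"                    # 2:the second line
echo "$inverted"                    # a third line
rm -r "$tmpdir"
```

Single-letter options can be combined, so -ci here is the same as -c -i.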

1 Search for a specific string
Search for lines that contain the, and show the line numbers:
$ grep -n 'the' regular_express.txt
Search for lines that do not contain the, and show the line numbers:
$ grep -nv 'the' regular_express.txt

2 Search for a set of characters using []
[] matches any one of the characters listed inside it. For example, [ade] matches a, d, or e.
woody@xiaoc:~/tmp$ grep -n 't[ae]st' regular_express.txt
8:I can't finish the test.
9:Oh! The soup taste good!

A ^ placed first inside [] negates the set: it matches any one character that is not listed in the [].
For example, to search for lines containing oo that is not preceded by g, use '[^g]oo' as the search string:
woody@xiaoc:~/tmp$ grep -n '[^g]oo' regular_express.txt
2:apple is my favorite food.
3:Football game isn't use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!
Line 18 is listed because of the too in tools, and line 19 because an oo preceded by another o also satisfies the pattern.
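A short sketch of both bracket forms on a made-up word list (the words and variable names are assumptions for illustration, not taken from regular_express.txt):

```shell
# Hypothetical word list, one word per line.
words=$(printf '%s\n' test taste tost goo gooo foo tools)

# t, then a or e, then st
set_match=$(printf '%s\n' "$words" | grep 't[ae]st')
# oo preceded by any character other than g
neg_match=$(printf '%s\n' "$words" | grep '[^g]oo')

echo "$set_match"   # test, taste
echo "$neg_match"   # gooo, foo, tools
```

Note that goo is rejected while gooo is accepted: in gooo the last two o characters are preceded by an o, not by g.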

[] can also hold ranges: [a-z] stands for the lowercase letters, [0-9] for the digits 0 to 9, and [A-Z] for the uppercase letters. [a-zA-Z0-9] stands for all digits and English letters. Ranges can be combined with ^ to exclude characters.
Search for lines that contain digits:
woody@xiaoc:~/tmp$ grep -n '[0-9]' regular_express.txt
5:However, this dress is about $3183 dollars.
15:You are the best is mean you are the no. 1.
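The ranges can be sketched on a few made-up lines. The sample lines and variable names are hypothetical. Setting LC_ALL=C is an added precaution, not something the article mentions: in some locales a range such as [A-Z] may also match lowercase letters.

```shell
export LC_ALL=C   # make range matching predictable (assumption of this sketch)
lines=$(printf '%s\n' 'price: 3183' 'no digits here' '!!!' 'ABC')

with_digit=$(printf '%s\n' "$lines" | grep -n '[0-9]')        # lines containing a digit
no_alnum=$(printf '%s\n' "$lines" | grep -vn '[a-zA-Z0-9]')   # lines with no letter or digit
with_upper=$(printf '%s\n' "$lines" | grep -n '[A-Z]')        # lines containing an uppercase letter

echo "$with_digit"   # 1:price: 3183
echo "$no_alnum"     # 3:!!!
echo "$with_upper"   # 4:ABC
```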

The line-start and line-end anchors ^ and $. ^ represents the beginning of a line and $ the end of a line (a position, not a character), so '^$' matches a blank line, because such a line has only a beginning and an end.
The ^ here has a different meaning from the ^ used inside []: it means that the string following ^ must be at the beginning of the line.
For example, search for lines that begin with the:
woody@xiaoc:~/tmp$ grep -n '^the' regular_express.txt
12:the symbol '*' is represented as star.

Search for lines that start with a lowercase letter:
woody@xiaoc:~/tmp$ grep -n '^[a-z]' regular_express.txt
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
12:the symbol '*' is represented as star.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! Go! Let's go.
woody@xiaoc:~/tmp$

Search for lines that do not begin with an English letter:
woody@xiaoc:~/tmp$ grep -n '^[^a-zA-Z]' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
21:#I am Vbird
woody@xiaoc:~/tmp$
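The two roles of ^ (anchor outside [], negation inside []) can be compared side by side. A minimal sketch on made-up lines; the sample text and variable names are assumptions, and LC_ALL=C is set only to keep the ranges predictable:

```shell
export LC_ALL=C
text=$(printf '%s\n' 'the start' 'not the start' '# comment' 'Upper' '1 digit')

starts_the=$(printf '%s\n' "$text" | grep -n '^the')          # "the" at the start of the line
lower_start=$(printf '%s\n' "$text" | grep -n '^[a-z]')       # first character is lowercase
non_letter=$(printf '%s\n' "$text" | grep -n '^[^a-zA-Z]')    # first character is not a letter

echo "$starts_the"    # 1:the start
echo "$lower_start"   # 1:the start, 2:not the start
echo "$non_letter"    # 3:# comment, 5:1 digit
```

In the last pattern the first ^ anchors the match to the start of the line and the second ^ negates the set.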

$ indicates that the string in front of it must be at the end of the line, for example a '.' at the end of a line.
Search for lines that end with a period. Because . is a special symbol in regular expressions, it is escaped with \:
woody@xiaoc:~/tmp$ grep -n '\.$' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game isn't use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $3183 dollars.
6:GNU is free air not free beer.
.....

Note that text files created on MS systems (DOS/Windows) end each line with a hidden ^M character, so the last character of every line is actually that hidden ^M rather than the character you see.
Pay special attention to this when processing text that comes from Windows.
You can use cat dos_file | tr -d '\r' > unix_file to remove the ^M characters (^M is the carriage return, \r).
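The effect of the hidden carriage return on a $ anchor can be reproduced without a real DOS file. This is a sketch assuming a Unix-like grep that does not strip carriage returns itself; the sample sentence and variable names are made up:

```shell
# A line with a DOS line ending: the last character before the newline is \r, not "."
cr_count=$(printf 'ends with a dot.\r\n' | grep -c '\.$' || true)

# The same line after removing the carriage return with tr
clean_count=$(printf 'ends with a dot.\r\n' | tr -d '\r' | grep -c '\.$')

echo "$cr_count"      # 0
echo "$clean_count"   # 1
```

The || true only keeps the sketch from aborting under set -e, since grep exits with status 1 when nothing matches.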

So '^$' matches only an empty line, where the beginning of the line is immediately followed by its end.
Search for blank lines:
woody@xiaoc:~/tmp$ grep -n '^$' regular_express.txt
22:
23:
woody@xiaoc:~/tmp$

Search for non-empty lines:
woody@xiaoc:~/tmp$ grep -vn '^$' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game isn't use feet only.
4:this dress doesn't fit me.
..........
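A small sketch of the blank-line search and its inversion, using a temporary file with made-up contents (the file and variable names are hypothetical):

```shell
tmpfile=$(mktemp)
printf 'first\n\n# note\nlast\n' > "$tmpfile"   # line 2 is empty

blank=$(grep -n '^$' "$tmpfile")       # only the empty line
nonblank=$(grep -vn '^$' "$tmpfile")   # everything else, with original line numbers

echo "$blank"      # 2:
echo "$nonblank"   # 1:first, 3:# note, 4:last
rm "$tmpfile"
```

With -v and -n together, the numbers printed are still the line numbers in the original file, so the gap at line 2 stays visible.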

Any single character . and the repetition character *

In Bash, * is a wildcard that stands for any sequence of characters, but in a regular expression it has a different meaning: * says that the preceding character is repeated zero or more times.
For example, oo* means that the first o must exist and the second o may appear zero or more times, so the pattern matches at least one o.

The dot . stands for exactly one arbitrary character, which must be present. The shell pattern g??d can be written as 'g..d' in a regular expression; good, gxxd, gabd, and so on all match.

woody@xiaoc:~/tmp$ grep -n 'g..d' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good!
16:The world <Happy> is the same with "glad".
woody@xiaoc:~/tmp$

Search for lines containing two or more consecutive o characters. In 'ooo*' the first two o must exist; the third o may be absent or may repeat many times:
woody@xiaoc:~/tmp$ grep -n 'ooo*' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game isn't use feet only.
9:Oh! The soup taste good!
18:google is the best tools for search keyword.
19:goooooogle yes!

Search for strings that begin and end with g and have at least one o in between, namely gog, goog, gooog, and so on:
woody@xiaoc:~/tmp$ grep -n 'goo*g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!

Search for lines containing a string that begins and ends with g. Here .* stands for zero or more arbitrary characters:
woody@xiaoc:~/tmp$ grep -n 'g.*g' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
14:The gd software is a library for drafting programs.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! Go! Let's go.
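The difference between ., *, and .* is easiest to see on a short word list. A minimal sketch; the words and variable names are made up for illustration:

```shell
samples=$(printf '%s\n' gd god good gooood glad gg)

two_any=$(printf '%s\n' "$samples" | grep 'g..d')     # g, exactly two characters, d
one_plus_o=$(printf '%s\n' "$samples" | grep 'goo*d') # g, one or more o, d
anything=$(printf '%s\n' "$samples" | grep 'g.*d')    # g, zero or more characters, d
double_o=$(printf '%s\n' "$samples" | grep 'ooo*')    # at least two consecutive o

echo "$two_any"      # good, glad
echo "$one_plus_o"   # god, good, gooood
echo "$anything"     # gd, god, good, gooood, glad
echo "$double_o"     # good, gooood
```

gd matches 'g.*d' because .* is allowed to match nothing at all, while 'g..d' rejects it because each dot must consume one character.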


Limit the number of consecutive repetitions with {}
* only expresses zero or more repetitions. To limit the repeat count exactly, use {range}. The range is one or two numbers separated by a comma: 2,5 means 2 to 5,
2 means exactly 2, and 2, means 2 or more.
Note that in a basic regular expression the braces must be escaped as \{ and \}; an unescaped { is treated as an ordinary character there.

Search for lines containing two consecutive o characters. Because the pattern is not anchored, lines with more than two o in a row also match:
woody@xiaoc:~/tmp$ grep -n 'o\{2\}' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game isn't use feet only.
9:Oh! The soup taste good!
18:google is the best tools for search keyword.
19:goooooogle yes!

Search for lines containing g followed by 2 to 5 o characters, followed by g:
woody@xiaoc:~/tmp$ grep -n 'go\{2,5\}g' regular_express.txt
18:google is the best tools for search keyword.


Search for lines containing g followed by 2 or more o characters, followed by g:
woody@xiaoc:~/tmp$ grep -n 'go\{2,\}g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!
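The three interval forms can be compared on words with a known number of o characters. A sketch with made-up samples (1, 2, 5, and 6 o between the two g):

```shell
samples=$(printf '%s\n' gog goog gooooog goooooog)

exactly_two=$(printf '%s\n' "$samples" | grep 'go\{2\}g')     # exactly 2 o
two_to_five=$(printf '%s\n' "$samples" | grep 'go\{2,5\}g')   # 2 to 5 o
two_or_more=$(printf '%s\n' "$samples" | grep 'go\{2,\}g')    # 2 or more o

echo "$exactly_two"   # goog
echo "$two_to_five"   # goog, gooooog
echo "$two_or_more"   # goog, gooooog, goooooog
```

Here the surrounding g characters pin the count down, which is why 'go\{2\}g' is stricter than the unanchored 'o\{2\}'.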


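Notice that a ^ inside [] has no special meaning unless it is the first character, so it can be placed later in the [] as an ordinary character.
'[^a-z\.! ^-]' matches a character that is not a lowercase letter, not ., not !, not a space, not ^, and not -. Notice the space inside the []. (Inside [] the backslash is itself an ordinary character, so . does not need to be escaped there.)

Also note that the shell writes a negated set as [!range], while a regular expression writes it as [^range].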
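Both points can be checked quickly. This is only a sketch with made-up inputs: a ^ that is not first in the set is matched literally, and the shell's [!a] in a case pattern plays the same role as [^a] in a regular expression.

```shell
export LC_ALL=C
# ^ is not the first character in the set, so it is an ordinary character here
literal_caret=$(printf '%s\n' 'a' '^' 'Z' | grep '[Z^]')

# Shell pattern: first character is not "a"
case "b1" in
  [!a]*) glob_result=yes ;;
  *)     glob_result=no ;;
esac

# Regular expression: first character is not "a"
regex_count=$(printf 'b1\n' | grep -c '^[^a]')

echo "$literal_caret"   # ^, Z
echo "$glob_result"     # yes
echo "$regex_count"     # 1
```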


2 Extended Regular Expressions

An extended regular expression adds several special symbols to the basic regular expression.
They make certain operations more convenient.
For example, to remove blank lines and lines that begin with #, we could use:
woody@xiaoc:~/tmp$ grep -v '^$' regular_express.txt | grep -v '^#'
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game isn't use feet only.
this dress doesn't fit me.
............

However, with egrep, which supports extended regular expressions, and the extended special symbol |, this is much more convenient.
Note that plain grep supports only basic regular expressions, while egrep supports the extended ones; egrep is equivalent to grep -E, so grep -E supports extended regular expressions as well.
So:
woody@xiaoc:~/tmp$ egrep -v '^$|^#' regular_express.txt
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game isn't use feet only.
this dress doesn't fit me.
....................
Here | expresses an or relationship: the pattern matches a line that satisfies either ^$ or ^#.

Here are a few of the extended special symbols:
+ repeats the preceding character one or more times (compare *, which also allows zero).
? makes the preceding character optional: it matches zero or one occurrence.
| expresses or. For example, 'gd|good|dog' matches gd, good, or dog.
() groups part of the pattern into a single unit. For example, to search for glad or good, use 'g(la|oo)d'.
The advantage of () is that +, ?, and so on can be applied to the whole group.
For example, to search for strings that begin with A and end with C, with at least one xyz in the middle, use 'A(xyz)+C'.
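The extended symbols above can be tried with grep -E. A minimal sketch on made-up lines; the samples and variable names are assumptions. It uses grep -E rather than egrep because recent GNU grep releases print an obsolescence warning for egrep.

```shell
samples=$(printf '%s\n' glad good gd god '' '# comment' AxyzC AxyzxyzC AC)

kept=$(printf '%s\n' "$samples" | grep -Ev '^$|^#')      # drop blank lines and # lines
grouped=$(printf '%s\n' "$samples" | grep -E 'g(la|oo)d') # glad or good
repeated=$(printf '%s\n' "$samples" | grep -E 'A(xyz)+C') # one or more xyz between A and C
optional=$(printf '%s\n' "$samples" | grep -E '^go?d$')   # zero or one o
one_plus=$(printf '%s\n' "$samples" | grep -E '^go+d$')   # one or more o

echo "$kept"       # glad, good, gd, god, AxyzC, AxyzxyzC, AC
echo "$grouped"    # glad, good
echo "$repeated"   # AxyzC, AxyzxyzC
echo "$optional"   # gd, god
echo "$one_plus"   # good, god
```

AC is rejected by 'A(xyz)+C' because + requires the group at least once; 'A(xyz)*C' would accept it.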
