Text lookup commands and regular expressions under Linux

Last Update:2015-04-01 Source: Internet

Author: User

As Marco put it, failing to learn regular expressions is not a Linux problem; it is an IQ problem.

This article covers how to search for text under Linux and which commands to use. The commands we need are grep, egrep, and fgrep. First, we need to know what grep is.

grep:

The name grep is usually expanded as "Global search REgular expression and Print out the line".

In other words, it searches the whole text for a regular expression and prints the lines that match. Searching and printing lines need no explanation; regular expressions are covered below. First, let's get to know our new friend grep.

Usage of grep

grep [OPTIONS] PATTERN [FILE ...]

That is: grep, then optional options, then the search pattern, then optional target files.

Options:

-o: Show only the part of the line that matches

-i: Ignore case when matching

-v: Show the lines that do not match

-A #: Show each matched line together with the # lines after it

-B #: Show each matched line together with the # lines before it

-C #: Show each matched line together with the # lines before and after it
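
The options above can be tried with a small sketch like the following. It assumes GNU grep; the sample file and its contents are invented purely for illustration.

```shell
# Hypothetical sample file; the contents are made up for this example.
f=$(mktemp)
printf 'one\nTwo apples\nthree\nfour\nfive\n' > "$f"

only=$(grep -o 'app' "$f")        # -o: print only the matched text
nocase=$(grep -i 'two' "$f")      # -i: ignore case
inverted=$(grep -v 'e' "$f")      # -v: lines that do NOT contain e
after=$(grep -A 1 'three' "$f")   # -A 1: matched line plus 1 line after it
before=$(grep -B 1 'three' "$f")  # -B 1: matched line plus 1 line before it
around=$(grep -C 1 'three' "$f")  # -C 1: matched line plus 1 line on each side

printf '%s\n' "$only" "$nocase" "$inverted" "$after" "$before" "$around"
rm -f "$f"
```

Each result is captured in a variable only so that it can be printed and inspected separately; on the command line you would run grep directly.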

Search pattern (PATTERN): a search condition built from regular expression metacharacters combined with ordinary characters.

So what is a regular expression?

Regular expression (abbreviated regex or RE): a pattern built from special characters (metacharacters) that describes the search criteria, used to find particular text.

What are the regular expression metacharacters, and what do they do? Because there are many of them, we classify them by function:

The first category: metacharacters that match characters. They can be combined to match any character.

.: matches any single character

[]: matches any single character within the specified range

[^]: matches any single character outside the specified range

For example:

[0-9] and [[:digit:]] both represent any one digit

[a-z] and [[:lower:]] both represent any one lowercase letter

[A-Z] and [[:upper:]] both represent any one uppercase letter

[a-zA-Z] and [[:alpha:]] both represent any one letter

[0-9a-zA-Z] and [[:alnum:]] both represent any one digit or letter

[[:punct:]] denotes any one punctuation character

[[:space:]] denotes any one whitespace character

# grep "[^[:digit:]][[:space:]][[:alpha:]]" /tmp/grep.test.txt

Finds lines in /tmp/grep.test.txt that contain three consecutive characters where the first is not a digit, the second is whitespace, and the third is a letter.
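
As a rough sketch of the character classes in action, the following runs the same pattern against a few invented lines (the temporary file stands in for /tmp/grep.test.txt and its contents are assumptions made for the example).

```shell
# Hypothetical test data: a space-separated pair, a digit-led line,
# a line with no whitespace, and a tab-separated pair.
f=$(mktemp)
printf 'a b\n1 b\nab\nx\ty\n' > "$f"

# non-digit, then whitespace, then a letter
classes=$(grep '[^[:digit:]][[:space:]][[:alpha:]]' "$f")

printf '%s\n' "$classes"
rm -f "$f"
```

Only the first and last lines are selected: "1 b" starts with a digit, and "ab" has no whitespace. A tab counts as whitespace for [[:space:]].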

The second category: metacharacters that match a number of occurrences. They follow the character to be matched and specify how many times that preceding character may occur.

*: any number of times (including 0)

\?: 0 or 1 times

\+: 1 or more times

\{m\}: the preceding character appears exactly m times

\{m,n\}: at least m times and at most n times

# grep "a.*b" /tmp/grep.test.txt

Finds lines that contain an a followed, after any number of arbitrary characters, by a b; for example, lines containing ab, acb, a34b, or abbb

# grep "ca\?b" /tmp/grep.test.txt

Indicates that a appears once or not at all, such as cab, cb

# grep "ab\+c" /tmp/grep.test.txt

Indicates that b appears at least once, such as abc, abbc, abbbbc

# grep "ab\{3\}c" /tmp/grep.test.txt

Indicates that b appears exactly three times, such as abbbc

# grep "ab\{1,2\}c" /tmp/grep.test.txt

Indicates that b appears at least once and at most twice, such as abc, abbc
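
To see the repetition metacharacters side by side, here is a minimal sketch. It assumes GNU grep (the \? and \+ forms are GNU extensions to basic regular expressions), and the sample lines are invented.

```shell
# Hypothetical data: a, then zero to four b characters, then c.
f=$(mktemp)
printf 'ac\nabc\nabbc\nabbbc\nabbbbc\n' > "$f"

star=$(grep -c 'ab*c' "$f")       # b any number of times, including zero
opt=$(grep 'ab\?c' "$f")          # b zero or one time
plus=$(grep -c 'ab\+c' "$f")      # b one or more times
exact=$(grep 'ab\{3\}c' "$f")     # b exactly three times
range=$(grep 'ab\{1,2\}c' "$f")   # b one or two times

printf '%s\n' "$star" "$opt" "$plus" "$exact" "$range"
rm -f "$f"
```

The -c option is used for the two broadest patterns so that only the number of matching lines is printed.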

The third category: position anchors, which specify where in the line the matching string must appear.

^: the string appears at the beginning of the line

$: the string appears at the end of the line

^$: combined, anchors both the beginning and the end of the line with no characters in between, which matches a blank line.

# grep "^abc" /tmp/grep.test.txt

The three characters abc must appear at the beginning of the line to match

# grep "abc$" /tmp/grep.test.txt

The three characters abc must appear at the end of the line to match

# grep "^abc$" /tmp/grep.test.txt

Only lines consisting of exactly the three characters abc match
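
A small sketch of the line anchors, using invented sample lines that include one blank line:

```shell
# Hypothetical data; the fourth line is empty.
f=$(mktemp)
printf 'abc\nabc def\ndef abc\n\nxabcx\n' > "$f"

at_start=$(grep '^abc' "$f")   # abc at the beginning of the line
at_end=$(grep 'abc$' "$f")     # abc at the end of the line
whole=$(grep '^abc$' "$f")     # the line is exactly abc
blank=$(grep -c '^$' "$f")     # number of blank lines

printf '%s\n' "$at_start" "$at_end" "$whole" "$blank"
rm -f "$f"
```

The line xabcx contains abc but matches none of the anchored patterns, which is the point of anchoring.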

The fourth category: word anchors, which specify that the string must appear at the beginning or at the end of a word.

\<: anchors the beginning of a word

\>: anchors the end of a word

\<pattern\>: the string must appear as a whole word

# grep "\<abc" /tmp/grep.test.txt

Matches abc only where it begins a word, that is, in words that start with abc

# grep "abc\>" /tmp/grep.test.txt

Matches abc only where it ends a word, that is, in words that end with abc

# grep "\<passwd\>" /tmp/grep.test.txt

Matches passwd only as a whole word; strings such as apasswd and passwd2d do not match
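
The word anchors can be checked with a sketch like this one. It assumes GNU grep, where \< and \> are supported; the sample lines are invented and reuse the apasswd and passwd2d cases from the text.

```shell
# Hypothetical data built around the passwd example.
f=$(mktemp)
printf 'passwd\napasswd\npasswd2d\nmy passwd file\n' > "$f"

word_start=$(grep '\<passwd' "$f")    # passwd at the beginning of a word
word_end=$(grep 'passwd\>' "$f")      # passwd at the end of a word
whole_word=$(grep '\<passwd\>' "$f")  # passwd as a complete word

printf '%s\n' "$word_start" "$word_end" "$whole_word"
rm -f "$f"
```

Digits and letters both count as word characters, which is why passwd2d fails the end-of-word test.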

The fifth category: grouping metacharacters

\(\): groups the enclosed string with parentheses so that it is treated as a whole.

In grouping mode, the text matched inside each pair of parentheses is remembered, and grep can refer back to it later in the same pattern as \1, \2, \3, ...

\#: refers to whatever was matched by the group opened by the #-th left parenthesis, counting from left to right, up to its corresponding closing parenthesis.

# grep "^the \(\<pa.*wd\>\).*\1$" /tmp/grep.test.txt

Matches lines that begin with "the " followed by text matching \<pa.*wd\>, and that end with the same text the group matched the first time. If pa.*wd matched passwd, then the line must end with passwd.
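
A minimal sketch of a back-reference follows. It assumes GNU grep, and the three sample lines are invented; only the first one ends with the same text that the group captured.

```shell
# Hypothetical data: only the first line ends with the captured text.
f=$(mktemp)
printf 'the passwd file is /etc/passwd\nthe passwd file is /etc/shadow\nthe password is not the passwd\n' > "$f"

# \1 must repeat exactly what \(\<pa.*wd\>\) matched earlier in the line
backref=$(grep '^the \(\<pa.*wd\>\).*\1$' "$f")

printf '%s\n' "$backref"
rm -f "$f"
```

The back-reference repeats the matched text, not the pattern: in the third line the group can only match "password is not the passwd", and that text does not occur a second time.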

Everything in the world develops and progresses: when existing tools no longer meet people's needs, people create new ones to replace them. So when grep with basic regular expressions no longer met people's search needs, extended regular expressions and the corresponding command egrep appeared; egrep is equivalent to grep -E.

egrep: another way of writing grep -E; it lets you use extended regular expressions

Command usage:

egrep [OPTIONS] PATTERN [FILE ...]

egrep accepts the same options as grep. The differences are mainly in the regular expression syntax: egrep is more powerful and offers more features.

Extended regular expression metacharacters by category:

First category: character matching, the same as grep

., [], [^]

Second category: number of occurrences; the backslash before the special symbols can be omitted

*, ?, +, {m}, {m,n}, {m,}, {,n}

Third category: position anchors, the same as grep

^: beginning of the line

$: end of the line

Fourth category: word anchors, the same as grep; the backslash cannot be omitted

\<: beginning of a word

\>: end of a word

Fifth category: grouping; the backslash before the parentheses can be omitted, but the backslash in back-references cannot

(): group

\1, \2, ...
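
To illustrate how the two syntaxes line up, this sketch runs the same search in basic and extended form. It uses grep -E, which the text describes as equivalent to egrep, and assumes GNU grep (back-references in extended expressions are a GNU extension). The sample data is invented.

```shell
# Hypothetical data for comparing basic and extended syntax.
f=$(mktemp)
printf 'abc\nabbc\nabbbc\n' > "$f"

bre=$(grep 'ab\{1,2\}c' "$f")     # basic syntax: braces need backslashes
ere=$(grep -E 'ab{1,2}c' "$f")    # extended syntax: no backslashes

# Grouping without backslashes; the back-reference \1 keeps its backslash.
ere_backref=$(printf 'abab\nabba\n' | grep -E '(ab)\1')

printf '%s\n' "$bre" "$ere" "$ere_backref"
rm -f "$f"
```

Both forms select the same lines; only the amount of escaping differs.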

Additional in extended regular expressions:

Sixth category: the alternation operator |, meaning "or". It applies to the entire expression on each side of it, not just the single character before and after it.

|: or

For example, a|b means a or b

abc|cde means abc or cde

ab(c|C)de means ab, then lowercase c or uppercase C, then de

# egrep "ab(c|C)de" /tmp/egrep.test.txt

Finds lines containing abcde or abCde in the file /tmp/egrep.test.txt
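
Alternation with and without grouping can be compared with a sketch like the following. It uses grep -E in place of egrep, since recent GNU grep releases print a warning that egrep is obsolescent; the sample lines are invented.

```shell
# Hypothetical data for alternation.
f=$(mktemp)
printf 'abcde\nabCde\nabde\nabc\ncde\n' > "$f"

either=$(grep -E 'abc|cde' "$f")     # whole string abc OR whole string cde
grouped=$(grep -E 'ab(c|C)de' "$f")  # choice limited to the group

printf '%s\n' "$either" "$grouped"
rm -f "$f"
```

Without parentheses the | splits the entire pattern, so abc|cde never means "ab, then c or c, then de"; the parentheses are what restrict the choice to a single position.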

fgrep: a fast text-line search, equivalent to grep -F. It cannot use regular expressions and only matches fixed strings.

It can be faster than grep, but it offers only this single function.
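
The difference between a regular expression and a fixed string shows up as soon as the pattern contains a metacharacter. This sketch uses grep -F in place of fgrep, for the same obsolescence reason as egrep; the sample lines are invented.

```shell
# Hypothetical data: the pattern a.c contains the metacharacter "."
f=$(mktemp)
printf 'a.c\nabc\na*c\n' > "$f"

as_regex=$(grep 'a.c' "$f")     # "." matches any single character
as_fixed=$(grep -F 'a.c' "$f")  # "." is an ordinary dot

printf '%s\n' "$as_regex" "$as_fixed"
rm -f "$f"
```

Fixed-string matching is handy when the search text comes from elsewhere and may contain characters such as . or * that should not be interpreted.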

Having covered these three text search commands, here are a few examples. Only practice makes the material stick: to really master regular expressions, and to learn Linux in general, theory is not enough; you must practice a lot.

Common grep examples

(1) Multiple file queries

grep "sort" *.grep    # matches are shown together with their file names

(2) Line matching: output the count of matching lines

grep -c "48" data.grep    # outputs the number of lines in the file that contain the characters 48

(3) Show matching lines with their line numbers

grep -n "48" data.grep    # shows all lines that match 48, with line numbers

(4) Show non-matching lines

grep -vn "48" data.grep    # outputs all lines that do not contain 48

(5) Case-insensitive matching

grep -i "ab" data.grep    # outputs all lines containing ab in any letter case, such as ab or Ab

(6) Application of regular expressions (note: it is best to enclose regular expressions in single quotes)

grep '[239].' data.grep    # outputs all lines containing 2, 3, or 9 followed by any one character

(7) Non-matching test

grep '^[^48]' data.grep    # outputs lines whose first character is neither 4 nor 8

(8) Using extended pattern matching

grep -E '219|216' data.grep    # outputs all lines containing 219 or 216
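
The examples above refer to a file named data.grep whose contents are not shown. As a sketch, the following creates a stand-in with invented contents in a temporary directory and runs several of the commands against it.

```shell
# Hypothetical stand-in for data.grep; the contents are made up.
d=$(mktemp -d)
printf '48 Dec\n483 Sept\n219 may\n216 sept\nAB\n' > "$d/data.grep"

count=$(grep -c '48' "$d/data.grep")         # number of lines containing 48
numbered=$(grep -n '48' "$d/data.grep")      # matching lines with line numbers
inverted=$(grep -vn '48' "$d/data.grep")     # non-matching lines, numbered
nocase=$(grep -i 'ab' "$d/data.grep")        # case-insensitive match
either=$(grep -E '219|216' "$d/data.grep")   # extended pattern: 219 or 216

printf '%s\n' "$count" "$numbered" "$inverted" "$nocase" "$either"
rm -rf "$d"
```

Single quotes are used around every pattern, following the note in example (6), so that the shell passes the pattern to grep unchanged.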

Finally, to borrow a line from Brother Ma to encourage everyone: do not pity yourself, just do good work; do not ask about the future, just get on with it.

This article is from the "Tongluowan" blog; please keep this source attribution: http://wuhf2015.blog.51cto.com/8213008/1627299
