Regular expressions and grep commands

Source: Internet
Author: User

First, Introduction

This article introduces two commands on Linux systems: grep and egrep. Using them requires regular expressions. Before turning to regular expressions, recall the wildcard characters that many readers already know from searching in Word:

*: Matches any string of characters, of any length.

?: Matches any single character.

Keep these two characters in mind: they also appear in regular expressions, but with different meanings.

1. Regular expressions (regexp)

A regular expression is a pattern written with special characters and ordinary text characters. Some of these characters do not stand for their literal meaning; instead they express control or wildcard functions. Such characters are called metacharacters.

Regular expressions fall into two categories according to the metacharacters they use:

Basic regular expressions: BRE

Extended regular expressions: ERE

A regular expression engine is the program that checks a given text against a regular expression pattern, which makes it well suited to searching large volumes of text.

2. The three grep commands

(1) grep: Global search REgular expression and Print out the line; supports basic regular expressions (BRE) by default;

(2) egrep: supports extended regular expressions (ERE);

(3) fgrep: does not support regular expressions; the pattern is treated as a fixed string;
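The difference between the three variants is easiest to see by running similar patterns through each of them. The following is a minimal sketch, assuming GNU grep; the sample text and variable names are made up for illustration, and grep -E and grep -F are used in place of egrep and fgrep (recent GNU grep releases print an obsolescence warning for the two older names).

```shell
# Hypothetical sample text: four short lines.
sample=$(printf '%s\n' 'abc' 'a.c' 'a+c' 'aac')

# BRE (plain grep): "." is a metacharacter, "+" is an ordinary character.
bre_dot=$(printf '%s\n' "$sample" | grep 'a.c')    # every line has a, any char, c
bre_plus=$(printf '%s\n' "$sample" | grep 'a+c')   # only the literal text a+c

# ERE (grep -E): "+" is a quantifier, so this means "one or more a, then c".
ere_plus=$(printf '%s\n' "$sample" | grep -E 'a+c')

# Fixed strings (grep -F): nothing is special, "a.c" matches only itself.
fixed=$(printf '%s\n' "$sample" | grep -F 'a.c')

printf 'BRE  a.c : %s\n' "$(echo $bre_dot)"
printf 'BRE  a+c : %s\n' "$bre_plus"
printf 'ERE  a+c : %s\n' "$ere_plus"
printf 'FIXED a.c: %s\n' "$fixed"
```

The same text `a+c` selects different lines depending on the variant, which is why it matters which of the three is in use.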

Second, the grep Command

Function: a text search tool. It checks the target text line by line against a user-specified pattern (the filter condition) and prints the lines that match;

Pattern: a filter condition written with text characters and regular expression metacharacters;

1. Syntax Format

grep [OPTIONS] PATTERN [FILE ...]

grep [OPTIONS] [-e PATTERN | -f FILE] [FILE ...]

2. Options

--color=auto: highlights the matched text in color;

-i, --ignore-case: ignores character case;

-o, --only-matching: displays only the matched text itself (by default the entire matching line is displayed);

-v, --invert-match: inverts the match, displaying the lines that do not match;

-E, --extended-regexp: supports extended regular expressions, equivalent to the egrep command; intended for use with grep and fgrep;

-F, --fixed-strings: equivalent to the fgrep command;

-q, --quiet, --silent: silent mode; prints nothing, and is typically used when only the command's exit status is needed;

-G, --basic-regexp: supports basic regular expressions; intended for use with egrep and fgrep;

-P, --perl-regexp: supports PCRE (Perl-compatible regular expressions), which provide many more metacharacters;

-e PATTERN, --regexp=PATTERN: specifies a pattern; can be given several times to match multiple patterns;

-f FILE, --file=FILE: FILE is a text file containing one pattern per line (a grep script); the patterns are read from the file and used for matching;

-c, --count: displays the number of matching lines;

-r, --recursive: matches the pattern against the contents of all files under a directory;

-A #, --after-context=#: also displays the # lines after each matching line;

-B #, --before-context=#: also displays the # lines before each matching line;

-C #, -#, --context=#: also displays the # lines before and after each matching line;

For example:

~]# grep 'root' /etc/passwd: basic search; if the pattern contains a shell variable, use double quotation marks instead;

~]# grep -v 'root' /etc/passwd: inverted match;

~]# grep -i 'root' /etc/passwd: ignores character case;

~]# grep -o 'root' /etc/passwd: displays only the matched text itself;

~]# grep -q 'root' /etc/passwd: silent mode;

~]# echo $?: displays the exit status of the previous command, which is how the result of silent mode is usually checked;
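The options above can be tried without touching the real /etc/passwd. The sketch below assumes GNU grep and uses a small temporary file with invented account lines; the file contents and variable names are illustrative only.

```shell
# Hypothetical stand-in for /etc/passwd so the example is reproducible.
passwd_sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' \
  'ROOT2:x:1000:1000::/home/ROOT2:/bin/bash' > "$passwd_sample"

plain=$(grep -c 'root' "$passwd_sample")      # lines containing lowercase "root"
nocase=$(grep -ci 'root' "$passwd_sample")    # -i also counts the "ROOT2" line
inverted=$(grep -vc 'root' "$passwd_sample")  # -v counts lines without "root"
only=$(grep -o 'root' "$passwd_sample")       # -o prints each match on its own line

# -q prints nothing; the exit status alone says whether a match was found.
if grep -q 'root' "$passwd_sample"; then found=yes; else found=no; fi
if grep -q 'nosuchuser' "$passwd_sample"; then missing=no; else missing=yes; fi

echo "plain=$plain nocase=$nocase inverted=$inverted found=$found missing=$missing"
rm -f "$passwd_sample"
```

Testing the exit status inside `if` is equivalent to running `grep -q` and then inspecting `$?`, but it keeps the check and the command together.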

For example:

~]# grep -e "r..t" -e "bash" /etc/passwd: uses multiple patterns;

~]# grep -f /root/test/mypat /etc/passwd: uses the patterns saved in a file;

~]# grep -A 1 "^[op]" /etc/passwd: also displays the line following each matching line;

~]# grep -B 2 "^[op]" /etc/passwd: also displays the two lines preceding each matching line;

~]# grep -C 1 "^[op]" /etc/passwd: also displays the line before and the line after each matching line;

The pattern should be enclosed in quotation marks: double quotes when it contains a shell variable that must be expanded, single quotes otherwise;
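Multiple patterns, pattern files, context lines, and the quoting rule can be checked together on a small input. This is a sketch under the assumption of GNU grep; both temporary files and all variable names are invented for the example.

```shell
data=$(mktemp)      # hypothetical input file
patterns=$(mktemp)  # hypothetical pattern file, one pattern per line
printf '%s\n' 'alpha' 'operator' 'beta' 'postfix' 'gamma' > "$data"
printf '%s\n' '^al' 'ma$' > "$patterns"

multi=$(grep -e '^o' -e '^p' "$data")     # a line matches if either pattern matches
from_file=$(grep -f "$patterns" "$data")  # patterns are read from the file
after=$(grep -A 1 '^o' "$data")           # the match plus the line after it
before=$(grep -B 1 '^p' "$data")          # the match plus the line before it

# Double quotes: the shell expands $word before grep sees the pattern.
word=beta
expanded=$(grep "^$word" "$data")
# Single quotes: grep receives the text $word unchanged (-F makes it a plain
# string); nothing matches, so "|| true" absorbs the non-zero exit status.
unexpanded=$(grep -cF '$word' "$data" || true)

echo "$multi"; echo "$from_file"; echo "$after"; echo "$before"
echo "expanded=$expanded unexpanded=$unexpanded"
rm -f "$data" "$patterns"
```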

Third, Basic Regular Expression Metacharacters

1. Character Matching

.: matches any single character;

[]: matches any single character within the specified range (same as the glob mechanism);

[^]: matches any single character outside the specified range;

[:digit:]: any single digit;

[:lower:]: any single lowercase letter;

[:upper:]: any single uppercase letter;

[:alpha:]: any single letter;

[:alnum:]: any single letter or digit;

[:space:]: any single whitespace character;

[:blank:]: any single space or tab;

[:punct:]: any single punctuation mark;

[:cntrl:]: any single control character;

[:graph:]: any single visible (graphic) character;

[:print:]: any single printable character;

[:xdigit:]: any single hexadecimal digit;

These class names are written inside a bracket expression, for example [[:digit:]] or [^[:space:]]. Run man 7 glob to view the character classes;

For example:

~]# ifconfig | grep "r..": lines containing r followed by any two characters;

~]# ifconfig | grep -i "i[a-z][a-z]": case-insensitive; lines containing i followed by two letters;

~]# ifconfig | grep "i[[:alpha:]][[:space:]]": lines containing i followed by a letter and then a whitespace character;
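Because ifconfig output differs from machine to machine, the character classes are easier to verify on fixed input. The sketch below assumes GNU grep; the four sample lines only imitate ifconfig output and are not real.

```shell
# Hypothetical lines loosely imitating ifconfig output.
lines=$(printf '%s\n' 'eth0: flags=4163' 'inet 192.168.1.10' 'lo: LOOPBACK' 'link is UP')

# Class names go inside an outer pair of brackets: [[:upper:]], not [:upper:].
upper=$(printf '%s\n' "$lines" | grep '[[:upper:]]')

# Classes can be combined and negated within one bracket expression:
# any character that is neither a letter, a digit, nor whitespace.
other=$(printf '%s\n' "$lines" | grep '[^[:alnum:][:space:]]')

# i, then one letter, then one whitespace character (the "is " in the last line).
i_letter_space=$(printf '%s\n' "$lines" | grep 'i[[:alpha:]][[:space:]]')

echo "$upper"; echo "$other"; echo "$i_letter_space"
```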

2. Number of matches

A quantifier is placed after a character to limit how many times that preceding character may occur; matching is greedy by default;

*: matches the preceding character any number of times (0, 1 or more);

grep "x*y": matches any line that contains y;

xxxyabc

yab

abcxy

abcy

.*: matches any characters of any length, equivalent to * in glob;

grep "x.*y": matches lines in which x is followed by y, with any characters of any length in between;

xxxyabc

abcxy

\+: matches the preceding character at least once (1 or more times); \ is the escape character;

grep "x\+y": at least one x must appear immediately before y;

xxxyabc

abcxy

\?: matches the preceding character 0 or 1 times, that is, the preceding character is optional;

grep "x\?y": matches any line that contains y;

xxxyabc

yab

abcxy

abcy

\{m\}: matches the preceding character exactly m times; m is a non-negative integer;

grep "x\{2\}y": matches when x appears 2 times immediately before y;

xxxyabc

\{m,n\}: matches the preceding character at least m times and at most n times (the closed interval [m,n]); m and n are non-negative integers;

\{0,n\}: at most n times;

\{m,\}: at least m times;

For example:

~]# ifconfig | grep "i[[:alpha:]]\{3\}": matches lines containing i followed by 3 letters;

~]# ifconfig | grep "i[[:alpha:]]\{3,\}": matches lines containing i followed by at least 3 letters;
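The quantifiers can be compared side by side on the four sample lines used earlier in this section. This is a minimal sketch assuming GNU grep, where \+ and \? are available in basic regular expressions as extensions; the variable names are illustrative.

```shell
sample=$(printf '%s\n' 'xxxyabc' 'yab' 'abcxy' 'abcy')

star=$(printf '%s\n' "$sample" | grep 'x*y')           # zero x is allowed: every line with y
plus=$(printf '%s\n' "$sample" | grep 'x\+y')          # needs at least one x before y
exact=$(printf '%s\n' "$sample" | grep 'x\{2\}y')      # needs xx directly before y
atleast=$(printf '%s\n' "$sample" | grep 'x\{3,\}y')   # needs three or more x before y

# -o shows what was actually matched: x* is greedy and takes all available x,
# while x\? takes at most one.
greedy=$(printf '%s\n' "$sample" | grep -o 'x*y')
optional=$(printf '%s\n' "$sample" | grep -o 'x\?y')

echo "$greedy"
echo "$optional"
```

On the first line, `x*y` reports `xxxy` while `x\?y` reports only `xy`, which illustrates the difference between "any number" and "at most one".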

3. Position anchoring

Anchors restrict where in the target text the pattern is allowed to match;

^: line-start anchor; placed at the leftmost side of the pattern: ^pattern;

$: line-end anchor; placed at the rightmost side of the pattern: pattern$;

^pattern$: makes PATTERN match an entire line exactly;

^$: matches an empty line;

^[[:space:]]*$: matches a blank line (empty, or containing only whitespace);

For example:

~]# grep "^r..t" /etc/passwd: matches lines that begin with r, followed by any two characters, followed by t;

~]# grep "l.\{3\}n" /etc/passwd: matches lines containing l followed by any 3 characters and then n;

~]# grep "l.\{3\}n$" /etc/passwd: matches lines that end with l followed by any 3 characters and then n;

~]# grep "^l.\{3\}n$" /etc/passwd: matches only lines that consist entirely of l, any 3 characters, and n;

~]# grep "[[:space:]]\+" /etc/passwd: matches lines containing at least one whitespace character;
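The effect of each anchor can be counted on a small file that contains an empty line and a whitespace-only line. The sketch assumes GNU grep; the file contents and variable names are invented for illustration.

```shell
sample_file=$(mktemp)   # hypothetical input: five lines, one empty, one of spaces
printf '%s\n' 'root line' '' '   ' 'not root' 'root' > "$sample_file"

at_start=$(grep -c '^root' "$sample_file")      # "root line" and "root"
at_end=$(grep -c 'root$' "$sample_file")        # "not root" and "root"
whole=$(grep -c '^root$' "$sample_file")        # only the line that is exactly "root"
empty=$(grep -c '^$' "$sample_file")            # only the truly empty line
blank=$(grep -c '^[[:space:]]*$' "$sample_file")  # empty or whitespace-only lines

# A common use: drop blank lines by inverting the blank-line pattern.
nonblank=$(grep -v '^[[:space:]]*$' "$sample_file")

echo "start=$at_start end=$at_end whole=$whole empty=$empty blank=$blank"
echo "$nonblank"
rm -f "$sample_file"
```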

Word: a continuous string of non-special characters (letters, digits, and underscores) is called a word;

\< or \b: word-start anchor, placed at the left side of the word pattern: \<pattern, \bpattern;

\> or \b: word-end anchor, placed at the right side of the word pattern: pattern\>, pattern\b;

For example:

~]# grep "\<r..t" /etc/passwd: matches at the start of a word: r followed by any two characters and then t;

~]# grep "\<r..t\>" /etc/passwd: matches a whole word: r followed by any two characters and then t;

~]# ifconfig | grep "\<[0-9]\{3\}\>": matches a whole word made of exactly three digits;
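Word anchors are easiest to understand next to near-misses. The following sketch assumes GNU grep, which supports \< and \>; the sample lines and variable names are made up.

```shell
text=$(printf '%s\n' 'root user' 'chroot jail' 'rooted tree' 'port 8080 and 443')

# \< requires a word boundary on the left, so "chroot" is rejected.
word_start=$(printf '%s\n' "$text" | grep '\<root')

# Adding \> also requires a boundary on the right, so "rooted" is rejected too.
whole_word=$(printf '%s\n' "$text" | grep '\<root\>')

# Exactly three digits as a complete word: 8080 has four digits and is skipped.
three_digits=$(printf '%s\n' "$text" | grep -o '\<[0-9]\{3\}\>')

echo "$word_start"; echo "$whole_word"; echo "$three_digits"
```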

4. Grouping and referencing

\(pattern\): the characters matched by this PATTERN are treated as a single unit;

Note: the strings matched by the patterns inside grouping parentheses are recorded automatically by the regular expression engine in internal variables named \1, \2, \3, ...

Example: pat1\(pat2\)pat3\(pat4\(pat5\)pat6\)

\n: the string matched by the pattern between the nth opening parenthesis (counting from the left) and its matching closing parenthesis (the matched text, not the pattern itself);

\1: the string matched by the pattern in the first pair of parentheses; in the previous example: pat2

\2: the string matched by the pattern in the second pair of parentheses; in the previous example: pat4\(pat5\)pat6

\3: the string matched by the pattern in the third pair of parentheses; in the previous example: pat5

...

Example:

He love his lover

He like his lover

He love his liker

He like his liker

.*l..e.*l..er

\(l..e\).*\1r

~]# grep -o 'l..e.*l..er' fenzu_pattern.txt: cannot match precisely, because the two words are allowed to differ;

~]# grep -o '\(l..e\).*\1r' fenzu_pattern.txt: with the group and back reference, only lines in which the same word recurs are matched;

Back reference: refers to the string matched by the pattern in an earlier pair of parentheses;
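The four sample sentences above are enough to reproduce the difference between the two patterns. This sketch assumes GNU grep and writes the sentences to a temporary file that stands in for fenzu_pattern.txt; the variable names are illustrative.

```shell
fenzu=$(mktemp)   # temporary stand-in for fenzu_pattern.txt
printf '%s\n' \
  'He love his lover' \
  'He like his lover' \
  'He love his liker' \
  'He like his liker' > "$fenzu"

# Without a back reference the two l..e words are independent: all lines match.
loose=$(grep -c 'l..e.*l..er' "$fenzu")

# \1 must repeat exactly what \(l..e\) matched, so love must pair with lover
# and like with liker.
strict=$(grep -o '\(l..e\).*\1r' "$fenzu")

echo "loose matches: $loose"
echo "$strict"
rm -f "$fenzu"
```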

Fourth, the egrep Command

A grep command that supports extended regular expressions; equivalent to grep -E;

1. Syntax Format

egrep [OPTIONS] PATTERN [FILE ...]

2. Options

The options are the same as for grep;

Metacharacters of extended regular expressions (no escape character is required before ?, +, {}, () and |):

1. Character Matching

.: matches any single character;

[]: matches any single character within the specified range;

[^]: matches any single character outside the specified range;

[:digit:]: any single digit;

[:lower:]: any single lowercase letter;

[:upper:]: any single uppercase letter;

[:alpha:]: any single letter;

[:alnum:]: any single letter or digit;

[:space:]: any single whitespace character;

[:blank:]: any single space or tab;

[:punct:]: any single punctuation mark;

2. Number of matches

*: matches the preceding character any number of times (0, 1 or more);

?: matches the preceding character 0 or 1 times, that is, the preceding character is optional;

+: matches the preceding character at least once (1 or more times);

{m}: matches the preceding character exactly m times; m is a non-negative integer;

{m,n}: matches the preceding character at least m times and at most n times (the closed interval [m,n]); m and n are non-negative integers;

{0,n}: at most n times;

{m,}: at least m times;
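The extended quantifiers behave like their escaped basic counterparts; only the spelling differs. The sketch below, assuming GNU grep, runs each pair on the same invented input and stores the results so they can be compared.

```shell
sample=$(printf '%s\n' 'xxxyabc' 'yab' 'abcxy' 'abcy')

# One or more x before y: \+ in BRE, plain + in ERE.
bre_plus=$(printf '%s\n' "$sample" | grep 'x\+y')
ere_plus=$(printf '%s\n' "$sample" | grep -E 'x+y')

# Between two and three x before y: \{2,3\} in BRE, {2,3} in ERE.
bre_range=$(printf '%s\n' "$sample" | grep -o 'x\{2,3\}y')
ere_range=$(printf '%s\n' "$sample" | grep -E -o 'x{2,3}y')

[ "$bre_plus" = "$ere_plus" ] && echo "plus forms agree"
[ "$bre_range" = "$ere_range" ] && echo "range forms agree: $ere_range"
```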

3. Position anchoring

^: line-start anchor; placed at the leftmost side of the pattern: ^pattern;

$: line-end anchor; placed at the rightmost side of the pattern: pattern$;

^pattern$: makes PATTERN match an entire line exactly;

^$: matches an empty line;

\<, \b: word-start anchor, placed at the left side of the word pattern: \<pattern, \bpattern;

\>, \b: word-end anchor, placed at the right side of the word pattern: pattern\>, pattern\b;

4. Grouping and referencing

(pattern): grouping; the characters matched by the pattern in parentheses are stored in internal variables of the regular expression engine;

Back references: \1, \2, \3, ...

5. Or

a|b: matches a or b;

C|cat: matches C or cat;

(C|c)at: matches Cat or cat;
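The scope of | is the point most often misread: without parentheses it splits the whole pattern. The sketch below assumes GNU grep (back references in extended regular expressions are a GNU extension rather than part of POSIX ERE); the word list and variable names are invented.

```shell
animals=$(printf '%s\n' 'Cat' 'cat' 'C' 'bat' 'concat')

# C|cat means "C" or "cat": the lone "C" line matches as well.
ungrouped=$(printf '%s\n' "$animals" | grep -E 'C|cat')

# (C|c)at limits the alternation to the first letter: "Cat" or "cat" anywhere.
grouped=$(printf '%s\n' "$animals" | grep -E '(C|c)at')

# Anchors around the group restrict the match to whole lines.
whole=$(printf '%s\n' "$animals" | grep -E '^(C|c)at$')

# Groups are unescaped in ERE, and \1 still refers back to the first group.
repeated=$(printf '%s\n' 'abab' 'abba' | grep -E '(ab)\1')

echo "$ungrouped"; echo "$grouped"; echo "$whole"; echo "$repeated"
```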

