First, Introduction
This article introduces two commands on Linux systems: grep and egrep. To use these two commands you first need to learn regular expressions. Before introducing regular expressions, recall the wildcard characters that most people know from searching in Word, namely:
*: matches any string of characters of any length.
?: matches any single character.
Keep in mind that these two characters also appear in regular expressions, but with different meanings.
1. Regular expressions: Regular Expression, regexp
A pattern written with a class of special characters together with ordinary text characters; some of these characters do not stand for their literal meaning but instead express control or wildcard functions;
Regular expression metacharacters fall into two categories:
Basic regular expressions: BRE
Extended regular expressions: ERE
A regular expression engine is a program that uses a regular expression pattern to analyze a given text, and it has a clear advantage when searching text at the scale of hundreds of millions of units.
2. The three grep commands
(1) grep: Global search REgular expression and Print out the line; supports basic regular expressions (BRE) by default;
(2) egrep: supports extended regular expressions (ERE);
(3) fgrep: does not support regular expressions; the pattern is searched for as a fixed string;
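The difference between the three commands can be seen with a single pattern that contains a +. The sketch below assumes GNU grep and uses a made-up two-line sample; it uses grep -E and grep -F, which the options section describes as equivalent to egrep and fgrep.

```shell
# Hypothetical sample: one line holds a literal "a+b", the other "aab".
sample() { printf 'a+b\naab\n'; }

# grep (BRE): an unescaped + is an ordinary character, so only "a+b" matches.
bre=$(sample | grep 'a+b')

# grep -E (ERE, like egrep): + means "one or more", so only "aab" matches.
ere=$(sample | grep -E 'a+b')

# grep -F (like fgrep): the pattern is a fixed string, so only "a+b" matches.
fixed=$(sample | grep -F 'a+b')

echo "BRE:   $bre"
echo "ERE:   $ere"
echo "fixed: $fixed"
```

The same pattern text therefore selects different lines depending on which of the three interpretations is in effect.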
Second, the grep Command
Function: a text search tool that checks the target text line by line against a user-specified pattern (filter condition) and prints the lines that match;
Pattern: a filter condition written with text characters and regular expression metacharacters;
1. Syntax format
grep [OPTIONS] PATTERN [FILE ...]
grep [OPTIONS] [-e PATTERN | -f FILE] [FILE ...]
2. Options
--color=auto: Color highlighting of matched text;
-i, --ignore-case: ignore character case;
-o, --only-matching: display only the matched text itself (by default the entire matching line is displayed);
-v, --invert-match: inverted match; display the lines that do not match;
-E, --extended-regexp: use extended regular expressions, equivalent to the egrep command; useful with grep and fgrep;
-F, --fixed-strings: equivalent to the fgrep command;
-q, --quiet, --silent: quiet mode; no output is produced; typically used when only the command's exit status is needed;
-G, --basic-regexp: use basic regular expressions; useful with egrep and fgrep;
-P, --perl-regexp: use PCRE regular expressions (supports many more metacharacters, very powerful);
-e PATTERN, --regexp=PATTERN: specify a pattern; can be given several times to use multiple patterns;
-f FILE, --file=FILE: FILE is a text file containing one pattern per line, that is, a grep script; the patterns are written in the file and read from it for matching;
-c, --count: display the number of matching lines;
-r, --recursive: match the contents of all files under a directory against the pattern;
-A #, --after-context=#: also display the # lines after each matching line;
-B #, --before-context=#: also display the # lines before each matching line;
-C #, -#, --context=#: also display the # lines before and after each matching line;
For example:
~]# grep 'root' /etc/passwd: searches for root; if the pattern contains a variable, use double quotation marks instead;
~]# grep -v 'root' /etc/passwd: inverted match;
~]# grep -i 'root' /etc/passwd: ignores character case;
~]# grep -o 'root' /etc/passwd: displays only the matched text itself;
~]# grep -q 'root' /etc/passwd: quiet mode;
~]# echo $?: displays the exit status of the previous command, which is what quiet mode is usually used for;
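To try these options without depending on the local /etc/passwd, the following sketch runs them against a small made-up passwd-style sample. The sample lines and variable names are assumptions for illustration only; GNU grep is assumed.

```shell
# Hypothetical passwd-style sample (not a real /etc/passwd).
users='root:x:0:0:root:/root:/bin/bash
ROOT2:x:1:1::/home/r2:/bin/sh
daemon:x:2:2::/sbin:/sbin/nologin'

# -i with -c: count the lines containing "root" in any letter case.
count_any_case=$(printf '%s\n' "$users" | grep -ic 'root')

# -v with -c: count the lines that do NOT contain lowercase "root".
count_inverted=$(printf '%s\n' "$users" | grep -vc 'root')

# -o: print only the matched text (here: "r" followed by one digit).
only_match=$(printf '%s\n' "$users" | grep -o 'r[0-9]')

# A variable inside the pattern needs double quotes so the shell expands it.
name='daemon'
by_variable=$(printf '%s\n' "$users" | grep "^$name:")

# -q: nothing is printed; only the exit status (0 = found, 1 = not found) is used.
if printf '%s\n' "$users" | grep -q 'nobody'; then
    has_nobody=yes
else
    has_nobody=no
fi

echo "$count_any_case $count_inverted $only_match $has_nobody"
echo "$by_variable"
```

Testing grep -q inside an if statement is the usual way to consume the exit status directly, instead of reading $? afterwards.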
For example:
~]# grep -e "r..t" -e "bash" /etc/passwd: uses multiple patterns;
~]# grep -f /root/test/mypat /etc/passwd: uses the patterns saved in a file;
~]# grep -A 1 "^[op]" /etc/passwd: also displays the line after each matching line;
~]# grep -B 2 "^[op]" /etc/passwd: also displays the two lines before each matching line;
~]# grep -C 1 "^[op]" /etc/passwd: also displays one line before and one line after each matching line;
Enclose the pattern in quotation marks, using double quotes when it contains variables;
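The multi-pattern and context options can be checked the same way. This is a minimal sketch on a made-up sample; the temporary pattern file created with mktemp stands in for a file such as /root/test/mypat, and GNU grep is assumed.

```shell
# Hypothetical passwd-style sample (not a real /etc/passwd).
users='root:x:0:0:root:/root:/bin/bash
ROOT2:x:1:1::/home/r2:/bin/sh
daemon:x:2:2::/sbin:/sbin/nologin'

# -e given twice: a line is selected if it matches either pattern.
multi=$(printf '%s\n' "$users" | grep -c -e 'r..t' -e 'nologin')

# -f: read the same two patterns from a file, one pattern per line.
patfile=$(mktemp)
printf 'r..t\nnologin\n' > "$patfile"
from_file=$(printf '%s\n' "$users" | grep -c -f "$patfile")
rm -f "$patfile"

# -B 1: each matching line plus the line before it (only user names are kept).
before=$(printf '%s\n' "$users" | grep -B 1 '^daemon' | cut -d: -f1)

# -A 1: each matching line plus the line after it.
after=$(printf '%s\n' "$users" | grep -A 1 '^root' | cut -d: -f1)

echo "multi=$multi from_file=$from_file"
echo "before: $before"
echo "after: $after"
```

Passing the patterns with -e or reading them with -f selects the same lines; the file form is mainly convenient when the list of patterns is long or reused.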
Third, Basic Regular Expression Metacharacters
1. Character matching
. : matches any single character;
[]: matches any single character within the specified range (same as the glob mechanism);
[^]: matches any single character outside the specified range;
[:digit:]: any single digit;
[:lower:]: any single lowercase letter;
[:upper:]: any single uppercase letter;
[:alpha:]: any single letter;
[:alnum:]: any single letter or digit;
[:space:]: any single whitespace character;
[:blank:]: any single space or tab;
[:punct:]: any single punctuation mark;
[:cntrl:]: any single control character;
[:graph:]: any single visible character;
[:print:]: any single printable character;
[:xdigit:]: any single hexadecimal digit;
These class names are written inside a bracket expression, for example [[:digit:]]; see man 7 glob for the character classes;
For example:
~]# ifconfig | grep "r..": lines containing r followed by any two characters;
~]# ifconfig | grep -i "i[a-z][a-z]": case-insensitive; lines containing i followed by two letters;
~]# ifconfig | grep "i[[:alpha:]][[:space:]]": lines containing i followed by a letter and then a whitespace character;
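Because ifconfig output differs from machine to machine, the following sketch applies the same kinds of patterns to a fixed, made-up three-line sample so that the results are predictable. The sample text and variable names are illustrative assumptions.

```shell
# Hypothetical sample lines standing in for ifconfig output.
sample() { printf 'inet 10.0.0.1\nif up\ni7 core\n'; }

# i, then one letter, then one whitespace character: only "if up" qualifies.
alpha_space=$(sample | grep 'i[[:alpha:]][[:space:]]')

# Count the lines that contain at least one digit.
digit_lines=$(sample | grep -c '[[:digit:]]')

# [^...] negates the set: i followed by a character that is NOT a letter.
not_alpha=$(sample | grep 'i[^[:alpha:]]')

echo "$alpha_space"
echo "$digit_lines"
echo "$not_alpha"
```

Note the doubled brackets: the outer pair is the bracket expression and the inner [:alpha:] is the class name placed inside it.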
2. Number of matches
Placed after the character whose number of occurrences is to be limited, to specify how many times that preceding character may appear; works in greedy mode by default;
*: matches the preceding character any number of times (0, 1, or more);
grep "x*y": any line that contains y matches;
xxxyabc
yab
abcxy
abcy
.*: matches any characters of any length, equivalent to * in glob;
grep "x.*y": matches x and y with any characters, of any length, between them;
xxxyabc
abcxy
\+: matches the preceding character at least once (1 or more times); \ is the escape character;
grep "x\+y": at least one x must appear immediately before y;
xxxyabc
abcxy
\?: matches the preceding character 0 or 1 times, that is, the preceding character is optional;
grep "x\?y": any line that contains y matches;
xxxyabc
yab
abcxy
abcy
\{m\}: matches the preceding character exactly m times; m is a non-negative integer;
grep "x\{2\}y": matches when x appears twice immediately before y;
xxxyabc
\{m,n\}: matches the preceding character at least m and at most n times; m and n are non-negative integers; closed interval [m,n]
\{0,n\}: at most n times;
\{m,\}: at least m times;
For example:
~]# ifconfig | grep "i[[:alpha:]]\{3\}": matches lines containing i followed by 3 letters;
~]# ifconfig | grep "i[[:alpha:]]\{3,\}": matches lines containing i followed by at least 3 letters;
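The quantifier examples above can be reproduced on the four sample lines used in this section. This is a sketch assuming GNU grep, where \+ and \? are available in basic regular expressions; the variable names are illustrative.

```shell
# The four sample lines used in the quantifier examples.
sample() { printf 'xxxyabc\nyab\nabcxy\nabcy\n'; }

star=$(sample | grep -c 'x*y')        # x zero or more times: every line with y
dot_star=$(sample | grep -c 'x.*y')   # an x, anything, then a y
plus=$(sample | grep -c 'x\+y')       # at least one x directly before y
optional=$(sample | grep -c 'x\?y')   # x is optional: every line with y
exact=$(sample | grep -o 'x\{2\}y')   # exactly two x directly before y

# Greedy matching: x* takes as many x as it can, so -o prints all three.
greedy=$(printf 'xxxyabc\n' | grep -o 'x*y')

echo "$star $dot_star $plus $optional $exact $greedy"
```

Using -o makes the greedy behavior visible: the whole line matches either way, but the matched text itself is the longest possible run.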
3. Position anchoring
Restricts where in the target text the text matched by the pattern may appear;
^: beginning-of-line anchor; placed at the leftmost side of the pattern: ^PATTERN;
$: end-of-line anchor; placed at the rightmost side of the pattern: PATTERN$;
^PATTERN$: makes PATTERN match an entire line exactly;
^$: matches an empty line;
^[[:space:]]*$: matches a blank line (empty, or containing only whitespace characters);
For example:
~]# grep "^r..t" /etc/passwd: matches lines that begin with r, followed by two characters, followed by t;
~]# grep "l.\{3\}n" /etc/passwd: matches lines containing l followed by 3 characters and then n;
~]# grep "l.\{3\}n$" /etc/passwd: matches lines that end with l followed by 3 characters and then n;
~]# grep "^l.\{3\}n$" /etc/passwd: matches only lines that consist of l, followed by 3 characters, followed by n;
~]# grep "[[:space:]]\+" /etc/passwd: matches lines containing one or more consecutive whitespace characters;
Word: a continuous run of characters (a string) made up of non-special characters is called a word;
\< or \b: word-start anchor, placed on the left side of a word pattern, in the form \<PATTERN or \bPATTERN;
\> or \b: word-end anchor, placed on the right side of a word pattern, in the form PATTERN\> or PATTERN\b;
For example:
~]# grep "\<r..t" /etc/passwd: matches at the start of a word: r followed by two characters and then t;
~]# grep "\<r..t\>" /etc/passwd: matches a whole word: r followed by two characters and then t;
~]# ifconfig | grep "\<[0-9]\{3\}\>": matches a whole word consisting of three digits;
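The effect of the line and word anchors is easiest to see on lines that all contain the same letters in different positions. The following sketch uses a made-up five-line sample (including one empty line and one line of spaces) and assumes GNU grep for \< and \>.

```shell
# Hypothetical sample: three entries, one empty line, one line of three spaces.
sample() { printf 'root:x:0:0\nchroot:x:5:5\nrooter:x:6:6\n\n   \n'; }

# ^ anchors at the start of the line: "chroot" no longer qualifies.
line_start=$(sample | grep -c '^root')

# \< and \> require word boundaries on both sides: only the word "root" itself.
whole_word=$(sample | grep '\<root\>')

# $ anchors at the end of the line.
line_end=$(sample | grep '5$')

# ^$ matches only the truly empty line.
empty=$(sample | grep -c '^$')

# ^[[:space:]]*$ also matches the line that holds only spaces.
blank=$(sample | grep -c '^[[:space:]]*$')

echo "$line_start / $whole_word / $line_end / $empty / $blank"
```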
4. Grouping and referencing
\(PATTERN\): the characters matching PATTERN are treated as a single indivisible unit;
Note: the text matched by the pattern inside each pair of grouping parentheses is automatically recorded by the regular expression engine in internal variables, named \1, \2, \3, ...
Example: pat1\(pat2\)pat3\(pat4\(pat5\)pat6\)
\n: the string matched by the pattern between the nth opening parenthesis in the pattern and its matching closing parenthesis (not the pattern itself, but the result of the match)
\1: the string matched by the pattern in the first pair of parentheses; in the previous example: pat2
\2: the string matched by the pattern in the second pair of parentheses; in the previous example: pat4\(pat5\)pat6
\3: the string matched by the pattern in the third pair of parentheses; in the previous example: pat5
...
Example:
He love his lover
He like his lover
He love his liker
He like his liker
.*l..e.*l..er
\(l..e\).*\1r
~]# grep -o 'l..e.*l..er' fenzu_pattern.txt: cannot require the two words to be the same, so all four lines match;
~]# grep -o '\(l..e\).*\1r' fenzu_pattern.txt: with the group and back reference, only the lines in which both words agree match (the first and the fourth);
Back reference: refers to the string matched by the pattern inside an earlier pair of parentheses;
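The four example lines can be fed to grep directly, without creating fenzu_pattern.txt. This minimal sketch (GNU grep assumed, variable names illustrative) compares the pattern without a group against the one with a back reference.

```shell
# The four example lines from this section.
sample() {
    printf 'He love his lover\nHe like his lover\nHe love his liker\nHe like his liker\n'
}

# Without a group, the two l..e parts are independent: all four lines match.
loose=$(sample | grep -c 'l..e.*l..er')

# With \(...\) and \1, the second word must repeat what the group matched.
strict=$(sample | grep '\(l..e\).*\1r')

echo "$loose"
echo "$strict"
```

The back reference stands for the text that was actually matched (love or like), not for the pattern l..e, which is why the mixed lines are rejected.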
Fourth, the egrep Command
A grep command that supports extended regular expressions; equivalent to grep -E;
1. Syntax format
egrep [OPTIONS] PATTERN [FILE ...]
2. Options
The options are the same as for grep;
Metacharacters of extended regular expressions (no escape characters required):
1. Character matching
. : matches any single character;
[]: matches any single character within the specified range;
[^]: matches any single character outside the specified range;
[:digit:]: any single digit;
[:lower:]: any single lowercase letter;
[:upper:]: any single uppercase letter;
[:alpha:]: any single letter;
[:alnum:]: any single letter or digit;
[:space:]: any single whitespace character;
[:blank:]: any single space or tab;
[:punct:]: any single punctuation mark;
2. Number of matches
*: matches the preceding character any number of times (0, 1, or more);
?: matches the preceding character 0 or 1 times, that is, the preceding character is optional;
+: matches the preceding character at least once (1 or more times);
{m}: matches the preceding character exactly m times; m is a non-negative integer;
{m,n}: matches the preceding character at least m and at most n times; m and n are non-negative integers; closed interval [m,n]
{0,n}: at most n times;
{m,}: at least m times;
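To show that the extended forms only drop the backslashes, the following sketch runs each quantifier once in BRE and once in ERE on the sample lines from the basic regular expression section. It assumes GNU grep and uses grep -E in place of egrep, which this article describes as equivalent.

```shell
# Same sample lines as in the basic regular expression section.
sample() { printf 'xxxyabc\nyab\nabcxy\nabcy\n'; }

# "One or more": \+ in BRE, plain + in ERE.
bre_plus=$(sample | grep 'x\+y')
ere_plus=$(sample | grep -E 'x+y')

# Interval: \{2,3\} in BRE, {2,3} in ERE; -o prints just the matched text.
bre_range=$(sample | grep -o 'x\{2,3\}y')
ere_range=$(sample | grep -Eo 'x{2,3}y')

# "Zero or one": plain ? in ERE; every line containing y is counted.
ere_optional=$(sample | grep -Ec 'x?y')

echo "$bre_plus" ; echo "$ere_plus"
echo "$bre_range $ere_range $ere_optional"
```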
3. Position anchoring
^: beginning-of-line anchor; placed at the leftmost side of the pattern: ^PATTERN;
$: end-of-line anchor; placed at the rightmost side of the pattern: PATTERN$;
^PATTERN$: makes PATTERN match an entire line exactly;
^$: matches an empty line;
\<, \b: word-start anchor, placed on the left side of a word pattern, in the form \<PATTERN or \bPATTERN;
\>, \b: word-end anchor, placed on the right side of a word pattern, in the form PATTERN\> or PATTERN\b;
4. Grouping and referencing
(PATTERN): grouping; the characters matched by the pattern in parentheses are stored in a variable inside the regular expression engine;
Back reference: \1, \2, \3, ...
5. Or
a|b: a or b
C|cat: denotes C or cat;
(C|c)at: denotes Cat or cat;
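The difference between C|cat and (C|c)at, and the unescaped group with a back reference, can be checked with a short sketch. The sample words are made up for illustration; back references in extended regular expressions are assumed to behave as in GNU grep.

```shell
# Hypothetical sample words.
sample() { printf 'cat\nCat\nC\ndog\n'; }

# C|cat: the alternation covers the whole pattern, so a lone "C" also matches.
loose=$(sample | grep -Ec 'C|cat')

# (C|c)at: the group limits the alternation to the first letter.
grouped=$(sample | grep -E '(C|c)at')

# Unescaped parentheses form a group; \1 repeats the text it matched.
backref=$(printf 'abab\nabba\n' | grep -E '(ab)\1')

echo "$loose"
echo "$grouped"
echo "$backref"
```

Grouping is what keeps the | operator from splitting the entire pattern into two alternatives.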
Regular expressions and grep commands