grep (Global search Regular Expression and Print out the line) is a powerful text search tool that uses regular expressions to search text and print the matching lines. The Unix grep family includes grep, egrep, and fgrep. The egrep and fgrep commands differ only slightly from grep. egrep is an extension of grep that supports more regular expression metacharacters. fgrep is fixed grep or fast grep: it treats every character as a literal, that is, the metacharacters of a regular expression revert to their literal meaning and are no longer special. Linux uses the GNU version of grep. It is more powerful and provides the egrep and fgrep functionality through the -E and -F command-line options; -G selects the default basic syntax.
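As a rough illustration of how the three modes differ, the sketch below runs similar patterns through the default syntax, -F, and -E. The sample file and its contents are made up for the demonstration, and GNU grep is assumed.

```shell
# Hypothetical sample data, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'a.c' 'abc' 'a+c' 'aac' > "$sample"

# Default (basic) syntax: "." is a metacharacter that matches any character,
# so all four lines match.
basic=$(grep 'a.c' "$sample")

# -F: every character is literal, so only the line containing "a.c" matches.
fixed=$(grep -F 'a.c' "$sample")

# -E: "+" is a repetition operator, so 'a+c' means one or more a, then c.
extended=$(grep -E 'a+c' "$sample")

printf 'basic:\n%s\nfixed:\n%s\nextended:\n%s\n' "$basic" "$fixed" "$extended"
rm -f "$sample"
```

With -F only the literal line a.c is printed, while with -E the pattern a+c selects aac rather than the line that contains a literal plus sign.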
grep works by searching for a pattern in one or more files. If the pattern includes spaces, it must be quoted; all strings after the pattern are treated as file names. The results of the search are sent to standard output (the screen by default) and the contents of the original files are not affected.
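A small sketch of the quoting rule, using a made-up temporary file: the quoted string is passed to grep as a single pattern, and the file itself is left unchanged.

```shell
# Hypothetical file used only for this demonstration.
notes=$(mktemp)
printf '%s\n' 'hello world' 'hello' 'world' > "$notes"

# Quoted: the whole string "hello world" is the pattern, "$notes" is the file.
quoted=$(grep 'hello world' "$notes")
echo "$quoted"

# Unquoted, grep would take "hello" as the pattern and treat "world"
# as the name of a file to search:
#   grep hello world "$notes"

# The search only reads the file; it still has its three lines.
lines_after=$(grep -c '' "$notes")
rm -f "$notes"
```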
grep is well suited to shell scripts because it reports the outcome of the search through its exit status: it returns 0 if the pattern is found, 1 if it is not found, and 2 if an error occurs, for example when the searched file does not exist. These return values can be used for automated text processing.
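The three status values can be observed with a short script such as the following. It is only a sketch: the log file, its contents, and the helper function grep_status are invented for the example, and it uses the -q option (described in section 5) to suppress output.

```shell
# Hypothetical log file for the demonstration.
log=$(mktemp)
printf '%s\n' 'INFO start' 'ERROR disk full' > "$log"

# Hypothetical helper: print grep's exit status for a pattern and a file.
grep_status() {
    if grep -q "$1" "$2" 2>/dev/null; then echo 0; else echo $?; fi
}

found=$(grep_status 'ERROR' "$log")          # a line matches
missing=$(grep_status 'FATAL' "$log")        # no line matches
failed=$(grep_status 'ERROR' "$log.absent")  # the file does not exist

echo "found=$found missing=$missing failed=$failed"
# prints: found=0 missing=1 failed=2

# Typical use in a script: branch on the status.
if grep -q 'ERROR' "$log"; then
    echo "errors present"
fi
rm -f "$log"
```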
2. grep regular expression meta-character set (base set)
^
Anchors the beginning of a line, for example: '^grep' matches all lines that begin with grep.
$
Anchors the end of a line, for example: 'grep$' matches all lines that end with grep.
.
Matches any single character other than a newline, for example: 'gr.p' matches gr followed by any one character, followed by p.
*
Matches zero or more of the preceding character, for example: ' *grep' matches all lines with zero or more spaces followed by grep. Used together, .* represents any sequence of characters.
[]
Matches one character from the specified set or range, for example: '[Gg]rep' matches Grep and grep.
[^]
Matches one character that is not in the specified set or range, for example: '[^a-fh-z]rep' matches lines containing a character that is not one of the letters a-f or h-z, followed by rep.
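\(..\)
Marks the matched characters, for example: in '\(love\)', love is marked as group 1 and can be referenced later as \1.
\<
Anchors the beginning of a word, for example: '\<grep' matches lines containing a word that begins with grep.
\>
Anchors the end of a word, for example: 'grep\>' matches lines containing a word that ends with grep.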
x\{m\}
Repeats character x exactly m times, for example: 'o\{5\}' matches lines containing 5 consecutive o's.
x\{m,\}
Repeats character x at least m times, for example: 'o\{5,\}' matches lines containing at least 5 consecutive o's.
x\{m,n\}
Repeats character x at least m times and no more than n times, for example: 'o\{5,10\}' matches lines containing 5 to 10 consecutive o's.
\w
Matches a word character (a letter, a digit, or the underscore), that is, [A-Za-z0-9_], for example: 'G\w*p' matches G followed by zero or more word characters, followed by p.
\W
The inverse of \w: matches a single non-word character, such as a period.
\b
Word boundary, for example: '\bgrep\b' matches only the whole word grep.
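The following sketch tries several of the basic metacharacters above on a small made-up word list. The file contents and variable names are illustrative only, and \< and \> are assumed to be available as in GNU grep.

```shell
# Hypothetical word list, one entry per line.
words=$(mktemp)
printf '%s\n' 'grep' 'Grep' 'egrep' 'grepping' 'group' 'grp' 'goooood' > "$words"

starts=$(grep '^grep' "$words")       # lines beginning with grep: grep, grepping
ends=$(grep 'grep$' "$words")         # lines ending with grep: grep, egrep
either=$(grep '^[Gg]rep$' "$words")   # bracket expression: grep, Grep
whole=$(grep '\<grep\>' "$words")     # grep as a complete word: grep
five=$(grep 'o\{5\}' "$words")        # five consecutive o's: goooood

printf '%s\n' "$starts" "$ends" "$either" "$whole" "$five"
rm -f "$words"
```

Note that egrep and grepping are excluded by the word anchors because grep is only part of a longer word there.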
3. Meta-character extension set for egrep and grep -E
+
Matches one or more of the preceding character, for example: '[a-z]+able' matches one or more lowercase letters followed by able, as in loveable, enable, and disable.
?
Matches zero or one of the preceding character, for example: 'gr?p' matches g followed by zero or one r, followed by p, that is, gp or grp.
a|b|c
Matches a or b or c, for example: 'grep|sed' matches grep or sed.
()
Grouping, for example: 'love(able|rs)' matches loveable or lovers, and '(ov)+' matches one or more repetitions of ov.
x{m}, x{m,}, x{m,n}
Same function as x\{m\}, x\{m,\}, x\{m,n\}.
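The extended operators can be tried out as follows. This is a minimal sketch on invented sample data; the patterns are anchored with ^ and $ so that each one selects whole lines only.

```shell
# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' 'loveable' 'lovers' 'love' 'able' 'gp' 'grp' 'grrp' 'sed' > "$sample"

plus=$(grep -E '^[a-z]+able$' "$sample")        # loveable ("able" alone has no letter before it)
optional=$(grep -E '^gr?p$' "$sample")          # gp, grp
alternation=$(grep -E '^(grp|sed)$' "$sample")  # grp, sed
group=$(grep -E '^love(able|rs)$' "$sample")    # loveable, lovers
interval=$(grep -E '^gr{2}p$' "$sample")        # grrp

printf '%s\n' "$plus" "$optional" "$alternation" "$group" "$interval"
rm -f "$sample"
```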
4. POSIX character class
POSIX (the Portable Operating System Interface) adds special character classes, such as [:alnum:], to keep matching consistent across the character encodings used in different countries. They must be placed inside a bracket expression [] to form a regular expression, so the equivalent of [A-Za-z0-9] is [[:alnum:]]. On Linux, grep supports the POSIX character classes, except in fgrep mode.
[:alnum:]
Alphanumeric characters
[:alpha:]
Alphabetic characters
[:digit:]
Numeric characters
[:graph:]
Visible characters (excluding spaces and control characters)
[:lower:]
Lowercase characters
[:cntrl:]
Control characters
[:print:]
Printable characters (visible characters plus the space)
[:punct:]
Punctuation
[:space:]
All white-space characters (newlines, spaces, tabs)
[:upper:]
Uppercase characters
[:xdigit:]
Hexadecimal digits (0-9, a-f, A-F)
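A short sketch of the classes in use, on made-up data. Each class is written inside an outer bracket expression, and -E is used only so that + can express "one or more".

```shell
# Hypothetical sample data.
data=$(mktemp)
printf '%s\n' 'abc' 'ABC' '123' 'a1B2' '!?' '0xFF' > "$data"

digits=$(grep -E '^[[:digit:]]+$' "$data")   # 123
upper=$(grep -E '^[[:upper:]]+$' "$data")    # ABC
punct=$(grep -E '^[[:punct:]]+$' "$data")    # !?
hex=$(grep -E '^[[:xdigit:]]+$' "$data")     # abc, ABC, 123, a1B2 (0xFF contains an x)
alnum=$(grep -E '^[[:alnum:]]+$' "$data")    # every line except !?

printf '%s\n' "$digits" "$upper" "$punct" "$hex" "$alnum"
rm -f "$data"
```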
5. grep command Options
-NUM
Displays NUM lines of context above and below each matching line, for example: grep -2 pattern filename displays the 2 lines before and the 2 lines after each matching line.
-b,--byte-offset
Prints the byte offset of each matching line within the file before the line.
-c,--count
Prints only the number of matching lines and does not display the matching content.
-f FILE,--file=FILE
Reads the patterns from FILE, one per line. An empty file contains 0 patterns, so nothing matches.
-h,--no-filename
When searching for multiple files, the matching filename prefix is not displayed.
-i,--ignore-case
Ignores case differences.
-q,--quiet
Suppresses display and returns only the exit status. 0 indicates that a matching row was found.
-l,--files-with-matches
Print a list of files that match the template.
-L,--files-without-match
Prints a list of the files that do not match the pattern.
-n,--line-number
Print the line number before the matching line.
-s,--no-messages
Does not display error messages about nonexistent or unreadable files.
-v,--invert-match
Inverts the search, showing only lines that do not match.
-w,--word-regexp
Searches for the expression as a whole word, as if it were enclosed in \< and \>.
-V,--version
Displays software version information.
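The options above can be compared side by side with a sketch like the one below. The directory, file names, and contents are invented for the demonstration, and GNU grep is assumed.

```shell
# Hypothetical files in a temporary directory.
dir=$(mktemp -d)
printf '%s\n' 'alpha' 'Beta test' 'gamma' 'test case' 'retest' > "$dir/a.txt"
printf '%s\n' 'nothing here' > "$dir/b.txt"

count=$(grep -c 'test' "$dir/a.txt")       # 3
numbered=$(grep -n 'gamma' "$dir/a.txt")   # 3:gamma
offset=$(grep -b 'gamma' "$dir/a.txt")     # 16:gamma (6 + 10 bytes precede the line)
nocase=$(grep -i 'beta' "$dir/a.txt")      # Beta test
inverted=$(grep -v 'test' "$dir/a.txt")    # alpha, gamma
whole=$(grep -w 'test' "$dir/a.txt")       # Beta test, test case (not retest)
context=$(grep -1 'gamma' "$dir/a.txt")    # Beta test, gamma, test case

# -l and -L list file names; cd in a subshell keeps the names short.
with=$(cd "$dir" && grep -l 'test' a.txt b.txt)     # a.txt
without=$(cd "$dir" && grep -L 'test' a.txt b.txt)  # b.txt

echo "count=$count numbered=$numbered offset=$offset with=$with without=$without"
rm -rf "$dir"
```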
6. Examples
Using grep well really comes down to writing good regular expressions, so this section does not demonstrate every grep feature. It gives only a few examples that illustrate how regular expressions are written.
$ ls -l | grep '^a'
Filters the output of ls -l through a pipe, showing only the lines that start with a.
$ grep 'test' d*
Displays the lines that contain test in all files whose names start with d.
$ grep 'test' aa bb cc
Displays the lines that match test in the files aa, bb, and cc.
$ grep '[a-z]\{5\}' aa
Displays all lines that contain a string of at least 5 consecutive lowercase characters.
$ grep 'w\(es\)t.*\1' aa
If west is matched, es is stored in memory and labeled 1; the pattern then matches any characters (.*) followed by another es (\1), and the line is displayed. With egrep or grep -E, the parentheses do not need to be escaped with "\"; write 'w(es)t.*\1' directly.
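The last two examples can be reproduced with the sketch below. The contents written to the file are made up, and back-references in -E mode are assumed to behave as in GNU grep.

```shell
# Hypothetical contents for a file playing the role of "aa".
aa=$(mktemp)
printf '%s\n' 'the west coast faces west' 'west of here' 'abcde fgh' 'Hello' > "$aa"

# Basic syntax: the group is written \( \) and referenced as \1.
bre=$(grep 'w\(es\)t.*\1' "$aa")

# Extended syntax: the parentheses are not escaped, the back-reference is unchanged.
ere=$(grep -E 'w(es)t.*\1' "$aa")

# At least 5 consecutive lowercase characters.
five=$(grep '[a-z]\{5\}' "$aa")

printf 'bre: %s\nere: %s\nfive:\n%s\n' "$bre" "$ere" "$five"
rm -f "$aa"
```

The line "west of here" is not selected by the back-reference patterns because no second es follows west on that line.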