[GPL] grep-Basic-practice-advanced

Source: Internet
Author: User
Tags: character classes, grep, regular expression, posix, egrep

1. grep Introduction

grep (short for "globally search for a regular expression and print the matching lines") is a powerful text search tool. It uses regular expressions to search text and prints the lines that match. The UNIX grep family includes grep, egrep, and fgrep.

Basic

The egrep and fgrep commands differ only slightly from grep. egrep is an extension of grep that supports more regular expression metacharacters. fgrep stands for fixed grep or fast grep: it treats every character in the pattern literally, so regular expression metacharacters lose their special meaning and stand only for themselves.

Linux uses GNU grep. It is more powerful and provides the behavior of grep, egrep, and fgrep through the -G, -E, and -F command-line options.

grep works like this: it searches for a pattern in one or more files. If the pattern contains spaces, it must be quoted. All arguments after the pattern are treated as file names. The search results are written to the screen, and the contents of the original files are not affected.

grep can be used in shell scripts because it returns an exit status that indicates the result of the search. If the pattern is found, 0 is returned. If the pattern is not found, 1 is returned. If a searched file does not exist, 2 is returned.

We can use these return values to automate text processing.
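The exit statuses described above can be checked directly in a script. The following is a minimal sketch, assuming a POSIX shell and GNU grep; the temporary file and its contents are invented for illustration.

```shell
#!/bin/sh
# Sketch: branch on grep's exit status (0 = match, 1 = no match, 2 = error).
# The sample file and its contents are hypothetical.
tmp=$(mktemp)
printf 'alpha\nbeta\n' > "$tmp"

found=0
grep -q 'beta' "$tmp" || found=$?        # stays 0: a line matched

missing=0
grep -q 'gamma' "$tmp" || missing=$?     # becomes 1: no line matched

error=0
grep -q 'beta' "$tmp.does-not-exist" 2>/dev/null || error=$?   # becomes 2: file cannot be read

echo "found=$found missing=$missing error=$error"

# Typical use: act only when the pattern is present.
if grep -q 'beta' "$tmp"; then
    echo "beta is present"
fi

rm -f "$tmp"
```

The -q option suppresses output so that only the status is used, which is usually what a script wants.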

2. grep Regular Expression metacharacter set (basic set)

^ Anchors the start of a line. For example, '^grep' matches all lines starting with grep.

$ Anchors the end of a line. For example, 'grep$' matches all lines ending with grep.

. Matches any single character other than a newline. For example, 'gr.p' matches gr followed by any one character, followed by p.

* Matches zero or more of the preceding character. For example, ' *grep' matches all lines with zero or more spaces followed by grep.

.* Together, these match any sequence of characters.

[] Matches one character from the specified set or range. For example, '[Gg]rep' matches Grep and grep.

[^] Matches one character that is not in the specified set or range. For example, '[^A-FH-Z]rep' matches any character other than A-F and H-Z, followed by rep.

\(..\) Marks the matched characters for later reference. For example, in '\(love\)', love is marked as group 1 and can be referred to as \1.

\< Anchors the start of a word. For example, '\<grep' matches lines that contain a word starting with grep.

\> Anchors the end of a word. For example, 'grep\>' matches lines that contain a word ending with grep.

x\{m\} Repeats the character x exactly m times. For example, 'o\{5\}' matches lines containing 5 consecutive o characters.

x\{m,\} Repeats the character x at least m times. For example, 'o\{5,\}' matches lines containing at least 5 consecutive o characters.

x\{m,n\} Repeats the character x at least m times and at most n times. For example, 'o\{5,10\}' matches lines containing 5 to 10 consecutive o characters.

\w Matches a word character, that is, a letter or digit such as [A-Za-z0-9] (GNU grep also counts the underscore). For example, 'G\w*p' matches G followed by zero or more word characters, then p.

\W The inverse of \w. It matches one non-word character, such as a period or other punctuation mark.

\b Word boundary. For example, '\bgrep\b' matches grep only as a whole word.
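The basic metacharacters listed above can be tried out on a small sample file. This is a sketch, assuming GNU grep (the \< and \> anchors are extensions); the sample lines are invented, and each variable holds the number of matching lines reported by -c.

```shell
#!/bin/sh
# Sketch: basic (BRE) metacharacters on invented sample lines.
LC_ALL=C; export LC_ALL               # keep range and class semantics predictable
tmp=$(mktemp)
printf '%s\n' 'grep is a tool' 'I use grep' 'egrep too' 'grap' 'foooood' 'Grep' > "$tmp"

starts=$(grep -c '^grep' "$tmp")        # lines that start with grep
ends=$(grep -c 'grep$' "$tmp")          # lines that end with grep
any_char=$(grep -c 'gr.p' "$tmp")       # gr, any one character, then p
either_case=$(grep -c '[Gg]rep' "$tmp") # Grep or grep
whole_word=$(grep -c '\<grep\>' "$tmp") # grep as a complete word
five_o=$(grep -c 'o\{5\}' "$tmp")       # five consecutive o characters

echo "starts=$starts ends=$ends any_char=$any_char"
echo "either_case=$either_case whole_word=$whole_word five_o=$five_o"

rm -f "$tmp"
```

Replacing -c with -n prints the matching lines with their line numbers, which makes it easier to see why each pattern matched.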

3. Extended metacharacter set for egrep and grep -E

+ Matches one or more of the preceding character. For example, '[a-z]+able' matches one or more lowercase letters followed by able, such as loveable, enable, and disable.

? Matches zero or one of the preceding character. For example, 'gr?p' matches g followed by an optional r, then p, that is, gp or grp.

a|b|c Matches a, b, or c. For example, 'grep|sed' matches grep or sed.

() Grouping symbols. For example, 'love(able|rs)ov+' matches loveable or lovers, followed by o and one or more v characters.

x{m}, x{m,}, and x{m,n} act the same as x\{m\}, x\{m,\}, and x\{m,n\}.
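The extended metacharacters can be exercised the same way with grep -E. This is a sketch on invented sample lines, assuming a grep that supports the -E option; each variable holds a count of matching lines.

```shell
#!/bin/sh
# Sketch: extended (ERE) metacharacters with grep -E on invented sample lines.
LC_ALL=C; export LC_ALL
tmp=$(mktemp)
printf '%s\n' 'loveable' 'enable' 'able' 'gp' 'grp' 'grrp' 'sed and awk' 'loversovv' > "$tmp"

plus=$(grep -Ec '[a-z]+able' "$tmp")          # at least one letter before able
optional=$(grep -Ec '^gr?p$' "$tmp")          # gp or grp, but not grrp
either=$(grep -Ec 'grep|sed' "$tmp")          # lines containing grep or sed
grouped=$(grep -Ec 'love(able|rs)ov+' "$tmp") # loveable/lovers, then o and one or more v
braces=$(grep -Ec 'r{2}' "$tmp")              # two consecutive r characters, no backslashes needed

echo "plus=$plus optional=$optional either=$either grouped=$grouped braces=$braces"

rm -f "$tmp"
```

Note that the line able on its own does not match '[a-z]+able', because + requires at least one preceding letter.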

4. POSIX character class

POSIX (the Portable Operating System Interface) adds special character classes so that character sets stay consistent across the character encodings used in different countries. For example, [:alnum:] is another way of writing A-Za-z0-9. A class must be placed inside [] to form a regular expression, as in [A-Za-z0-9] or [[:alnum:]]. On Linux, grep supports POSIX character classes, except for fgrep.

[:alnum:] alphanumeric characters

[:alpha:] alphabetic characters

[:digit:] digit characters

[:graph:] visible characters (not spaces or control characters)

[:lower:] lowercase characters

[:cntrl:] control characters

[:print:] printable characters (including the space)

[:punct:] punctuation characters

[:space:] all whitespace characters (newlines, spaces, tabs)

[:upper:] uppercase characters

[:xdigit:] hexadecimal digits (0-9, a-f, A-F)
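A short sketch of the character classes in use, assuming a POSIX-conforming grep. The sample lines are invented; note the double brackets, since the class name only has meaning inside a bracket expression.

```shell
#!/bin/sh
# Sketch: POSIX character classes inside bracket expressions, on invented sample lines.
LC_ALL=C; export LC_ALL
tmp=$(mktemp)
printf '%s\n' 'abc123' 'ABC' '   ' 'hello, world' 'deadBEEF' > "$tmp"

with_digit=$(grep -c '[[:digit:]]' "$tmp")             # lines containing a digit
upper_only=$(grep -c '^[[:upper:]][[:upper:]]*$' "$tmp") # lines made only of uppercase letters
with_punct=$(grep -c '[[:punct:]]' "$tmp")             # lines containing punctuation
blank_only=$(grep -c '^[[:space:]]*$' "$tmp")          # empty or whitespace-only lines
hex_only=$(grep -c '^[[:xdigit:]]*$' "$tmp")           # lines made only of hex digits

echo "with_digit=$with_digit upper_only=$upper_only with_punct=$with_punct"
echo "blank_only=$blank_only hex_only=$hex_only"

rm -f "$tmp"
```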

5. grep Command Options

-? Also displays ? lines of context above and below each matching line, where ? is a number. For example, grep -2 pattern filename also displays the 2 lines above and below each matching line.

-a, --text Treats binary files as text. This is useful when grep reports "Binary file (standard input) matches".

-b, --byte-offset Prints the byte offset of each matching line before the line.

-c, --count Prints only the number of matching lines, not the matching content.

-f file, --file=file Reads the patterns from a file. An empty file contains zero patterns, so nothing matches.

-h, --no-filename When multiple files are searched, does not display the file name prefix on matches.

-i, --ignore-case Ignores case differences.

-q, --quiet Displays nothing and only returns the exit status. 0 indicates that a matching line was found.

-l, --files-with-matches Prints the list of files that match the pattern.

-L, --files-without-match Prints the list of files that do not match the pattern.

-n, --line-number Prints the line number before each matching line.

-s, --no-messages Does not display error messages about nonexistent or unreadable files.

-v, --invert-match Inverts the search and displays only the lines that do not match.

-w, --word-regexp Searches for the expression as a whole word, as if it were enclosed in \< and \>.

-r, -R, --recursive Recursively reads all files in a directory, including its subdirectories. For example, grep -r 'pattern' test matches pattern in all files in test and its subdirectories. In recent versions of GNU grep, -R additionally follows symbolic links.

-V, --version Displays the software version.
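Several of the options above are easiest to understand side by side. The following sketch assumes GNU grep; the directory, file names, and contents are invented for illustration.

```shell
#!/bin/sh
# Sketch: common grep options on two invented files in a temporary directory.
dir=$(mktemp -d)
printf '%s\n' 'Test one' 'nothing here' 'test two' 'testing' > "$dir/a.txt"
printf '%s\n' 'no match' > "$dir/b.txt"

count=$(grep -c 'test' "$dir/a.txt")        # -c: number of matching lines
icount=$(grep -ic 'test' "$dir/a.txt")      # -i: ignore case, so "Test one" also counts
numbered=$(grep -n 'two' "$dir/a.txt")      # -n: prefix the match with its line number
inverted=$(grep -vc 'test' "$dir/a.txt")    # -v: count lines that do NOT match
whole=$(grep -wc 'test' "$dir/a.txt")       # -w: whole word only, so "testing" is excluded
files=$(grep -rl 'test' "$dir")             # -r -l: names of matching files under dir
context_lines=$(grep -1 'nothing' "$dir/a.txt" | grep -c '')  # -1: one line of context each side

echo "count=$count icount=$icount inverted=$inverted whole=$whole"
echo "numbered=$numbered"
echo "files=$files"
echo "context_lines=$context_lines"

rm -rf "$dir"
```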

6. Examples

To make good use of grep, you need to be able to write regular expressions. We therefore do not explain every grep feature here, and only list a few examples that show how to write a regular expression.

$ ls -l | grep '^a'
Filters the output of ls -l through a pipe and displays only the lines starting with a.

$ grep 'test' d*
Displays all lines containing test in files whose names start with d.

$ grep 'test' aa bb cc
Displays the lines matching test in the files aa, bb, and cc.

$ grep '[a-z]\{5\}' aa
Displays all lines containing a string of five consecutive lowercase characters.

$ grep 'w\(es\)t.*\1' aa
If west is matched, es is stored in memory and marked as 1. It is then followed by any number of characters (.*), which are followed by another es (\1). If such a line is found, it is displayed. If you use egrep or grep -E, you do not need to escape the parentheses with "\" and can write 'w(es)t.*\1' directly.
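The back-reference example can be run against a few invented lines to see which ones qualify. This sketch assumes GNU grep; back-references with -E are a GNU extension rather than something POSIX guarantees.

```shell
#!/bin/sh
# Sketch: the back-reference and interval examples on invented sample lines.
LC_ALL=C; export LC_ALL
tmp=$(mktemp)
printf '%s\n' 'west of the estuary' 'west only' 'go west, yes' > "$tmp"

# "west", then anything, then the captured "es" again.
bre=$(grep -c 'w\(es\)t.*\1' "$tmp")

# Same pattern in extended syntax: parentheses are not escaped.
ere=$(grep -Ec 'w(es)t.*\1' "$tmp")

# Lines containing at least five consecutive lowercase letters.
five_lower=$(grep -c '[a-z]\{5\}' "$tmp")

echo "bre=$bre ere=$ere five_lower=$five_lower"

rm -f "$tmp"
```

The line "west only" is not counted by the first two patterns because no second es follows west.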

7. Note

On some machines, you must use the -E parameter for alternation:

grep "A|B" matches lines containing the literal string "A|B".

grep -E "A|B" matches lines containing "A" or "B".

The description of the -E parameter in man grep is: "Treats each pattern specified as an extended regular expression (ERE). A null value for the ERE matches every line. Note: The grep command with the -E flag is the same as the egrep command, except that error and usage messages are different and the -s flag functions differently."
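The difference described in this note can be seen with a small sketch. The sample lines are invented; the behavior shown for the plain grep call is that of a basic regular expression, where | has no special meaning.

```shell
#!/bin/sh
# Sketch: "A|B" as a basic versus an extended regular expression.
tmp=$(mktemp)
printf '%s\n' 'only A here' 'only B here' 'literal A|B here' 'neither' > "$tmp"

basic=$(grep -c 'A|B' "$tmp")       # BRE: | is an ordinary character
extended=$(grep -Ec 'A|B' "$tmp")   # ERE: | means "or"

echo "basic=$basic extended=$extended"

rm -f "$tmp"
```

Single quotes are used here so that the shell does not interpret the | character itself.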

8. Extended commands

The egrep command searches a file for a pattern. It searches the input files (standard input by default) for lines that match the pattern specified by the Pattern parameter. These patterns are full regular expressions, as in the ed command (except for \ (backslash) and \\ (double backslash)). The following rules also apply to the egrep command:

* A regular expression followed by a + (plus sign) matches one or more occurrences of the regular expression.
* A regular expression followed by a ? (question mark) matches zero or one occurrence of the regular expression.
* Multiple regular expressions separated by a | (vertical bar) or by a newline match strings that are matched by any of the regular expressions.
* A regular expression can be enclosed in () (parentheses) for grouping.

The newline character is not matched by regular expressions. The order of precedence of the operators is [, ], *, ?, +, concatenation, | and the newline character.

Note: The egrep command is the same as the grep command with the -E flag, except that error and usage messages are different and the -s flag functions differently.

If you specify more than one File parameter, the egrep command displays the name of the file containing each matched line. Characters with special meaning to the shell ($, *, [, |, ^, (, ), \) must be quoted when they appear in the Pattern parameter. If the Pattern parameter is not a simple string, you must enclose the entire pattern in single quotation marks. In an expression such as [a-z], the minus sign specifies a range according to the current collating sequence. A collating sequence can define equivalence classes for use in character ranges. The egrep command uses a fast, deterministic algorithm that sometimes needs exponential space.

The fgrep command searches a file for a literal string. It searches the input files specified by the File parameter (standard input by default) for lines matching a pattern. The fgrep command searches specifically for Pattern parameters that are fixed strings. If you specify more than one file in the File parameter, the fgrep command displays the name of the file containing each matched line.

The fgrep command differs from the grep and egrep commands because it searches for a string instead of a pattern that matches an expression. The fgrep command uses a fast and compact algorithm. The $, *, [, |, (, ), and \ characters are interpreted literally by the fgrep command. They are not interpreted as parts of a regular expression, as they are in the grep and egrep commands. Because these characters have special meaning to the shell, the entire string should be enclosed in single quotation marks ('...').

If no file is specified, the fgrep command assumes standard input. Normally, each line found is copied to standard output. If there is more than one input file, the file name is printed before each line found.
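The literal matching described for fgrep can be compared with ordinary pattern matching in a short sketch. It uses grep -F, which the introduction describes as the GNU grep equivalent of fgrep; the sample lines are invented.

```shell
#!/bin/sh
# Sketch: fixed-string search (grep -F) versus regular expression search.
tmp=$(mktemp)
printf '%s\n' 'price: $5.00' 'price: 5500' 'a.*b' 'axxb' > "$tmp"

fixed_star=$(grep -Fc 'a.*b' "$tmp")   # only the line that literally contains a.*b
regex_star=$(grep -c 'a.*b' "$tmp")    # a, then anything, then b

fixed_dot=$(grep -Fc '5.0' "$tmp")     # the dot must be a real dot
regex_dot=$(grep -c '5.0' "$tmp")      # the dot matches any character, so 550 also matches

echo "fixed_star=$fixed_star regex_star=$regex_star"
echo "fixed_dot=$fixed_dot regex_dot=$regex_dot"

rm -f "$tmp"
```

Fixed-string search is a reasonable default when the search text comes from data, such as a price or a file path, and is not meant to be a pattern.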
