grep learning notes
Table of contents
- 1. grep Introduction
- 2. grep Regular Expression metacharacter set (basic set)
- 3. Metacharacter extension set for egrep and grep -E
- 4. POSIX character class
- 5. grep Command Options
- 6. Examples
Using 'grep' to search text files
If you want to find a string in several text files, you can use the 'grep' command. 'grep' searches the text for the specified string.
Suppose you are looking for files that contain the string 'magic' in the '/usr/src/linux/Documentation' directory:
$ grep magic /usr/src/linux/Documentation/*
sysrq.txt:* How do I enable the magic SysRq key?
sysrq.txt:* How do I use the magic SysRq key?
The file 'sysrq.txt' contains this string; it discusses the SysRq feature.
By default, 'grep' searches only the files it is given and does not descend into subdirectories. If the directory contains subdirectories, 'grep' reports each of them like this:
grep: sound: Is a directory
This can make the output of 'grep' hard to read. There are two solutions:
- Search the subdirectories as well: grep -r
- Or ignore the subdirectories: grep -d skip
Of course, if you expect a lot of output, you can pipe it to 'less' and read it there:
$ grep magic /usr/src/linux/Documentation/* | less
This makes the output more convenient to read.
Note that you must supply a file pattern (use * to search all files). If you forget it, 'grep' reads from standard input and waits until it is interrupted. If this happens, press <Ctrl-C> and try again.
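The subdirectory handling described above can be tried out with a rough sketch like the following. It assumes GNU grep; the temporary directory and the file names in it are made up for illustration.

```shell
# Hypothetical sandbox: a temporary directory standing in for a documentation tree.
dir=$(mktemp -d)
mkdir "$dir/sound"
printf 'How do I enable the magic key?\n' > "$dir/sysrq.txt"
printf 'nothing relevant here\n' > "$dir/other.txt"
printf 'more magic in a subdirectory\n' > "$dir/sound/notes.txt"

# Skip directories silently: only the top-level files are searched.
top_only=$(grep -d skip magic "$dir"/*)

# Recurse into subdirectories: sound/notes.txt is searched as well.
recursive=$(grep -r magic "$dir" | sort)

printf 'top level only:\n%s\n' "$top_only"
printf 'recursive:\n%s\n' "$recursive"

rm -rf "$dir"
```

Without either option, the same search would still find the top-level match but would also complain that 'sound' is a directory.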
The following are some interesting command line parameters:
- grep -i pattern files: searches case-insensitively (the search is case-sensitive by default),
- grep -l pattern files: lists only the names of files that contain a match,
- grep -L pattern files: lists the names of files that contain no match,
- grep -w pattern files: matches whole words only, not parts of a string (for example, matches 'magic' but not 'magical'),
- grep -C number pattern files: shows [number] lines of context around each match,
- grep -E 'pattern1|pattern2' files: displays the lines matching pattern1 or pattern2,
- grep pattern1 files | grep pattern2: displays the lines that match both pattern1 and pattern2.
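A small sketch of how these parameters behave, assuming GNU grep. The three sample files are invented for the demonstration and live in a temporary directory.

```shell
# Hypothetical sample files in a temporary directory.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'Magic tricks\nnothing here\n' > a.txt
printf 'a magical evening\nthe magic key\n' > b.txt
printf 'plain text only\n' > c.txt

insensitive=$(grep -i magic a.txt)        # ignores case: finds "Magic tricks"
with=$(grep -l magic *.txt)               # names of files that contain a match
without=$(grep -L magic *.txt)            # names of files that contain no match
whole=$(grep -w magic b.txt)              # whole word only: skips "magical"
context=$(grep -C 1 key b.txt)            # one line of context around the match
either=$(grep -E 'tricks|evening' *.txt)  # lines matching either pattern
both=$(grep magic b.txt | grep key)       # lines matching both patterns

printf '%s\n' "$insensitive" "$with" "$without" "$whole" "$context" "$either" "$both"

cd "$OLDPWD" && rm -rf "$dir"
```

Note that the plain search for magic is case-sensitive, so a.txt (which only contains 'Magic') is reported by -L rather than -l.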
Here are some special symbols used in search patterns:
- '\<' and '\>' mark the start and the end of a word, respectively.
For example:
- grep man * matches 'Batman', 'manic', 'man', etc.,
- grep '\<man' * matches 'manic' and 'man', but not 'Batman',
- grep '\<man\>' * matches only 'man', not other strings such as 'Batman' or 'manic'.
- '^': matches the start of a line,
- '$': matches the end of a line.
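The word and line anchors above can be checked against a few sample lines. This is only an illustrative sketch: the word list is made up, and '\<' and '\>' are assumed to be available as in GNU grep.

```shell
# Sample input, one entry per line (invented for the demonstration).
words='Batman
manic
man
a man walks'

anywhere=$(printf '%s\n' "$words" | grep 'man')        # every line contains "man"
word_start=$(printf '%s\n' "$words" | grep '\<man')    # a word starts with "man"
whole_word=$(printf '%s\n' "$words" | grep '\<man\>')  # "man" as a complete word
line_start=$(printf '%s\n' "$words" | grep '^man')     # line begins with "man"
line_end=$(printf '%s\n' "$words" | grep 'man$')       # line ends with "man"

printf 'whole word:\n%s\n' "$whole_word"
```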
If you are not comfortable with command-line parameters, you can try a graphical front end to 'grep', such as rexgrep. It offers AND, OR, NOT, and other syntax, along with attractive buttons. If you only need clearer output, try fungrep.
1. grep Introduction
grep (global search regular expression (RE) and print out the line) is a powerful text search tool that uses regular expressions to search text and print the matching lines. The UNIX grep family includes grep, egrep, and fgrep, whose commands differ only slightly from one another. egrep is an extension of grep that supports more RE metacharacters. fgrep stands for fixed grep or fast grep: it treats every character in the pattern literally, so regular-expression metacharacters lose their special meaning and match only themselves. Linux uses GNU grep, which is more powerful and selects basic grep, egrep, or fgrep behavior through the -G, -E, and -F command-line options, respectively.
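To make the difference between the three modes concrete, here is a minimal sketch that runs the same kind of pattern through -F, -G, and -E. It assumes GNU grep, and the sample lines are invented.

```shell
# Sample input containing regex metacharacters as ordinary text.
sample='a.c
abc
a+b
aab'

fixed=$(printf '%s\n' "$sample" | grep -F 'a.c')        # literal string "a.c" only
basic=$(printf '%s\n' "$sample" | grep -G 'a.c')        # "." matches any character
extended=$(printf '%s\n' "$sample" | grep -E 'a+b')     # "+" means one or more "a"
fixed_plus=$(printf '%s\n' "$sample" | grep -F 'a+b')   # literal string "a+b" only

printf 'fixed: %s\n' "$fixed"
printf 'basic:\n%s\n' "$basic"
printf 'extended:\n%s\n' "$extended"
```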
grep works like this: it searches one or more files for a string pattern. If the pattern contains spaces, it must be quoted. All strings after the pattern are treated as file names. The search results are sent to the screen, and the content of the original files is not affected.
grep can be used in shell scripts because it returns an exit status that indicates the outcome of the search: 0 if the pattern was found, 1 if it was not found, and 2 if a searched file does not exist. These return values can be used to automate text processing.
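The exit statuses described above can be observed with a short script such as the one below. It is only a sketch: the temporary file and its contents are made up for the demonstration.

```shell
# Hypothetical input file with two lines.
file=$(mktemp)
printf 'alpha\nbeta\n' > "$file"

# Record the exit status of each search without aborting the script.
found=0;   grep -q 'alpha' "$file" || found=$?                        # 0: a line matched
missing=0; grep -q 'gamma' "$file" || missing=$?                      # 1: no line matched
error=0;   grep -q 'alpha' "$file.does-not-exist" 2>/dev/null || error=$?  # 2: file not readable

printf 'found=%s missing=%s error=%s\n' "$found" "$missing" "$error"

# Typical use in a script: branch on the result of the search.
if grep -q 'beta' "$file"; then
    echo "beta is present"
fi

rm -f "$file"
```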
2. grep Regular Expression metacharacter set (basic set)
- ^
  Anchors the start of a line. For example, '^grep' matches all lines starting with grep.
- $
  Anchors the end of a line. For example, 'grep$' matches all lines ending with grep.
- .
  Matches any single character except a newline. For example, 'gr.p' matches gr followed by any one character and then p.
- *
  Matches zero or more of the preceding character. For example, ' *grep' matches lines containing zero or more spaces followed by grep. '.*' stands for any sequence of characters.
- []
  Matches one character from the specified set or range. For example, '[Gg]rep' matches Grep and grep.
- [^]
  Matches one character that is not in the specified set or range. For example, '[^A-FH-Z]rep' matches lines containing a character other than A-F and H-Z, followed by rep.
- \(..\)
  Marks the matched characters. For example, in '\(love\)', love is marked as 1 and can be referenced later as \1.
- \<
  Anchors the start of a word. For example, '\<grep' matches lines containing a word that starts with grep.
- \>
  Anchors the end of a word. For example, 'grep\>' matches lines containing a word that ends with grep.
- x\{m\}
  Repeats the character x m times. For example, 'o\{5\}' matches lines containing five consecutive o's.
- x\{m,\}
  Repeats the character x at least m times. For example, 'o\{5,\}' matches lines containing at least five consecutive o's.
- x\{m,n\}
  Repeats the character x at least m times and at most n times. For example, 'o\{5,10\}' matches lines containing 5 to 10 consecutive o's.
- \w
  Matches a word character (a letter, a digit, or an underscore), that is, [A-Za-z0-9_]. For example, 'G\w*p' matches G followed by zero or more word characters and then p.
- \W
  The inverse of \w: matches a non-word character, such as a period or a comma.
- \b
  Word boundary anchor. For example, '\bgrep\b' matches grep only as a whole word.
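A few of the basic metacharacters can be exercised against sample input as in the sketch below. The sample lines are invented, and '\b' and '\w' are assumed to behave as in GNU grep.

```shell
# Sample input, one entry per line (made up for illustration).
sample='grep
egrep
Grep
goooood
the grep tool
gp'

starts=$(printf '%s\n' "$sample" | grep '^grep')       # lines starting with grep
ends=$(printf '%s\n' "$sample" | grep 'grep$')         # lines ending with grep
any_char=$(printf '%s\n' "$sample" | grep 'gr.p')      # gr, any one character, p
either_g=$(printf '%s\n' "$sample" | grep '[Gg]rep')   # Grep or grep
five_o=$(printf '%s\n' "$sample" | grep 'o\{5\}')      # five consecutive o's
boundary=$(printf '%s\n' "$sample" | grep '\bgrep\b')  # grep as a whole word
word_chars=$(printf '%s\n' "$sample" | grep 'g\w*p')   # g, word characters, p

printf 'whole word grep:\n%s\n' "$boundary"
```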
3. Metacharacter extension set for egrep and grep -E
- +
  Matches one or more of the preceding character. For example, '[a-z]+able' matches one or more lowercase letters followed by able, as in loveable, enable, and disable.
- ?
  Matches zero or one of the preceding character. For example, 'gr?p' matches g, followed by an optional r, followed by p.
- a|b|c
  Matches a, b, or c. For example, 'grep|sed' matches grep or sed.
- ()
  Grouping. For example, 'love(able|rs)' matches loveable or lovers, and '(ov)+' matches one or more occurrences of ov.
- x{m}, x{m,}, x{m,n}
  Same as x\{m\}, x\{m,\}, and x\{m,n\} in the basic set.
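The extended metacharacters can be tried with grep -E as in the following sketch. The word list is invented for the demonstration.

```shell
# Sample input, one word per line (made up for illustration).
words='loveable
lovers
love
enable
grp
gp
sed'

plus=$(printf '%s\n' "$words" | grep -E '[a-z]+able')       # letters followed by able
optional=$(printf '%s\n' "$words" | grep -E '^gr?p$')       # g, optional r, p
alternation=$(printf '%s\n' "$words" | grep -E 'grp|sed')   # either alternative
group=$(printf '%s\n' "$words" | grep -E 'love(able|rs)')   # loveable or lovers
interval=$(printf '%s\n' "$words" | grep -E '^[a-z]{2,3}$') # words of 2 or 3 letters

printf 'grouped alternatives:\n%s\n' "$group"
```

With plain grep instead of grep -E, the same patterns would need backslashes in front of the braces and parentheses.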
4. POSIX character class
POSIX (the Portable Operating System Interface) adds special character classes so that patterns behave consistently across the character encodings used in different countries. For example, [:alnum:] is another way of writing A-Za-z0-9. A class must be placed inside [] to form a regular expression, as in [A-Za-z0-9] or [[:alnum:]]. On Linux, grep and egrep support POSIX character classes; fgrep does not, because it treats the pattern literally.
- [:alnum:]
  Alphanumeric characters
- [:alpha:]
  Alphabetic characters
- [:digit:]
  Numeric characters
- [:graph:]
  Non-blank characters (excluding spaces and control characters)
- [:lower:]
  Lowercase characters
- [:cntrl:]
  Control characters
- [:print:]
  Printable characters (like [:graph:], but including the space)
- [:punct:]
  Punctuation marks
- [:space:]
  All whitespace characters (newline, space, tab)
- [:upper:]
  Uppercase characters
- [:xdigit:]
  Hexadecimal digits (0-9, a-f, A-F)
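As an illustration, the sketch below uses several of these classes to pick out lines that consist entirely of one kind of character. The sample lines are invented, and the results assume an ASCII-compatible locale.

```shell
# Sample input, one entry per line (made up for illustration).
sample='abc123
HELLO
hello world
42
#!?
cafe'

digits=$(printf '%s\n' "$sample" | grep '^[[:digit:]][[:digit:]]*$')    # digits only
upper=$(printf '%s\n' "$sample" | grep '^[[:upper:]][[:upper:]]*$')     # uppercase only
spaced=$(printf '%s\n' "$sample" | grep '[[:space:]]')                  # contains whitespace
punct=$(printf '%s\n' "$sample" | grep '^[[:punct:]][[:punct:]]*$')     # punctuation only
alnum=$(printf '%s\n' "$sample" | grep '^[[:alnum:]][[:alnum:]]*$')     # letters and digits only
hex=$(printf '%s\n' "$sample" | grep '^[[:xdigit:]][[:xdigit:]]*$')     # hexadecimal digits only

printf 'hexadecimal only:\n%s\n' "$hex"
```

Note the double brackets: the outer pair is the bracket expression, the inner pair belongs to the class name.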
5. grep Command Options
- -?
  Also displays ? lines of context above and below each matching line, where ? is a number. For example, grep -2 pattern filename displays the two lines above and the two lines below each matching line.
- -b, --byte-offset
  Prints the byte offset of each matching line within the file in front of the line.
- -c, --count
  Prints only the number of matching lines, not the matching content.
- -f FILE, --file=FILE
  Reads the patterns from FILE. An empty file contains zero patterns and therefore matches nothing.
- -h, --no-filename
  Does not prefix matching lines with the file name when multiple files are searched.
- -i, --ignore-case
  Ignores case differences.
- -q, --quiet
  Suppresses output and returns only the exit status; 0 means a matching line was found.
- -l, --files-with-matches
  Prints the list of files that match the pattern.
- -L, --files-without-match
  Prints the list of files that do not match the pattern.
- -n, --line-number
  Prints the line number in front of each matching line.
- -s, --no-messages
  Does not display error messages about nonexistent or unreadable files.
- -v, --invert-match
  Inverts the search: only non-matching lines are displayed.
- -w, --word-regexp
  Searches for the expression as a whole word, as if it were enclosed in \< and \>.
- -V, --version
  Displays the software version information.
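Several of these options can be seen in action with the sketch below. It assumes GNU grep; the files aa, bb, and patterns are hypothetical and are created in a temporary directory.

```shell
# Hypothetical sample files in a temporary directory.
dir=$(mktemp -d)
printf 'test one\nskip me\nTest two\ntest three\n' > "$dir/aa"
printf 'no match here\n' > "$dir/bb"
printf 'test\nthree\n' > "$dir/patterns"

count=$(grep -c 'test' "$dir/aa")                   # number of matching lines
nocase=$(grep -c -i 'test' "$dir/aa")               # same, ignoring case
numbered=$(grep -n 'test' "$dir/aa")                # matches prefixed by line number
inverted=$(grep -v 'test' "$dir/aa")                # lines that do not match
from_file=$(grep -c -f "$dir/patterns" "$dir/aa")   # patterns read from a file
bare=$(grep -h 'test' "$dir/aa" "$dir/bb")          # no file name prefix
offsets=$(grep -b 'skip' "$dir/aa")                 # byte offset of the matching line
context=$(grep -1 'skip' "$dir/aa")                 # one line of context on each side

printf 'count=%s nocase=%s from_file=%s\n' "$count" "$nocase" "$from_file"
printf '%s\n' "$numbered" "$offsets"

rm -rf "$dir"
```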
6. Examples
To make good use of grep, you need to be able to write regular expressions. We therefore do not cover every feature of grep here and only list a few examples that show how to write a regular expression.
- $ ls -l | grep '^a'
  Filters the output of ls -l through a pipe and displays only the lines starting with a.
- $ grep 'test' d*
  Displays all lines containing test in the files whose names start with d.
- $ grep 'test' aa bb cc
  Displays the lines matching test in the files aa, bb, and cc.
- $ grep '[a-z]\{5\}' aa
  Displays all lines in aa that contain a string of at least five consecutive lowercase characters.
- $ grep 'w\(es\)t.*\1' aa
  If west is matched, es is stored in memory and marked as 1. The search then continues through any number of characters (.*) followed by another es (\1). If this is found, the line is displayed. With egrep or grep -E, the parentheses do not need to be escaped with a backslash, so the pattern can be written directly as 'w(es)t.*\1'.
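The back-reference example can be reproduced with a sketch like the one below. The file contents are made up, and the grep -E form relies on GNU grep accepting back-references in extended patterns.

```shell
# Hypothetical input file standing in for aa.
file=$(mktemp)
printf 'northwest esteem\nwest coast\nwestern road\n' > "$file"

# Basic syntax: the group is written \( \) and referenced as \1.
basic=$(grep 'w\(es\)t.*\1' "$file")

# Extended syntax: the parentheses are unescaped, the reference is still \1.
extended=$(grep -E 'w(es)t.*\1' "$file")

printf 'basic: %s\nextended: %s\n' "$basic" "$extended"

rm -f "$file"
```

Only the first line matches, because it is the only one in which west is followed later by a second es.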