Linux grep command usage and regular expressions
1. Introduction to the grep command and regular expressions
(1) grep (Global search Regular Expression and Print out the line) is a powerful text search tool on Linux. It filters the target text against a pattern specified by the user and prints the lines that the pattern matches;
(2) A regular expression is a pattern written with a particular set of characters. Some of these characters do not stand for their literal meaning; instead they act as control characters or wildcards.
2. Basic syntax of the grep command
grep [OPTION]... 'PATTERN' FILE...
Common grep options:
-v: invert the match; print the lines that do not match the pattern.
-o: print only the matched part of each line.
-i: ignore case.
-n: prefix each matching line with its line number.
-E: use extended regular expressions; equivalent to the egrep command.
-F: treat the pattern as a fixed string instead of a regular expression; equivalent to the fgrep command.
-A #: also print the # lines after each matching line, where # is any number.
-B #: also print the # lines before each matching line, where # is any number.
-C #: also print the # lines before and after each matching line, where # is any number.
--color=auto: highlight the matched text in color.
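The options above can be tried on a small sample file. The following is a minimal sketch that assumes GNU grep; the file contents and the variable name sample are made up for illustration.

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'Apple pie' 'banana split' 'cherry tart' \
    'apple juice' 'version 1.0' 'version 1x0' > "$sample"

grep -n 'apple' "$sample"        # line number plus the matching line
grep -i 'apple' "$sample"        # ignore case: "Apple pie" and "apple juice"
grep -v 'apple' "$sample"        # every line that does not contain "apple"
grep -o 'an' "$sample"           # only the matched text, one match per output line
grep -E 'pie|tart' "$sample"     # extended syntax: alternation without backslashes
grep -F '1.0' "$sample"          # fixed string: the dot is literal
grep '1.0' "$sample"             # regular expression: the dot matches any character
grep -A 1 'cherry' "$sample"     # the match and one line after it
grep -B 1 'cherry' "$sample"     # the match and one line before it
grep -C 1 'cherry' "$sample"     # the match and one line on each side
grep --color=auto 'a' "$sample"  # highlight matches when writing to a terminal
```

Comparing the output of the -F call with the plain call on the same pattern shows most directly how a regular expression differs from a fixed string.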
3. Basic usage of grep Regular Expressions
Basic regular expression:
(1) Character matching
.: Match any single character
For example, the pattern r..t matches lines containing a string that starts with r and ends with t, with exactly two characters in between
[]: Match any single character in the specified set
Common ways to write a set include:
Digits: [[:digit:]] or [0-9]
Lowercase letters: [[:lower:]] or [a-z]
Uppercase letters: [[:upper:]] or [A-Z]
Uppercase and lowercase letters: [[:alpha:]] or [a-zA-Z]
Digits and letters: [[:alnum:]] or [0-9a-zA-Z]
Whitespace characters: [[:space:]]
Punctuation: [[:punct:]]
Example 1: match lines that contain the digit 0 or 2, using the pattern [02]
Example 2: match lines that contain the letter r or t, using the pattern [rt]
Example 3: match lines that contain a digit from 0 to 9, using the pattern [0-9]
[^]: Match any single character outside the specified set
For example, the pattern [^1-9] matches lines that contain at least one character outside the range 1-9
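The character-matching rules can be checked against a handful of test lines. This is only a sketch: the file contents and the variable name chars are invented for the example.

```shell
# Hypothetical test data for the character-matching metacharacters.
chars=$(mktemp)
trap 'rm -f "$chars"' EXIT
printf '%s\n' 'root' 'rat' 'r12t' 'abc' '2024' '789' > "$chars"

grep 'r..t' "$chars"         # r, any two characters, t: root and r12t (not rat)
grep '[02]' "$chars"         # contains the digit 0 or 2: r12t and 2024
grep '[rt]' "$chars"         # contains r or t: root, rat, r12t
grep '[[:digit:]]' "$chars"  # contains any digit: r12t, 2024, 789
grep '[^1-9]' "$chars"       # contains a character outside 1-9: every line except 789
```

The last command is worth a second look: 2024 matches [^1-9] because of its 0, whereas 789 consists only of characters inside the excluded range.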
(2) Repetition matching
*: Match the preceding character any number of times: zero, one, or more
For example, create a test file that contains the following content:
Match lines in which the letter x appears any number of times, using the pattern x*
\+: Match the preceding character one or more times
Example: match lines that contain at least one x, using the pattern x\+
\?: Match the preceding character zero times or once
For example, match lines in which the letter x appears zero times or once, using the pattern x\?
\{m\}: Match the preceding character exactly m times
Example: match lines in which the letter x appears twice in a row, using the pattern x\{2\}
\{m,n\}: Match the preceding character at least m times and at most n times; m and n define the range m-n
For example, match lines in which the letter x appears at least once and at most 3 times in a row, using the pattern x\{1,3\}
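The repetition operators are easiest to compare when the repeated character is fenced in by fixed characters, since an unanchored x* also matches lines with no x at all. The sketch below assumes GNU grep (\+ and \? are GNU extensions to basic regular expressions) and uses an invented test file, named reps here, in which the x run always sits between a and b.

```shell
# Hypothetical test data: zero to four x characters between a and b.
reps=$(mktemp)
trap 'rm -f "$reps"' EXIT
printf '%s\n' 'ab' 'axb' 'axxb' 'axxxb' 'axxxxb' > "$reps"

grep 'ax*b' "$reps"         # zero or more x: all five lines
grep 'ax\+b' "$reps"        # one or more x: every line except ab
grep 'ax\?b' "$reps"        # zero or one x: ab and axb
grep 'ax\{2\}b' "$reps"     # exactly two x: axxb
grep 'ax\{1,3\}b' "$reps"   # one to three x: axb, axxb, axxxb
```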
(3) Position anchoring
^: Anchor at the beginning of a line
For example, match lines that begin with the letter x, using the pattern ^x
$: Anchor at the end of a line
For example, match lines that end with the letter e, using the pattern e$
^$: Match empty lines
Example: match empty lines, using the pattern ^$
\<: Anchor at the beginning of a word
For example, match lines containing a word that begins with the two letters xy, using the pattern \<xy
\>: Anchor at the end of a word
For example, match lines containing a word that ends with the two letters xy, using the pattern xy\>
\<\>: Match a whole word
For example, match lines containing the word xy, using the pattern \<xy\>
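The anchors can be compared side by side on one file. This is a sketch under the assumption of GNU grep (which provides \< and \>); the test lines and the variable name anchors are made up.

```shell
# Hypothetical test data; the sixth line is intentionally empty.
anchors=$(mktemp)
trap 'rm -f "$anchors"' EXIT
printf '%s\n' 'xy starts here' 'ends with xy' 'oxygen' 'proxy' \
    'line' '' 'the xy word' > "$anchors"

grep '^x' "$anchors"       # lines beginning with x
grep 'e$' "$anchors"       # lines ending with e
grep -c '^$' "$anchors"    # number of empty lines
grep '\<xy' "$anchors"     # a word beginning with xy (not oxygen, not proxy)
grep 'xy\>' "$anchors"     # a word ending with xy (includes proxy)
grep '\<xy\>' "$anchors"   # xy as a whole word
```

The difference between the last three commands comes down to oxygen and proxy: neither contains xy as a whole word, and only proxy ends in it.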
(4) Grouping
\(\): Group characters so that they are matched as a single unit
For example, match lines that start with xy, taken as a single unit, zero times or once, using the pattern ^\(xy\)\?
Back reference: if \(\) is used for grouping in a pattern and, while a line of text is being checked, the group matches some content, that content can be referenced later in the pattern;
Groups are referenced with \1, \2, \3, and so on.
Counting the left parentheses in the pattern from left to right, \# refers to the content matched by the group that opens at the #-th left parenthesis and ends at its matching right parenthesis;
Back reference example:
Create a text file with the following content:
Find the lines in which a word that appears earlier in the line appears again later
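Grouping and back references can be sketched as follows. The file contents and the variable name groups are assumptions made for this illustration, not the file used in the original example, and the patterns assume GNU grep.

```shell
# Hypothetical test data for grouping and back references.
groups=$(mktemp)
trap 'rm -f "$groups"' EXIT
printf '%s\n' 'abab' 'ab' 'hello world hello' 'one two three' \
    'big big world' > "$groups"

# The group \(ab\) is repeated as a unit: only "abab" contains it twice in a row.
grep '\(ab\)\{2\}' "$groups"

# \1 must equal whatever the first group matched, so a whole word has to
# occur again later in the same line.
grep '\(\<[[:alpha:]]\+\>\).*\<\1\>' "$groups"
```

Note that \1 repeats the matched text, not the sub-pattern: "one two three" contains three words that each fit [[:alpha:]]\+, yet it does not match because no single word occurs twice.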
Basic regular expression metacharacters:
Character matching: ., [], [^]
Repetition matching: *, \?, \+, \{m\}, \{m,n\}
Position anchoring: ^, $, \<, \>, \<\>
Grouping: \(\)
4. egrep and extended regular expressions:
egrep is equivalent to grep -E. egrep uses extended regular expressions directly, whereas grep needs the -E option;
Metacharacters of extended regular expressions:
Character matching: ., [], [^]
Repetition matching: *, ?, +, {m}, {m,n}, {m,}, {0,n}
Position anchoring: ^, $, \<, \>
Grouping: (), with support for back references
|: Alternation; match lines that satisfy the pattern on the left or the one on the right. For example, with a|b, every line that contains a or b matches;
Example 1: egrep is equivalent to grep -E
Example 2:
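As a sketch of how the extended syntax compares with the basic one, the commands below run on an invented file (the variable name ere is hypothetical). grep -E is used instead of egrep because recent GNU grep releases warn that egrep is obsolescent; back references in extended expressions are assumed to be available, as they are in GNU grep.

```shell
# Hypothetical test data for extended regular expressions.
ere=$(mktemp)
trap 'rm -f "$ere"' EXIT
printf '%s\n' 'color' 'colour' 'cat' 'dog' 'catalog' 'aa bb aa' > "$ere"

grep -E 'colou?r' "$ere"       # extended: ? needs no backslash
grep 'colou\?r' "$ere"         # basic: the same match written with \?
grep -E 'cat|dog' "$ere"       # alternation: cat, dog, catalog
grep -E '^(cat|dog)$' "$ere"   # grouping limits the alternation: cat, dog
grep -E 'a{2}' "$ere"          # interval without backslashes
grep -E '(aa).*\1' "$ere"      # group and back reference
```

The fourth command shows why grouping matters with |: without the parentheses, ^cat|dog$ would mean "starts with cat" or "ends with dog".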
5. grep exercises:
(1) Display the lines in the /proc/meminfo file that start with an uppercase or lowercase s;
# grep -i '^s' /proc/meminfo
(2) Display the users in the /etc/passwd file whose default shell is not /sbin/nologin;
# grep -v '/sbin/nologin$' /etc/passwd | cut -d: -f1
(3) Display the users in the /etc/passwd file whose default shell is /bin/bash;
Further: display only the user with the largest ID among the preceding results.
# grep '/bin/bash$' /etc/passwd | sort -t: -k3,3 -n -r | head -n 1 | cut -d: -f1
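Exercises (2) and (3) can be tried without touching the real /etc/passwd. The sketch below uses a made-up file in passwd format (all user names and IDs are hypothetical) and sorts numerically on the third field, the user ID, before taking the first line.

```shell
# Hypothetical passwd-style data: name:password:UID:GID:comment:home:shell
passwd_sample=$(mktemp)
trap 'rm -f "$passwd_sample"' EXIT
cat > "$passwd_sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
alice:x:1000:1000::/home/alice:/bin/bash
bob:x:1002:1002::/home/bob:/bin/bash
carol:x:1001:1001::/home/carol:/bin/sh
EOF

# Users whose shell is not /sbin/nologin.
grep -v '/sbin/nologin$' "$passwd_sample" | cut -d: -f1

# Among the /bin/bash users, the one with the largest UID.
top=$(grep '/bin/bash$' "$passwd_sample" | sort -t: -k3,3 -n -r | head -n 1 | cut -d: -f1)
echo "$top"
```

Sorting has to happen before cut removes the UID column; once only the names are left, a numeric sort has nothing meaningful to compare.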
(4) Find the one-digit or two-digit numbers in the /etc/passwd file;
# grep '\<[[:digit:]]\{1,2\}\>' /etc/passwd
(5) Display the lines in /boot/grub.conf that start with at least one whitespace character;
# grep '^[[:space:]]\+.*' /boot/grub.conf
(6) Display the lines in the /etc/rc.d/rc.sysinit file that start with #, followed by at least one whitespace character, followed by at least one non-whitespace character;
# grep '^#[[:space:]]\+[^[:space:]]\+' /etc/rc.d/rc.sysinit
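Since /boot/grub.conf and /etc/rc.d/rc.sysinit do not exist on every system, the patterns from exercises (5) and (6) can be tried on a stand-in file. The contents below and the variable name conf_sample are invented for this sketch.

```shell
# Hypothetical configuration-style test data.
conf_sample=$(mktemp)
trap 'rm -f "$conf_sample"' EXIT
printf '%s\n' '# comment with text' '#' '#   ' '    indented line' \
    'plain line' '#nospace' > "$conf_sample"

# Lines that start with at least one whitespace character.
grep '^[[:space:]]\+' "$conf_sample"

# Lines that start with #, then whitespace, then at least one non-whitespace character.
grep '^#[[:space:]]\+[^[:space:]]\+' "$conf_sample"
```

The line made of # followed only by spaces is rejected by the second pattern, because [^[:space:]]\+ requires a visible character after the whitespace.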
(7) Find the lines in the output of the netstat -tan command whose state is LISTEN, that is, lines ending with LISTEN optionally followed by whitespace;
# netstat -tan | grep 'LISTEN[[:space:]]*$'
(8) Add the users bash, testbash, basher, and nologin (shell: /sbin/nologin), then find the users on the current system whose username is the same as the name of their default shell;
# grep '^\([[:alnum:]]\+\>\).*\1$' /etc/passwd
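Exercise (8) can be rehearsed without creating real accounts. The sketch below uses an invented passwd-style file and a slightly stricter variant of the pattern, chosen here as an assumption rather than taken from the exercise: the group is tied to the first field by the colon that follows it, and a slash is required before the back reference so that only the last path component of the shell is compared.

```shell
# Hypothetical passwd-style data for the "username equals shell name" exercise.
users=$(mktemp)
trap 'rm -f "$users"' EXIT
cat > "$users" <<'EOF'
bash:x:1001:1001::/home/bash:/bin/bash
testbash:x:1002:1002::/home/testbash:/bin/bash
basher:x:1003:1003::/home/basher:/bin/bash
nologin:x:1004:1004::/home/nologin:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
EOF

# \1 is the username captured at the start of the line; the line must end in /<username>.
same=$(grep '^\([[:alnum:]]\+\):.*/\1$' "$users" | cut -d: -f1)
echo "$same"
```

testbash and basher are the interesting cases: both contain the string bash, yet neither is equal to it, so neither line matches.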
(9) Extended exercise: create a text file with the following content:
He like his lover.
He love his lover.
He like his liker.
He love his liker.
Find the lines in which the last word is formed by appending r to an earlier word;
# grep '\(\<[[:alpha:]]\+\>\).*\1r' grep.txt
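The exercise can be reproduced end to end as follows. The four lines are the ones given in the exercise; the temporary file stands in for grep.txt, and the variable names are chosen for this sketch.

```shell
# Recreate the exercise file in a temporary location.
lovers=$(mktemp)
trap 'rm -f "$lovers"' EXIT
printf '%s\n' 'He like his lover.' 'He love his lover.' \
    'He like his liker.' 'He love his liker.' > "$lovers"

# \1 holds a whole word matched earlier in the line; \1r demands that same word plus r.
result=$(grep '\(\<[[:alpha:]]\+\>\).*\1r' "$lovers")
echo "$result"
# prints:
# He love his lover.
# He like his liker.
```

Only the second and third lines match: "like ... lover" and "love ... liker" fail because the back reference must repeat the exact text the group captured.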
(10) Display the username and default shell of the root, centos, or user1 user on the current system;
# grep -E '^(root|centos|user1)\>' /etc/passwd | cut -d: -f1,7
(11) Find the words that are immediately followed by a pair of parentheses () in the /etc/rc.d/init.d/functions file;
# grep -o '\<[[:alpha:]]\+\>()' /etc/rc.d/init.d/functions
(12) Use echo to output a path, and use grep to extract its base name;
# echo /etc/rc.d/ | grep -o '[^/]\+/\?$' | grep -o '[^/]\+'
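The two-stage pipeline from exercise (12) can be wrapped in a small function and compared with the standard basename utility. The function name base_name and the sample paths are hypothetical, and the pattern assumes GNU grep.

```shell
# Hypothetical helper: extract the last path component using grep only.
base_name() {
    # Stage 1 keeps the final component plus an optional trailing slash;
    # stage 2 strips that slash by keeping only the non-slash characters.
    printf '%s\n' "$1" | grep -o '[^/]\+/\?$' | grep -o '[^/]\+'
}

for p in /etc/rc.d/ /etc/passwd /usr/local/bin; do
    echo "$p -> $(base_name "$p") (basename: $(basename "$p"))"
done
```

In a real script basename is the simpler choice; the grep version is mainly useful as practice with -o and the $ anchor.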