Three minutes to learn about regular expressions in the grep command
In Linux and Unix-like operating systems, how does one use the grep command with a regular expression?
Linux ships with GNU grep, which supports extended regular expressions. GNU grep is installed by default on virtually all Linux distributions. The grep command is used to search for text in files stored anywhere on a server or workstation.
I. Quick introduction to Regular Expressions
1. How to match the content you want to search?
A regular expression is simply a pattern that is matched against each input line. A pattern is a sequence of characters. The following are all examples of patterns:
"^w1", "w1|w2", and "[^ ]".
Search for 'vivek' in /etc/passwd:
grep vivek /etc/passwd
Sample output:
vivek:x:1000:1000:Vivek Gite,,,:/home/vivek:/bin/bash
vivekgite:x:1001:1001::/home/vivekgite:/bin/sh
gitevivek:x:1002:1002::/home/gitevivek:/bin/sh
Search for the word 'vivek' in any case (that is, case-insensitively):
grep -i -w vivek /etc/passwd
Search for the words 'vivek' or 'r1', ignoring case:
grep -E -i -w 'vivek|r1' /etc/passwd
The last example uses an extended regular expression pattern.
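The searches above can be tried safely on a throwaway file rather than the real /etc/passwd. The following is a minimal sketch, assuming GNU grep; the sample accounts and the variable names (sample, plain, word, either) are made up for illustration.

```shell
# Build a small passwd-style sample file (hypothetical accounts).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
vivek:x:1000:1000::/home/vivek:/bin/bash
vivekgite:x:1001:1001::/home/vivekgite:/bin/sh
gitevivek:x:1002:1002::/home/gitevivek:/bin/sh
VIVEK:x:1003:1003::/home/upper:/bin/sh
r1:x:1004:1004::/home/r1:/bin/sh
EOF

# Plain substring search: counts every line containing lowercase "vivek".
plain=$(grep -c vivek "$sample")

# -i ignores case, -w only accepts whole-word matches.
word=$(grep -i -w -c vivek "$sample")

# -E enables extended syntax, where | means "or".
either=$(grep -E -i -w -c 'vivek|r1' "$sample")

echo "plain=$plain word=$word either=$either"
```

The -c option prints the number of matching lines instead of the lines themselves, which makes the effect of each flag easy to compare.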
Anchoring the match position:
You can use the ^ and $ anchors to force a regular expression to match only at the start or the end of a line, respectively. The following example displays lines that start with 'vivek':
grep ^vivek /etc/passwd
Sample output:
vivek:x:1000:1000:Vivek Gite,,,:/home/vivek:/bin/bash
vivekgite:x:1001:1001::/home/vivekgite:/bin/sh
To display only lines that start with the whole word vivek, so that longer words such as vivekgite and vivekg are not matched:
grep -w ^vivek /etc/passwd
Find lines ending with 'foo':
grep 'foo$' FILENAME
You can also search for blank lines in the same way:
grep '^$' FILENAME
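As a rough illustration of the anchors, the sketch below counts matches in a four-line sample file. It assumes GNU grep; the file contents and variable names are illustrative only.

```shell
# Sample file: two lines start with "vivek", one ends with "foo", one is empty.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'vivek likes foo' 'vivekgite has a foo bar' '' 'say vivek' > "$sample"

starts=$(grep -c '^vivek' "$sample")     # lines beginning with "vivek"
whole=$(grep -c -w '^vivek' "$sample")   # lines beginning with the word "vivek"
ends=$(grep -c 'foo$' "$sample")         # lines ending with "foo"
blank=$(grep -c '^$' "$sample")          # empty lines

echo "starts=$starts whole=$whole ends=$ends blank=$blank"
```

Quoting the pattern, as in '^vivek', keeps the shell from interpreting characters such as $ before grep sees them.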
2. How to match specific characters?
Match 'Vivek' or 'vivek':
grep '[vV]ivek' FILENAME
Alternatively, to match any mix of upper and lower case:
grep '[vV][iI][vV][eE][kK]' FILENAME
You can also match a digit (for example, vivek1 or Vivek2):
grep -w '[vV]ivek[0-9]' FILENAME
You can match two digits (for example, foo11 and foo12):
grep 'foo[0-9][0-9]' FILENAME
You are not limited to digits; you can also match letters:
grep '[A-Za-z]' FILENAME
Display all lines containing the letter w or n:
grep '[wn]' FILENAME
Within a bracket expression, the name of a character class enclosed in "[:" and ":]" stands for the list of all characters belonging to that class. The standard character class names include:
[:alnum:] - alphanumeric characters.
[:alpha:] - alphabetic characters.
[:blank:] - space and tab.
[:digit:] - digits: '0 1 2 3 4 5 6 7 8 9'.
[:lower:] - lowercase letters: 'a b c d e f g h i j k l m n o p q r s t u v w x y z'.
[:space:] - whitespace characters: tab, newline, vertical tab, form feed, carriage return, and space.
[:upper:] - uppercase letters: 'A B C D E F G H I J K L M N O P Q R S T U V W X Y Z'.
The following example matches lines containing an uppercase letter. The class name must itself sit inside a bracket expression, hence the double brackets:
grep '[[:upper:]]' FILENAME
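To see bracket expressions and character classes side by side, here is a small sketch that assumes GNU grep. The sample lines and variable names are invented for the example.

```shell
# Sample file with mixed-case names, digits, and plain text.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'vivek1' 'Vivek2' 'foo11' 'foo1' 'lowercase only' '12345' > "$sample"

either_case=$(grep -c '[vV]ivek' "$sample")       # vivek1 and Vivek2
two_digits=$(grep -c 'foo[0-9][0-9]' "$sample")   # foo11 only
has_upper=$(grep -c '[[:upper:]]' "$sample")      # Vivek2 only
has_letter=$(grep -c '[[:alpha:]]' "$sample")     # every line except 12345

echo "$either_case $two_digits $has_upper $has_letter"
```

Classes such as [[:upper:]] follow the current locale, which is one reason they are often preferred over hand-written ranges like [A-Z].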
3. How do I use wildcards?
You can use "." to stand for any single character. The following example finds all three-letter words that start with the letter "b" and end with the letter "t":
grep '\<b.t\>' FILENAME
In the preceding example:
\< matches the empty string at the beginning of a word.
\> matches the empty string at the end of a word.
Print all lines that consist of exactly two characters:
grep '^..$' FILENAME
Display all lines that start with a dot followed by a digit:
grep '^\.[0-9]' FILENAME
Escaping the dot
The following regular expression, meant to find the IP address 192.168.1.254, does not work as expected, because each unescaped dot matches any character:
grep '192.168.1.254' /etc/hosts
All three dots must be escaped:
grep '192\.168\.1\.254' /etc/hosts
The following example matches only text shaped like an IP address (four groups of one to three digits separated by dots):
egrep '[[:digit:]]{1,3}\.[[:digit:]]{1,3}\.[[:digit:]]{1,3}\.[[:digit:]]{1,3}' FILENAME
The following matches lines that begin with the word linux or unix, ignoring case:
egrep -i '^(linux|unix)' FILENAME
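The difference between an unescaped and an escaped dot is easiest to see on a file that contains a look-alike line. This is a minimal sketch assuming GNU grep; the sample data and variable names are hypothetical.

```shell
# The second line has no dots at all, yet still fits the unescaped pattern.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' '192.168.1.254 router' '192x168y1z254 bogus' 'bat' 'boot' 'ab' '.5 volts' > "$sample"

loose=$(grep -c '192.168.1.254' "$sample")       # . matches any character
strict=$(grep -c '192\.168\.1\.254' "$sample")   # \. matches a literal dot
three=$(grep -c '\<b.t\>' "$sample")             # three-letter words b?t
two=$(grep -c '^..$' "$sample")                  # lines of exactly two characters
dotdigit=$(grep -c '^\.[0-9]' "$sample")         # lines starting with .<digit>

# -o prints only the matched text; grep -E is the same engine as egrep.
ip=$(grep -E -o '[[:digit:]]{1,3}\.[[:digit:]]{1,3}\.[[:digit:]]{1,3}\.[[:digit:]]{1,3}' "$sample")

echo "loose=$loose strict=$strict three=$three two=$two dotdigit=$dotdigit ip=$ip"
```

Note that the IP pattern checks only the shape of the text; it would also accept out-of-range groups such as 999.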
II. Exploring advanced grep search patterns
1. How to search for a pattern that starts with '-'?
Use the -e option to search for all lines matching '--test--'. Without -e, grep would try to parse '--test--' as an option:
grep -e '--test--' FILENAME
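The sketch below contrasts the failing and working forms. It assumes GNU grep; the sample text and variable names are illustrative, and the use of "--" to end option parsing is shown as an alternative that the text above does not mention.

```shell
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'run with --test-- flag' 'plain line' > "$sample"

# Without -e the pattern is taken for an option, so grep typically fails
# with an "unrecognized option" error and a non-zero exit status.
grep '--test--' "$sample" >/dev/null 2>&1
status=$?

# -e marks the next argument as the pattern, whatever it starts with.
with_e=$(grep -c -e '--test--' "$sample")

# "--" ends option parsing, which protects the pattern in the same way.
with_dashdash=$(grep -c -- '--test--' "$sample")

echo "status=$status with_e=$with_e with_dashdash=$with_dashdash"
```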
2. How to use the OR logical operation in grep?
grep -E 'word1|word2' FILENAME
### OR ###
egrep 'word1|word2' FILENAME
Or, using a basic regular expression with an escaped bar:
grep 'word1\|word2' FILENAME
3. How to use the AND logical operation in grep?
Use the following syntax to display all lines that contain both 'word1' and 'word2':
grep 'word1' FILENAME | grep 'word2'
Alternatively, match the words in a fixed order with .* between them. The following displays lines in which 'foo' is followed by 'bar', or 'word3' is followed by 'word4':
grep 'foo.*bar\|word3.*word4' FILENAME
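The OR and AND forms behave differently when word order varies, which the following sketch tries to make visible. It assumes GNU grep (the \| form is a GNU extension to basic regular expressions); the sample lines and variable names are made up.

```shell
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'word1 only' 'word2 only' 'word1 and word2' 'word2 then word1' 'neither' > "$sample"

or_ere=$(grep -E -c 'word1|word2' "$sample")             # either word, extended syntax
or_bre=$(grep -c 'word1\|word2' "$sample")               # either word, basic syntax
and_pipe=$(grep 'word1' "$sample" | grep -c 'word2')     # both words, any order
and_ordered=$(grep -c 'word1.*word2' "$sample")          # word1 before word2 only
and_both=$(grep -E -c 'word1.*word2|word2.*word1' "$sample")  # both orders spelled out

echo "$or_ere $or_bre $and_pipe $and_ordered $and_both"
```

Piping one grep into another is order-independent, whereas a single pattern with .* only matches the order written, so both orders have to be listed if either is acceptable.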
4. How to test a repeated sequence?
You can use the following syntax to test how many times a character is repeated in sequence:
{N}
{N,}
{min,max}
Match lines containing two consecutive letters v:
egrep "v{2}" FILENAME
The following example matches lines containing either "col" or "cool":
egrep 'co{1,2}l' FILENAME
The following example matches lines with at least three consecutive letters c:
egrep 'c{3,}' FILENAME
The following example matches a phone number in the format "91-1234567890" (that is, two digits, an optional hyphen, and ten digits):
grep "[[:digit:]]\{2\}[-]\?[[:digit:]]\{10\}" FILENAME
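The interval operators can be checked against a short word list. This is a sketch under the assumption of GNU grep (\? in a basic regular expression is a GNU extension); the sample words and variable names are invented for the example.

```shell
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'cl' 'col' 'cool' 'coool' 'savvy' 'ccc' \
    '91-1234567890' '911234567890' '91-12345' > "$sample"

double_v=$(grep -E -c 'v{2}' "$sample")      # savvy
col_cool=$(grep -E -c 'co{1,2}l' "$sample")  # col and cool, not cl or coool
triple_c=$(grep -E -c 'c{3,}' "$sample")     # ccc

# Basic syntax needs backslashes before the braces and the question mark.
phone=$(grep -c '[[:digit:]]\{2\}[-]\?[[:digit:]]\{10\}' "$sample")

echo "$double_v $col_cool $triple_c $phone"
```

The phone pattern is not anchored, so it also accepts the number embedded in longer text; adding ^ and $ would restrict it to whole lines.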
5. How to highlight grep output results?
Use the syntax of the following example:
grep --color regex FILENAME
6. How can I make the grep output show only the matched part instead of the whole line?
Use the syntax of the following example:
grep -o regex FILENAME
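A short sketch of both output options follows, assuming GNU grep. The log-like sample lines and variable names are hypothetical, and --color=always is used here only so that the highlighting survives being captured in a variable.

```shell
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'error code 404 at 10:15' 'no digits here' 'codes 500 and 503' > "$sample"

# -o prints each match on its own line instead of the whole matching line.
matches=$(grep -E -o '[0-9]{3}' "$sample")
echo "$matches"

# --color=always emits the highlighting escape sequences even when the
# output is not a terminal; plain --color highlights only on a terminal.
colored=$(grep --color=always '404' "$sample")
echo "$colored"
```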
III. Summary of regular expression operators
Operator | Description
. | Matches any single character.
? | Matches the preceding item zero or one time.
* | Matches the preceding item zero or more times.
+ | Matches the preceding item one or more times.
{N} | Matches the preceding item exactly N times.
{N,} | Matches the preceding item N or more times.
{N,M} | Matches the preceding item at least N times, but not more than M times.
- | Represents a range when it is not the first or last character in a list, or the end point of a range in a list.
^ | Start anchor: matches the empty string at the beginning of a line. As the first character of a list, it also negates the list, matching characters that are not in it.
$ | End anchor: matches the empty string at the end of a line.
\b | Word boundary: matches the empty string at the edge of a word.
\B | Matches the empty string provided it is not at the edge of a word.
\< | Matches the empty string at the beginning of a word.
\> | Matches the empty string at the end of a word.
IV. grep and egrep
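egrep is the same as grep -E: it interprets the pattern as an extended regular expression. Recent GNU grep releases print a warning that egrep is obsolescent and recommend grep -E instead. The grep documentation describes the difference between the two syntaxes as follows: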
In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \). Traditional egrep did not support the { meta-character, and some egrep implementations support \{ instead, so portable scripts should avoid { in grep -E patterns and should use [{] to match a literal {. GNU grep -E attempts to support traditional usage by assuming that { is not special if it would be the start of an invalid interval specification. For example, the command grep -E '{1' searches for the two-character string {1 instead of reporting a syntax error in the regular expression. POSIX.2 allows this behavior as an extension, but portable scripts should avoid it.
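The practical effect of the two syntaxes can be seen with the + character. The sketch below assumes GNU grep (\+ in a basic regular expression is a GNU extension); the sample lines and variable names are illustrative.

```shell
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'ab' 'a+b' 'aab' '{1' > "$sample"

# Basic syntax: + is an ordinary character, so only the text "a+b" matches.
bre_plus=$(grep 'a+b' "$sample")

# Extended syntax: + means "one or more", so ab and aab match instead.
ere_plus=$(grep -E -c 'a+b' "$sample")

# Basic syntax with a backslash gives + its special meaning back.
bre_escaped=$(grep -c 'a\+b' "$sample")

# Portable way to match a literal { under grep -E, as the quoted text suggests.
brace=$(grep -E '[{]1' "$sample")

echo "bre_plus=$bre_plus ere_plus=$ere_plus bre_escaped=$bre_escaped brace=$brace"
```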
References:
- grep(1) and regex(7) man pages
- grep info page
Link: http://www.cyberciti.biz/faq/grep-regular-expressions/
Link: http://www.linuxstory.org/grep-regular-expressions/