Regular expressions on Linux in detail (basic and extended regular expressions, with command examples)
Preface
Regular expressions are widely used: most programming languages support them, and they are just as useful on Linux.
With regular expressions you can filter out exactly the text you need, then hand it to whichever tool or language supports them to finish the task.
This article uses grep/egrep to run the regular expressions. Tools such as sed work as well, but sed leans heavily on regular expressions, so they are covered here first as groundwork for a later sed article. If you need both, read the two articles together.
Types of regular expressions
A regular expression is implemented by a regular expression engine, the underlying software that interprets regular expression patterns and uses them to match text.
On Linux, the two common regular expression engines are:
- POSIX basic regular expression (BRE) engine
- POSIX extended regular expression (ERE) engine
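As a rough illustration of how the two engines differ, the sketch below feeds the same pattern text to grep (BRE) and to grep -E (ERE). The sample string is made up for this example, and the -o option (print only the matched part) is assumed to be available, as it is in GNU and BSD grep.

```shell
# In BRE a bare + is an ordinary character; in ERE it means "one or more".
sample='a+b aab'

# BRE: the pattern matches the literal text "a+b"
bre_match=$(printf '%s\n' "$sample" | grep -o 'a+b')

# ERE: the pattern means "one or more a, then b", so "aab" matches
ere_match=$(printf '%s\n' "$sample" | grep -oE 'a+b')

echo "BRE matched: $bre_match"
echo "ERE matched: $ere_match"
```

The same characters select different text depending on the engine, which is why it helps to know which one a tool uses by default.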
Basic usage of basic regular expressions
Preparing the test environment
[root@service99 ~]# mkdir /opt/regular
[root@service99 ~]# cd /opt/regular
[root@service99 regular]# pwd
/opt/regular
[root@service99 regular]# cp /etc/passwd temp_passwd
Plain text
Plain text matches the corresponding characters exactly. Note that regular expression patterns are case sensitive.
[root@service99 regular]# grep --color "root" temp_passwd    # --color highlights the matched text so the effect is easy to see
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
A regular expression is not limited to complete words. As long as the defined text appears anywhere in the data stream, the pattern matches.
[root@service99 regular]# ifconfig eth1 | grep --color "add"
eth1      Link encap:Ethernet  HWaddr 54:52:01:01:99:02
          inet addr:192.168.2.99  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::5652:1ff:fe01:9902/64 Scope:Link
Nor is the pattern limited to a single word: spaces and digits can appear in the text string as well.
[root@service99 regular]# echo "This is line number 1" | grep --color "ber 1"
This is line number 1
Special characters
Take care when the text string in a pattern contains certain characters. Regular expressions give several characters a special meaning, and using them as plain text in a pattern may not produce the result you expect.
Special characters recognized by regular expressions:
. * [ ] ^ $ { } + ? | ( )
To use one of these special characters as an ordinary text character, escape it: put a special character in front of it that tells the regular expression engine to interpret the next character as ordinary text.
The special character that does this is the backslash (\).
[root@service99 regular]# echo "This cat is $4.99"     # double quotes do not protect $, so the shell expands the variable $4, which is unset and therefore empty
This cat is .99
[root@service99 regular]# echo "This cat is \$4.99"    # escape $ with a backslash
This cat is $4.99
[root@service99 regular]# echo 'This cat is \$4.99'    # single quotes protect every metacharacter, so the backslash is kept as well
This cat is \$4.99
[root@service99 regular]# echo 'This cat is $4.99'
This cat is $4.99
[root@service99 regular]# cat price.txt
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
[root@service99 regular]# grep --color '\\' price.txt  # a literal backslash must itself be escaped
This is "\".
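To make the effect of escaping concrete, here is a minimal sketch that counts matching lines with and without escaped special characters. The two sample lines and the temporary file are assumptions made for the example, not data from the article.

```shell
# Hypothetical sample data, written to a temporary file
tmp=$(mktemp)
printf '%s\n' 'This price is $4.99' 'cost 4x99' > "$tmp"

# Unescaped dot matches any character, so "4.99" and "4x99" both match
loose=$(grep -c '4.99' "$tmp")

# Escaped dollar sign and dot match only the literal text "$4.99"
strict=$(grep -c '\$4\.99' "$tmp")

echo "unescaped pattern: $loose line(s), escaped pattern: $strict line(s)"
rm -f "$tmp"
```

Single quotes around the pattern keep the shell from touching the backslashes and the dollar sign before grep sees them.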
Anchors
Start of line
The caret (^) anchors the pattern to the beginning of a line of text in the data stream.
[root@service99 regular]# grep --color '^h' price.txt      # lines starting with the letter h
hello,world!
[root@service99 regular]# grep --color '^$' price.txt      # no output, because the file has no empty lines
[root@service99 regular]# grep --color '^\$' price.txt     # lines starting with $
$5.00
[root@service99 regular]# echo "this is ^ test. " >> price.txt
[root@service99 regular]# cat price.txt
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
this is ^ test.
[root@service99 regular]# grep --color '^' price.txt       # a bare ^ matches every line
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
this is ^ test.
[root@service99 regular]# grep --color '\^' price.txt      # escape ^ to match it as a literal character
this is ^ test.
[root@service99 regular]# grep --color 'is ^' price.txt    # when ^ is not at the start of the pattern, it needs no escaping
this is ^ test.
End of line
The dollar sign ($) defines the end position. Placing it after the text pattern means the line must end with that text pattern.
[root@service99 regular]# grep --color '\.$' price.txt     # "." also has a special meaning in regular expressions, so escape it
This is "\".
[root@service99 regular]# grep --color '\. $' price.txt    # a space was typed at the end of this line when it was added to the file, so the pattern must include it
this is ^ test.
In a regular expression a space is an ordinary character, so it has to be matched like any other character.
[root@service99 regular]# grep --color '0$' price.txt
$5.00
[root@service99 regular]# grep --color '9$' price.txt
This price is $4.99
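The start and end anchors described above can be tried without touching price.txt. The following is a small sketch on three made-up lines held in a shell variable.

```shell
# Hypothetical sample lines, one per line
data=$(printf '%s\n' 'hello,world!' '$5.00' 'This price is $4.99')

# ^ anchors the match to the start of the line
starts_h=$(printf '%s\n' "$data" | grep '^h')

# \$ is a literal dollar sign, so this finds lines that start with $
starts_dollar=$(printf '%s\n' "$data" | grep '^\$')

# An unescaped $ at the end of the pattern anchors the match to the end of the line
ends_9=$(printf '%s\n' "$data" | grep '9$')

printf '%s\n' "$starts_h" "$starts_dollar" "$ends_9"
```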
Combining anchors
"^$" is commonly used to match empty lines.
It is often combined with "^#", because # starts a comment in Linux configuration files.
Together they output only the effective settings in a file:
[root@service99 regular]# cat -n /etc/vsftpd.conf | wc -l
121
[root@service99 regular]# grep -vE '^#|^$' /etc/vsftpd.conf    # -v inverts the match, -E enables extended regular expressions, and "|" is the extended "or" operator
local_enable=YES
write_enable=YES
...
listen=YES
...
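The same filter can be tried on a throwaway file. The sketch below uses a made-up configuration file (the setting names are only sample text) and also shows one possible variant that treats lines containing only spaces or tabs as empty, which the plain '^#|^$' pattern does not.

```shell
# Hypothetical configuration file with comments, an empty line and a whitespace-only line
conf=$(mktemp)
printf '%s\n' '# sample comment' '' 'listen=YES' '  ' '#another' 'local_enable=YES' > "$conf"

# Drop lines that start with # and lines that are completely empty
active=$(grep -vE '^#|^$' "$conf")

# Variant: also drop whitespace-only lines and comments that are indented
active_strict=$(grep -vE '^[[:blank:]]*(#|$)' "$conf")

printf '%s\n' "$active_strict"
rm -f "$conf"
```

With the first pattern the line holding two spaces survives, because it is not empty in the "^$" sense. The variant is an assumption about what is usually wanted, not part of the original command.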
Repetition range
In basic regular expressions the braces are escaped with backslashes:
\{n,m\}    the preceding character appears n to m times
\{n,\}     the preceding character appears at least n times
\{n\}      the preceding character appears exactly n times
[root@service99 regular]# grep --color "12345\{0,1\}" price.txt
1234556
[root@service99 regular]# grep --color "12345\{0,2\}" price.txt
1234556
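A quick way to see an interval at work is to run it over a few words that differ only in the number of repeated characters. This is a sketch on made-up input; it also shows the extended (grep -E) spelling of the same interval for comparison.

```shell
# Hypothetical word list: zero, one, two and three o's between g and d
words=$(printf '%s\n' 'gd' 'god' 'good' 'goood')

# BRE: interval braces are written \{ and \}
bre=$(printf '%s\n' "$words" | grep 'go\{1,2\}d')

# ERE: the same interval without backslashes
ere=$(printf '%s\n' "$words" | grep -E 'go{1,2}d')

printf '%s\n' "$bre"
```

Only the words with one or two o's are printed, and both spellings select the same lines.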
Dot character
The dot special character matches any single character except a newline. The dot must match a character, though: if there is no character at the dot's position, the pattern does not match.
[root@service99 regular]# grep --color ".s" price.txt
This price is $4.99
This is "\".
this is ^ test.
[root@service99 regular]# grep --color ".or" price.txt
hello,world!
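The rule that the dot must consume one character can be checked with a line that consists of nothing but the rest of the pattern. A minimal sketch on made-up words:

```shell
# Hypothetical input: "or" alone has no character in front of it
words=$(printf '%s\n' 'or' 'for' 'world')

# ".or" needs some character immediately before "or"
matches=$(printf '%s\n' "$words" | grep '.or')

printf '%s\n' "$matches"
```

The line "or" is not printed, because there is nothing for the dot to match.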
Character class
A character class defines a set of characters that can match at one position in the text pattern. If any character from the class appears in the data stream at that position, the pattern matches.
Character classes are defined with square brackets. The brackets enclose every character that belongs to the class, and the whole class is then used in the pattern like any other wildcard.
[root@service99 regular]# grep --color "[abcdsxyz]" price.txt
This price is $4.99
hello,world!
This is "\".
this is ^ test.
[root@service99 regular]# grep --color "[sxyz]" price.txt
This price is $4.99
This is "\".
this is ^ test.
[root@service99 regular]# grep --color "[abcd]" price.txt
This price is $4.99
hello,world!
[root@service99 regular]# grep --color "Th[ais]" price.txt       # lines where the character right after Th is one of a, i or s
This price is $4.99
This is "\".
[root@service99 regular]# grep -i --color "th[ais]" price.txt    # -i ignores case
This price is $4.99
This is "\".
this is ^ test.
If you are not sure whether a character is upper or lower case, this pattern helps:
[root@service99 regular]# echo "Yes" | grep --color "[yY]es"     # the order of the characters inside [] does not matter
Yes
[root@service99 regular]# echo "yes" | grep --color "[Yy]es"
yes
A single expression can use more than one character class:
[root@service99 regular]# echo "Yes/no" | grep "[Yy][Ee]"
Yes/no
[root@service99 regular]# echo "Yes/no" | grep "[Yy].*[Nn]"      # for the use of * see the asterisk section below
Yes/no
Character classes can contain digits as well:
[root@service99 regular]# echo "My phone number is 123456987" | grep --color "is [1234]"
My phone number is 123456987
[root@service99 regular]# echo "This is Phone1" | grep --color "e[1234]"
This is Phone1
[root@service99 regular]# echo "This is Phone1" | grep --color "[1]"
This is Phone1
Another very common use of character classes is matching words that may be misspelled:
[root@service99 regular]# echo "regular" | grep --color "r[ea]g[ua]l[ao]"
regular
Negated character class
A negated character class matches characters that are not in the class. Just add a caret (^) at the beginning of the character class.
Even when negated, the character class must still match one character.
[root@service99 regular]# cat price.txt
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
this is ^ test.
cat
car
[root@service99 regular]# sed -n '/[^t]his/p' price.txt
This price is $4.99
This is "\".
[root@service99 regular]# grep --color "[^t]his" price.txt
This price is $4.99
This is "\".
[root@service99 regular]# grep --color "ca[tr]" price.txt
cat
car
[root@service99 regular]# grep --color "ca[^r]" price.txt
cat
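That a negated class still has to match one character is easy to miss. The sketch below adds a made-up third word, "ca", which has no character after the "a" at all.

```shell
# Hypothetical word list; "ca" ends right after the a
words=$(printf '%s\n' 'cat' 'car' 'ca')

# [^r] means "one character that is not r", not "r is absent"
not_r=$(printf '%s\n' "$words" | grep 'ca[^r]')

printf '%s\n' "$not_r"
```

Only "cat" is printed: "car" is rejected because of the r, and "ca" is rejected because there is no character left for the class to match.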
Using ranges
When you need to match many characters that follow a regular sequence, use a range:
[root@service99 regular]# cat price.txt
This price is $4.99
hello,world!
$5.00
#$#$
This is "\".
this is ^ test.
cat
car
1234556
911
11806
[root@service99 regular]# egrep --color '[a-z]' price.txt
This price is $4.99
hello,world!
This is "\".
this is ^ test.
cat
car
[root@service99 regular]# egrep --color '[A-Z]' price.txt
This price is $4.99
This is "\".
[root@service99 regular]# grep --color "[0-9]" price.txt
This price is $4.99
$5.00
1234556
911
11806
[root@service99 regular]# sed -n '/^[^a-Z]/p' price.txt
$5.00
#$#$
1234556
911
11806
[root@service99 regular]# grep --color "^[^a-Z]" price.txt
$5.00
#$#$
1234556
911
11806
[root@service99 regular]# echo $LANG    # what [a-Z] covers depends on the LANG environment variable; if you change it, make sure the new value is a valid locale
zh_CN.UTF-8
[root@service99 regular]# LANG=en_US.UTF-8
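The meaning of a range such as [a-Z] depends on the collating order of the current locale, which is why the LANG value matters above. One way to sidestep this, sketched here on made-up sample lines, is to use a POSIX character class and pin the locale for the single command with LC_ALL=C.

```shell
# Hypothetical sample lines
data=$(printf '%s\n' 'This' 'hello' '$5.00' '911')

# Lines whose first character is not a letter, independent of the LANG setting
non_alpha=$(printf '%s\n' "$data" | LC_ALL=C grep '^[^[:alpha:]]')

printf '%s\n' "$non_alpha"
```

Setting LC_ALL only in front of grep affects that one command and leaves the rest of the session untouched.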
Special character classes
These match characters of a specific type.
[[:blank:]]     space and tab characters
[[:cntrl:]]     control characters
[[:graph:]]     visible characters (printable characters other than the space)
[[:space:]]     all whitespace characters
[[:print:]]     printable characters
[[:xdigit:]]    hexadecimal digits
[[:punct:]]     all punctuation characters
[[:lower:]]     lowercase letters
[[:upper:]]     uppercase letters
[[:alpha:]]     uppercase and lowercase letters
[[:digit:]]     digits
[[:alnum:]]     digits and uppercase/lowercase letters
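A few of the classes listed above can be tried on a single line of text. This is a sketch only: the sample sentence is made up, -o (print each match on its own line) is assumed to be available as in GNU and BSD grep, and LC_ALL=C is set so the classes cover plain ASCII.

```shell
line='Total: 42 items, 7 Boxes!'

# Runs of digits
digits=$(printf '%s\n' "$line" | LC_ALL=C grep -oE '[[:digit:]]+')

# Uppercase letters, one per match
upper=$(printf '%s\n' "$line" | LC_ALL=C grep -o '[[:upper:]]')

# Punctuation characters, joined onto one line
punct=$(printf '%s\n' "$line" | LC_ALL=C grep -o '[[:punct:]]' | tr -d '\n')

echo "digits: $(echo $digits)  upper: $(echo $upper)  punct: $punct"
```

Note the doubled brackets: the inner [:digit:] names the class and the outer brackets form the bracket expression that contains it.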
Asterisk
An asterisk after a character means that character may appear zero or more times in the matching text.
[root@service99 regular]# cat test.info
goole
go go
come on
goooooooooo
[root@service99 regular]# grep --color "o*" test.info
goole
go go
come on
goooooooooo
[root@service99 regular]# grep --color "go*" test.info
goole
go go
goooooooooo
[root@service99 regular]# grep --color "w.*d" price.txt    # * is often used together with .
hello,world!
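Because "zero times" is allowed, a pattern made of nothing but a starred character matches every line, including lines that do not contain the character. The sketch below counts matching lines in a made-up list to show the difference.

```shell
# Hypothetical lines; 'xyz' contains neither g nor o
list=$(printf '%s\n' 'g' 'go' 'goooo' 'xyz')

# g followed by zero or more o: needs at least the g
with_g=$(printf '%s\n' "$list" | grep -c 'go*')

# zero or more o: matches the empty string, so every line counts
any_line=$(printf '%s\n' "$list" | grep -c 'o*')

echo "go* matched $with_g lines, o* matched $any_line lines"
```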
Extended regular expressions
Question mark
The question mark means the preceding character may appear zero times or once. It does not match repeated occurrences of the character.
[root@service99 regular]# egrep --color "91?" price.txt
This price is $4.99
911
Plus sign
The plus sign means the preceding character may appear one or more times, but must appear at least once. If the character is absent, the pattern does not match.
[root@service99 regular]# egrep --color "9+" price.txt
This price is $4.99
911
[root@service99 regular]# egrep --color "1+" price.txt
1234556
911
11806
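The three repetition operators are easiest to compare side by side. The following sketch runs ?, + and * over the same made-up word list; grep -E is used as the equivalent of egrep.

```shell
# Hypothetical word list with zero, one and two u's
words=$(printf '%s\n' 'color' 'colour' 'colouur')

# ? : zero or one u
zero_or_one=$(printf '%s\n' "$words" | grep -E 'colou?r')

# + : one or more u
one_or_more=$(printf '%s\n' "$words" | grep -E 'colou+r')

# * : zero or more u
zero_or_more=$(printf '%s\n' "$words" | grep -E 'colou*r')

printf '? matched:\n%s\n+ matched:\n%s\n' "$zero_or_one" "$one_or_more"
```

The question mark rejects the word with two u's, the plus sign rejects the word with none, and the asterisk accepts all three.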
Using braces
Braces set limits on how many times a regular expression may repeat. This is usually called an interval.
- m: the regular expression appears exactly m times.
- m,n: the regular expression appears at least m times and at most n times.
[root@service99 regular]# echo "This is test,test is file." | egrep --color "test{0,1}"
This is test,test is file.
[root@service99 regular]# echo "This is test,test is file." | egrep --color "is{1,2}"
This is test,test is file.
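An interval applies only to the single character (or parenthesized group) directly in front of it, so "test{0,1}" means "tes" followed by at most one t. The sketch below makes that visible on made-up input, using -o (assumed available, as in GNU and BSD grep) to print only the matched parts.

```shell
# The interval applies to the final s only: i followed by one or two s
last_char=$(printf '%s\n' 'is iss isss' | grep -oE 'is{1,2}')

# Parentheses make the interval apply to the whole group "ab"
grouped=$(printf '%s\n' 'abab ab' | grep -oE '(ab){2}')

printf '%s\n' "$last_char" "$grouped"
```

From "isss" only "iss" is matched, since the interval allows at most two s characters, and the lone "ab" is skipped because the group must occur twice.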
Regular expression examples
Below are some exercises and examples for basic regular expressions.
Regular expressions look fairly simple when you only read the concepts and theory, but they are not so easy to apply in practice. Once you are comfortable with them, the gain in efficiency is considerable.
1. Find lines in the downloaded file that contain the keyword "the"
grep --color "the" regular_express.txt
2. Find lines in the downloaded file that do not contain the keyword "the"
grep --color -vn "the" regular_express.txt
3. Find the keyword "the" in the downloaded file, ignoring case
grep --color -in "the" regular_express.txt
4. Find the two words test or taste
grep --color -En 'test|taste' regular_express.txt
grep --color -i "t[ae]ste\{0,1\}" 1.txt
5. Find strings that contain oo
grep --color "oo" regular_express.txt
6. Find oo that is not preceded by g
grep --color "[^g]oo" regular_express.txt
7. Find oo that is not preceded by a lowercase letter
egrep --color "[^a-z]oo" regular_express.txt
8. Find lines that contain digits
egrep --color '[0-9]' regular_express.txt
9. Find lines that start with the
egrep --color '^the' regular_express.txt
10. Find lines that start with a lowercase letter
egrep --color '^[a-z]' regular_express.txt
11. Find lines that do not start with an English letter
egrep --color '^[^a-zA-Z]' regular_express.txt
12. Find lines that end with a period
egrep --color '\.$' regular_express.txt
13. Find empty lines
egrep --color "^$" regular_express.txt
14. Find strings of the form g??d (g, any two characters, then d)
egrep --color "g..d" regular_express.txt
15. Find strings with at least two o's
egrep --color "ooo*" regular_express.txt
egrep --color 'o{2,}' regular_express.txt
16. Find strings that start and end with g, with at least one o between the two g's
egrep --color 'go{1,}g' regular_express.txt
17. Find lines that contain any digit
egrep --color '[0-9]' regular_express.txt
18. Find strings with two o's
egrep --color "oo" regular_express.txt
19. Find g followed by 2 to 5 o's and then another g
egrep --color 'go{2,5}g' regular_express.txt
20. Find g followed by two or more o's
egrep --color 'go{2,}' regular_express.txt
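The file regular_express.txt is not reproduced in this article, so here is a rough sketch that runs a few of the exercises against a small made-up file instead. The file contents are assumptions chosen so that each pattern has something to accept and something to reject; grep -E stands in for egrep.

```shell
# Hypothetical stand-in for regular_express.txt
f=$(mktemp)
printf '%s\n' 'google is the best.' 'goooooogle yes' 'gog' 'I like taste' '' 'gg' > "$f"

ends_with_dot=$(grep -cE '\.$' "$f")       # exercise 12: lines ending with a period
empty_lines=$(grep -c '^$' "$f")           # exercise 13: empty lines
one_or_more_o=$(grep -cE 'go{1,}g' "$f")   # exercise 16: g, at least one o, g
two_to_five_o=$(grep -cE 'go{2,5}g' "$f")  # exercise 19: g, 2 to 5 o's, g

echo "12: $ends_with_dot  13: $empty_lines  16: $one_or_more_o  19: $two_to_five_o"
rm -f "$f"
```

The word with six o's between its two g's counts for exercise 16 but not for exercise 19, because its run of o's is longer than the interval allows and both g's are part of the pattern.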
That is all for this article. I hope it is helpful for your learning.