About grep Regular Expressions in Linux
Wildcard
* Matches any string of characters, including an empty one
? Matches exactly one character
[] Matches any one of the enclosed characters
For example, [abc] matches any one of a, b, or c.
Wildcards are used to match file names.
Regular Expression
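Regular expressions are used to match strings within a file.
ls, find, and cp work with wildcards rather than regular expressions (find's -name test takes a wildcard; GNU find also has a separate -regex test).
grep, awk, and sed, however, support regular expressions.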
[root@hadoop-bigdata01 test]# touch aa
[root@hadoop-bigdata01 test]# touch aab aabb
[root@hadoop-bigdata01 test]# ll
total 0
-rw-r--r-- 1 root root 0 May 16 :47 aa
-rw-r--r-- 1 root root 0 May 16 :47 aab
-rw-r--r-- 1 root root 0 May 16 :47 aabb
[root@hadoop-bigdata01 test]# ls aa
aa
[root@hadoop-bigdata01 test]# ls aa?
aab
[root@hadoop-bigdata01 test]# ls aa*
aa aab aabb
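The bracket form from the wildcard list can be tried the same way. The following is a minimal sketch, assuming a throwaway directory created with mktemp; the file names aac and aad and the variable names are made up for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch aa aab aac aad

# [bc] stands for exactly one character, either b or c.
bracket_matches=$(echo aa[bc])
echo "$bracket_matches"    # prints: aab aac

# ? stands for exactly one character of any kind.
question_matches=$(echo aa?)
echo "$question_matches"   # prints: aab aac aad
```

In both cases the plain name aa is left out, because the pattern demands a third character.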
Special characters in Regular Expression
Regular Expression matching range
Use Regular Expressions
grep "1" /etc/passwd
This prints every line that contains the character 1. grep only requires the pattern to appear somewhere in the line, unlike a wildcard, which must match the whole name.
[root@hadoop-bigdata01 test]# grep "1" /etc/passwd
bin:x:1:1:bin:/bin:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
gopher:x:13:30:gopher:/var/gopher:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
usbmuxd:x:113:113:usbmuxd user:/:/sbin/nologin
avahi-autoipd:x:170:170:Avahi IPv4LL Stack:/var/lib/avahi-autoipd:/sbin/nologin
abrt:x:173:173::/etc/abrt:/sbin/nologin
wang:x:501:501::/home/wang:/bin/bash
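The contrast between "contains the pattern" and "matches the whole name" can be sketched without touching any files. This example assumes a shell case statement as a stand-in for wildcard matching; the variable names are hypothetical.

```shell
name="aab"

# Shell wildcard (case pattern): the pattern must cover the whole string.
case "$name" in
  a) glob_result="match" ;;
  *) glob_result="no match" ;;
esac

# grep: the pattern only has to occur somewhere in the line.
if printf '%s\n' "$name" | grep -q 'a'; then
  grep_result="match"
else
  grep_result="no match"
fi

echo "wildcard a: $glob_result"   # prints: wildcard a: no match
echo "grep a:     $grep_result"   # prints: grep a:     match
```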
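grep 'root' /etc/passwd
cat /etc/passwd | grep 'root'
Both commands give the same result, but the pipeline uses more resources because it starts an extra cat process.
So, where a file is available, pass it to grep directly. Some examples: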
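A quick way to confirm that the two forms agree is to compare their output. This is a small sketch on a temporary file with two sample lines in /etc/passwd format, so that it does not depend on the accounts of any particular machine.

```shell
# Two sample lines in /etc/passwd format, written to a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'bin:x:1:1:bin:/bin:/sbin/nologin' > "$sample"

direct=$(grep 'root' "$sample")          # grep reads the file itself
piped=$(cat "$sample" | grep 'root')     # an extra cat process feeds grep

[ "$direct" = "$piped" ] && echo "same output"
rm -f "$sample"
```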
1. Match lines that contain a digit
grep '[0-9]' /etc/passwd
2. Match lines that contain three consecutive digits
grep '[0-9][0-9][0-9]' /etc/passwd or grep ':[0-9][0-9][0-9]:' /etc/passwd
[root@hadoop-bigdata01 test]# grep '[0-9][0-9][0-9]' /etc/passwd
games:x:12:100:games:/usr/games:/sbin/nologin
usbmuxd:x:113:113:usbmuxd user:/:/sbin/nologin
rtkit:x:499:497:RealtimeKit:/proc:/sbin/nologin
avahi-autoipd:x:170:170:Avahi IPv4LL Stack:/var/lib/avahi-autoipd:/sbin/nologin
abrt:x:173:173::/etc/abrt:/sbin/nologin
nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
saslauth:x:498:76:"Saslauthd user":/var/empty/saslauth:/sbin/nologin
pulse:x:497:496:PulseAudio System Daemon:/var/run/pulse:/sbin/nologin
liucheng:x:500:500::/home/liucheng:/bin/bash
wang:x:501:501::/home/wang:/bin/bash
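The two patterns in step 2 are not equivalent, which a small experiment makes visible. The sketch below uses made-up sample lines (alice, bob, carol are hypothetical accounts) and also shows the POSIX interval form \{3\}, which is not used in the text above but expresses the same repetition.

```shell
# Made-up sample lines in /etc/passwd format (not taken from a real system).
sample=$(mktemp)
printf '%s\n' \
  'alice:x:12:100:Alice:/home/alice:/bin/bash' \
  'bob:x:7:7:Bob:/home/bob:/bin/bash' \
  'carol:x:65534:65534:Carol:/home/carol:/bin/bash' > "$sample"

# Three digits in a row anywhere: also hits the five-digit 65534.
any_three=$(grep -c '[0-9][0-9][0-9]' "$sample")

# Exactly three digits between two colons: only the field 100 qualifies.
field_three=$(grep -c ':[0-9][0-9][0-9]:' "$sample")

# \{3\} is the interval form of "repeat the previous item three times".
interval_three=$(grep -c '[0-9]\{3\}' "$sample")

echo "$any_three $field_three $interval_three"   # prints: 2 1 2
rm -f "$sample"
```

The colon-delimited form is the one to pick when the goal is a field of exactly three digits rather than any run of three or more.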
3. Match lines that start with "r" and end with "n"
grep '^r.*n$' /etc/passwd
.* matches any sequence of characters, including an empty one
[root@hadoop-bigdata01 test]# grep '^r.*n$' /etc/passwd
rpc:x:32:32:Rpcbind Daemon:/var/cache/rpcbind:/sbin/nologin
rtkit:x:499:497:RealtimeKit:/proc:/sbin/nologin
rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
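To see both anchors at work, the pattern can be run against a few lines fed in with printf. This is only a sketch: the three lines are samples in /etc/passwd format, chosen so that one fails the ^r test and one fails the n$ test.

```shell
# Only the first line both starts with r and ends with n.
matches=$(printf '%s\n' \
  'rpc:x:32:32:Rpcbind Daemon:/var/cache/rpcbind:/sbin/nologin' \
  'root:x:0:0:root:/root:/bin/bash' \
  'bin:x:1:1:bin:/bin:/sbin/nologin' | grep '^r.*n$')

echo "$matches"
```

The root line is rejected because it ends with h, and the bin line because it starts with b.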
4. Filter the ifconfig output and extract the IP address
grep -v inverts the match: it removes the lines that contain the keyword. sed then deletes the unwanted text from the line that is left.
[root@hadoop-bigdata01 test]# ifconfig | grep 'inet addr:'
inet addr:192.168.126.191  Bcast:192.168.126.255  Mask:255.255.255.0
inet addr:127.0.0.1  Mask:255.0.0.0
[root@hadoop-bigdata01 test]#
[root@hadoop-bigdata01 test]# ifconfig | grep 'inet addr:' | grep -v '127.0.0.1'
inet addr:192.168.126.191  Bcast:192.168.126.255  Mask:255.255.255.0
[root@hadoop-bigdata01 test]# ifconfig | grep 'inet addr:' | grep -v '127.0.0.1' | sed 's/inet addr://g'
192.168.126.191  Bcast:192.168.126.255  Mask:255.255.255.0
[root@hadoop-bigdata01 test]# ifconfig | grep 'inet addr:' | grep -v '127.0.0.1' | sed 's/inet addr://g' | sed 's/Bcast.*//g'
192.168.126.191
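The same extraction can be rehearsed on canned text, which helps on machines where ifconfig is missing or prints a different layout. The sketch below assumes the "inet addr:" format shown in the transcript; the variable names are hypothetical, the dots in the loopback address are escaped so they match only literal dots, and the two sed steps are merged into one call.

```shell
# Canned text in the ifconfig format shown above.
ifconfig_output='inet addr:192.168.126.191  Bcast:192.168.126.255  Mask:255.255.255.0
inet addr:127.0.0.1  Mask:255.0.0.0'

ip=$(printf '%s\n' "$ifconfig_output" \
  | grep 'inet addr:' \
  | grep -v '127\.0\.0\.1' \
  | sed -e 's/.*inet addr://' -e 's/ .*//')

echo "$ip"   # prints: 192.168.126.191
```

The first sed expression removes everything up to and including the label, and the second removes everything from the first space onward.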
Misunderstanding
There is a misunderstanding here that confused me for a long time: the difference between regular expressions and wildcards.
We know that the wildcard * matches any string of characters, while the regular-expression * matches the preceding character zero or more times.
The two are completely different. So how do you tell whether a given * is a wildcard or a regular expression?
At first I got this wrong. Look at the following commands:
[root@hadoop-bigdata01 test]# touch ac aac abc abbc
[root@hadoop-bigdata01 test]# ll
total 0
-rw-r--r-- 1 root root 0 May 16 19:55 aac
-rw-r--r-- 1 root root 0 May 16 19:55 abbc
-rw-r--r-- 1 root root 0 May 16 19:55 abc
-rw-r--r-- 1 root root 0 May 16 19:55 ac
[root@hadoop-bigdata01 test]# ls | grep 'a*c'
aac
abbc
abc
ac
[root@hadoop-bigdata01 test]# ls | grep 'a.*c'
aac
abbc
abc
ac
[root@hadoop-bigdata01 test]# ls | grep '^a.*c'
aac
abbc
abc
ac
[root@hadoop-bigdata01 test]# ls | grep '^a*c$'
aac
ac
Why do grep 'a*c' and grep '^a*c$' give different results? I assumed one was a wildcard and the other a regular expression, because a*c returned all four names. Isn't that exactly what matching any number of characters looks like?
It is not.
Wildcards are used to match file names.
Regular expressions are used to match strings within a file.
Once the ls output is piped to grep, grep is no longer matching file names. It is filtering text, so the pattern is always a regular expression.
grep 'a*c' means a repeated zero or more times, followed by c, so any line that contains a c matches.
grep '^a*c$' is also a regular expression: the line must start with zero or more a characters, followed immediately by c, and end there.
Therefore, only aac and ac meet the condition.
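The comparison can be reproduced without creating any files by feeding the four names to grep directly. This is a minimal sketch; the variable names are made up, and paste is used only to join the matching lines onto one line for display.

```shell
# The four names from the transcript, one per line.
names=$(printf '%s\n' aac abbc abc ac)

# Unanchored: zero or more a, then c, anywhere in the line.
unanchored=$(printf '%s\n' "$names" | grep 'a*c' | paste -s -d ' ' -)

# Anchored: the whole line is zero or more a followed by a single c.
anchored=$(printf '%s\n' "$names" | grep '^a*c$' | paste -s -d ' ' -)

echo "a*c    -> $unanchored"   # prints: a*c    -> aac abbc abc ac
echo "^a*c\$ -> $anchored"     # prints: ^a*c$ -> aac ac
```

abc and abbc drop out of the anchored result because a b sits between the leading a and the final c.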
Now look at one more example.
[root@hadoop-bigdata01 test]# ls
a  aac  abb  abbc  abc  ac  b  bb  c  cb
[root@hadoop-bigdata01 test]# ls | grep 'a*b'
abb
abbc
abc
b
bb
cb
Here, grep 'a*b' does not mean a line that contains a and b. It means a repeated zero or more times, followed by b, so every name that contains b matches.
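Putting the same characters a*b through both mechanisms side by side shows how far apart the two readings are. The sketch below assumes a throwaway directory created with mktemp and recreates the ten file names from the listing; the variable names are hypothetical.

```shell
# Recreate the ten files in a throwaway directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a aac abb abbc abc ac b bb c cb

# Regular expression: zero or more a, then b, anywhere in the line.
regex_hits=$(ls | grep 'a*b' | paste -s -d ' ' -)

# Wildcard: the whole name must start with a and end with b.
glob_hits=$(echo a*b)

echo "regex: $regex_hits"   # prints: regex: abb abbc abc b bb cb
echo "glob:  $glob_hits"    # prints: glob:  abb
```

The shell expands the unquoted a*b before any command runs, while the quoted 'a*b' reaches grep untouched and is read as a regular expression.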