The three musketeers of Linux text processing:
grep: text filtering tool
sed: stream editor
awk: text report generator; the awk implementation on Linux is gawk
grep:
Function: a text search tool. It searches the target text line by line according to a user-specified "pattern" and prints the lines that match;
Pattern: a filter condition written with the metacharacters and literal characters of regular expressions;
Metacharacters: characters that do not stand for their literal meaning but act as wildcards or control functions;
Regular expressions are divided into two categories:
Basic regular expressions: BRE
Extended regular expressions: ERE
grep [OPTIONS] PATTERN [FILE...]
Options:
--color=auto: highlight the matched strings;
-v: print the lines that do not match the pattern;
-i: ignore character case;
-o: print only the strings matched by the pattern;
-q: quiet mode; print nothing and report the result through the exit status;
-c: print the number of matching lines;
-L: print the names of files that contain no matching line;
-n: print each matching line prefixed with its line number;
-E: use extended regular expressions
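The options above can be tried on a small sample without touching any system file. The following is a minimal sketch, assuming GNU grep; the sample text and the variable names are made up for illustration.

```shell
# Hypothetical sample text, used only for this illustration.
sample='root login ok
Root login failed
guest login ok'

# -c: count matching lines (case-sensitive, so "Root" is not counted)
count=$(printf '%s\n' "$sample" | grep -c 'root')

# -i together with -c: ignore case, so "root" and "Root" are both counted
count_i=$(printf '%s\n' "$sample" | grep -ic 'root')

# -v: print the lines that do NOT match
inverted=$(printf '%s\n' "$sample" | grep -v 'ok')

# -o: print only the matched part of each line
only=$(printf '%s\n' "$sample" | grep -o 'fail[a-z]*')

# -n: prefix each matching line with its line number
numbered=$(printf '%s\n' "$sample" | grep -n 'guest')

# -q: print nothing; the exit status tells whether a match was found
if printf '%s\n' "$sample" | grep -q 'guest'; then found=yes; else found=no; fi

echo "$count $count_i | $inverted | $only | $numbered | $found"
```

The -q form is the one usually used in scripts, where only the yes/no answer matters.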
Metacharacters of basic regular expressions:
Character matching:
.: matches any single character;
[]: matches any single character within the specified range;
[^]: matches any single character outside the specified range;
[:alnum:]: English uppercase and lowercase letters and digits, i.e. 0-9, A-Z, a-z
[:alpha:]: any English uppercase or lowercase letter, i.e. A-Z, a-z
[:upper:]: uppercase letters, i.e. A-Z
[:lower:]: lowercase letters, i.e. a-z
These class names are written inside a bracket expression, for example [[:alnum:]]
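As a rough illustration of the character-matching metacharacters, the sketch below runs a few patterns against four made-up lines. The sample text and variable names are assumptions for the example only.

```shell
# Hypothetical sample lines for illustration.
sample='abc
ABC
a1c
a-c'

# a.c : "a", then any single character, then "c"
dot=$(printf '%s\n' "$sample" | grep -c 'a.c')

# a[[:digit:]]c : the middle character must be a digit
digit=$(printf '%s\n' "$sample" | grep 'a[[:digit:]]c')

# a[^[:alnum:]]c : the middle character must NOT be a letter or a digit
negated=$(printf '%s\n' "$sample" | grep 'a[^[:alnum:]]c')

# [[:upper:]] : the line contains at least one uppercase letter
upper=$(printf '%s\n' "$sample" | grep '[[:upper:]]')

echo "$dot | $digit | $negated | $upper"
```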
Number of occurrences: placed after a character to specify how many times that character may occur;
*: any number of times (including zero);
\?: 0 or 1 times;
grep "x\?y"
\+: 1 or more times;
\{m\}: exactly m times;
\{m,n\}: at least m times, at most n times, i.e. the closed interval [m,n]
\{0,n\}: at most n times;
\{m,\}: at least m times;
.*: matches any characters of any length;
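The repetition operators are easier to compare side by side. This is a small sketch assuming GNU grep, where \? and \+ are accepted in basic regular expressions; the sample lines and variable names are invented for the example.

```shell
# Hypothetical sample: "y" preceded by zero to three "x".
sample='y
xy
xxy
xxxy'

# x\?y : zero or one "x" before "y"; every line contains such a match
optional=$(printf '%s\n' "$sample" | grep -c 'x\?y')

# x\+y : at least one "x" before "y"; the bare "y" line is excluded
plus=$(printf '%s\n' "$sample" | grep -c 'x\+y')

# x\{2\}y : exactly two "x" directly before "y" (shown with -o)
exact=$(printf 'xxxy\n' | grep -o 'x\{2\}y')

# x\{2,3\}y : two or three "x" before "y"
range=$(printf '%s\n' "$sample" | grep -o 'x\{2,3\}y')

echo "$optional $plus $exact"
echo "$range"
```

Note that a pattern is matched anywhere in the line unless it is anchored, which is why x\{2\}y still finds "xxy" inside "xxxy".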
Position anchoring:
^: anchors the beginning of the line; placed at the leftmost side of the pattern;
$: anchors the end of the line; placed at the rightmost side of the pattern;
\< or \b: anchors the beginning of a word; placed on the left side of the word pattern;
\> or \b: anchors the end of a word; placed on the right side of the word pattern;
^$: blank line;
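The anchors can be checked against a handful of made-up lines. The sketch assumes GNU grep (the word anchors \< and \> are GNU extensions); the sample text and variable names are hypothetical.

```shell
# Hypothetical sample; the third line is intentionally empty.
sample='bash
rebash

basher
/bin/bash'

# ^bash : lines that start with "bash"
start=$(printf '%s\n' "$sample" | grep -c '^bash')

# bash$ : lines that end with "bash"
end=$(printf '%s\n' "$sample" | grep -c 'bash$')

# \<bash\> : "bash" as a whole word, not as part of "rebash" or "basher"
word=$(printf '%s\n' "$sample" | grep '\<bash\>')

# ^$ : blank lines
blank=$(printf '%s\n' "$sample" | grep -c '^$')

echo "$start $end $blank"
echo "$word"
```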
Grouping: \(\)
\(pattern\), for example "\(ab\)*c"
Back reference: the text matched by the pattern inside grouping parentheses is recorded by the regular expression engine during execution and saved in built-in variables (\1, \2, ..., \n), which can then be referenced.
\1: the text matched by the group that starts at the first opening parenthesis, counting from left to right, and ends at its matching closing parenthesis
\2: the text matched by the group that starts at the second opening parenthesis, counting from left to right, and ends at its matching closing parenthesis
In other words, a back reference refers to the characters actually matched by an earlier group, not to the pattern itself;
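A back reference is easiest to see with a pattern where the same text has to appear twice. The sketch below is an illustration only; the four sample sentences and the variable names are made up.

```shell
# Hypothetical sample sentences.
sample='he loves his lover
he likes his lover
she likes her liker
she loves her liker'

# \(l..e\) records a four-letter string such as "love" or "like";
# \1r then requires that same string, followed by "r", later in the line.
bre=$(printf '%s\n' "$sample" | grep '\(l..e\).*\1r')

# The same pattern written as an extended regular expression.
ere=$(printf '%s\n' "$sample" | grep -E '(l..e).*\1r')

echo "$bre"
```

Only the first and third lines are printed: "love ... lover" and "like ... liker" repeat the recorded text, while "like ... lover" and "love ... liker" do not.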
Extended regular expressions:
The grep family has three commands:
grep: basic regular expressions
-E: use extended regular expressions
-F: regular expressions are not supported; the pattern is treated as a fixed string
egrep: extended regular expressions
fgrep: regular expressions are not supported
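The difference between the three modes shows up as soon as the pattern contains a character such as "+". The sketch below uses grep -E and grep -F in place of egrep and fgrep, which behave the same way; the sample lines and variable names are assumptions for the example.

```shell
# Hypothetical sample lines.
sample='a+b
aab'

# Basic regular expression: an unescaped "+" is an ordinary character
bre=$(printf '%s\n' "$sample" | grep 'a+b')

# Extended regular expression: "a+" means one or more "a"
ere=$(printf '%s\n' "$sample" | grep -E 'a+b')

# Fixed string: no character has a special meaning
fixed=$(printf '%s\n' "$sample" | grep -F 'a+b')

echo "$bre | $ere | $fixed"
```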
Metacharacters of extended regular expressions:
Character matching:
.: any single character
[]: any single character within the specified range
[^]: any single character outside the specified range
Number of matches:
*: any number of times
?: 0 or 1 times;
+: 1 or more times;
{m}: exactly m times;
{m,n}: at least m times, at most n times;
Anchoring:
^: anchors the beginning of the line
$: anchors the end of the line
\<, \b: same as in basic regular expressions
\>, \b: same as in basic regular expressions
Grouping: ()
Back reference: \1, \2, ...
Or:
a|b
C|cat: does not mean "Cat or cat"; it means "C or cat";
To match "Cat or cat", write (C|c)at
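The scope of "|" can be made visible with -o, which prints only the part of the line that matched. This is a small sketch with made-up sample lines and variable names.

```shell
# Hypothetical sample lines.
sample='C
cat
Cat'

# C|cat : either the single letter "C" or the string "cat"
loose=$(printf '%s\n' "$sample" | grep -E -o 'C|cat')

# (C|c)at : "C" or "c", followed by "at"
grouped=$(printf '%s\n' "$sample" | grep -E -o '(C|c)at')

echo "$loose"
echo "---"
echo "$grouped"
```

For the line "Cat", the first pattern matches only the leading "C", while the grouped pattern matches the whole word.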
Practice:
1. Display the lines in the /etc/passwd file that end with bash
# grep "bash$" /etc/passwd
root:x:0:0:root:/root:/bin/bash
amandabackup:x:33:6:amanda user:/var/lib/amanda:/bin/bash
postgres:x:26:26:postgresql server:/var/lib/pgsql:/bin/bash
canshan:x:500:500:canshan:/home/canshan:/bin/bash
oralce:x:3001:3001::/home/nicai:/bin/bash
duanshui:x:3002:3002::/home/duanshui:/bin/bash
centos:x:3003:3003::/home/centos:/bin/bash
user1:x:3004:3004::/home/user1:/bin/bash
2. Display the two-digit or three-digit numbers in the /etc/passwd file
# grep "[[:digit:]]\{2,3\}" passwd
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
gopher:x:13:30:gopher:/var/gopher:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:nobody:/:/sbin/nologin
dbus:x:81:81:system message bus:/:/sbin/nologin
usbmuxd:x:113:113:usbmuxd user:/:/sbin/nologin
rpc:x:32:32:rpcbind daemon:/var/cache/rpcbind:/sbin/nologin
abrt:x:173:173::/etc/abrt:/sbin/nologin
vcsa:x:69:69:virtual console memory owner:/dev:/sbin/nologin
rtkit:x:499:497:realtimekit:/proc:/sbin/nologin
avahi-autoipd:x:170:170:avahi ipv4ll stack:/var/lib/avahi-autoipd:/sbin/nologin
apache:x:48:48:apache:/var/www:/sbin/nologin
saslauth:x:498:76:saslauthd user:/var/empty/saslauth:/sbin/nologin
rpcuser:x:29:29:rpc service user:/var/lib/nfs:/sbin/nologin
nfsnobody:x:65534:65534:anonymous nfs user:/var/lib/nfs:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
ricci:x:140:140:ricci daemon user:/var/lib/ricci:/sbin/nologin
haldaemon:x:68:68:hal daemon:/:/sbin/nologin
gdm:x:42:42::/var/lib/gdm:/sbin/nologin
ntp:x:38:38::/etc/ntp:/sbin/nologin
tomcat:x:91:91:apache tomcat:/usr/share/tomcat6:/sbin/nologin
memcached:x:497:495:memcached daemon:/var/run/memcached:/sbin/nologin
amandabackup:x:33:6:amanda user:/var/lib/amanda:/bin/bash
pulse:x:496:494:pulseaudio system daemon:/var/run/pulse:/sbin/nologin
piranha:x:60:60::/etc/sysconfig/ha:/sbin/nologin
sshd:x:74:74:privilege-separated ssh:/var/empty/sshd:/sbin/nologin
postgres:x:26:26:postgresql server:/var/lib/pgsql:/bin/bash
luci:x:141:141:luci high availability management application:/var/lib/luci:/sbin/nologin
dovecot:x:97:97:dovecot imap server:/usr/libexec/dovecot:/sbin/nologin
dovenull:x:495:491:dovecot's unauthorized user:/usr/libexec/dovecot:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
canshan:x:500:500:canshan:/home/canshan:/bin/bash
oralce:x:3001:3001::/home/nicai:/bin/bash
duanshui:x:3002:3002::/home/duanshui:/bin/bash
centos:x:3003:3003::/home/centos:/bin/bash
user1:x:3004:3004::/home/user1:/bin/bash
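The pattern used in exercise 2 also selects lines whose only multi-digit numbers are longer, such as 3001 or 65534, because any run of four digits contains a run of three. One possible way to restrict the match to numbers that are exactly two or three digits long is to add word anchors. The sketch below assumes GNU grep and uses three lines taken from the passwd output shown in these notes; the variable names are made up.

```shell
# Three sample lines in /etc/passwd format.
sample='root:x:0:0:root:/root:/bin/bash
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
oralce:x:3001:3001::/home/nicai:/bin/bash'

# Without anchors: "300" inside "3001" also satisfies the pattern
loose=$(printf '%s\n' "$sample" | grep -c '[[:digit:]]\{2,3\}')

# With word anchors: the number must start and end at a word boundary
strict=$(printf '%s\n' "$sample" | grep -o '\<[[:digit:]]\{2,3\}\>')

echo "$loose | $strict"
```

Only "12" from the mail line survives the stricter pattern.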
3. Display the lines of the 'netstat -tan' output that end with 'LISTEN' followed by zero, one, or more whitespace characters
# netstat -tan | grep -e 'LISTEN[[:space:]]*$'
tcp        0      0 0.0.0.0:40355       0.0.0.0:*       LISTEN
tcp        0      0 0.0.0.0:111         0.0.0.0:*       LISTEN
tcp        0      0 0.0.0.0:22          0.0.0.0:*       LISTEN
tcp        0      0 127.0.0.1:631       0.0.0.0:*       LISTEN
tcp        0      0 127.0.0.1:25        0.0.0.0:*       LISTEN
tcp        0      0 127.0.0.1:6010      0.0.0.0:*       LISTEN
tcp        0      0 127.0.0.1:6011      0.0.0.0:*       LISTEN
tcp        0      0 :::111              :::*            LISTEN
tcp        0      0 :::22               :::*            LISTEN
tcp        0      0 :::44054            :::*            LISTEN
tcp        0      0 ::1:631             :::*            LISTEN
tcp        0      0 ::1:25              :::*            LISTEN
tcp        0      0 ::1:6010            :::*            LISTEN
tcp        0      0 ::1:6011            :::*            LISTEN
4. Add the users bash, testbash, basher, and nologin (the last one with /sbin/nologin as its shell), then find the lines in passwd where the user name is the same as the name of the default shell
# useradd bash
# useradd testbash
# useradd basher
# useradd -s /sbin/nologin nologin
# grep "^\(\<[[:alnum:]].*\>\).*\1$" passwd
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
bash:x:3005:3005::/home/bash:/bin/bash
nologin:x:3008:3008::/home/nologin:/sbin/nologin
5. Display the default shell and UID of the root, centos, or user1 user on the current system (create these users beforehand if they do not exist)
Note: I do not understand why egrep is needed here
(Screenshot 2-1.JPG: http://s3.51cto.com/wyfs02/M02/72/86/wKiom1XlZpXTUCOxAADaqQiP-5E976.jpg)
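One possible approach to exercise 5 is sketched below. Choosing between several user names needs the "or" operator, and "|" is a metacharacter of extended regular expressions, which is presumably why the exercise calls for egrep. The sketch runs on made-up sample lines in /etc/passwd format rather than on the real file, and the use of cut to pick the fields is an assumption, not necessarily what the screenshot shows.

```shell
# Hypothetical sample in /etc/passwd format.
passwd_sample='root:x:0:0:root:/root:/bin/bash
canshan:x:500:500:canshan:/home/canshan:/bin/bash
centos:x:3003:3003::/home/centos:/bin/bash
user1:x:3004:3004::/home/user1:/bin/bash'

# Select the three users by name, anchored at the start of the line and
# ended with ":" so that a name such as "user10" would not match.
# cut then keeps field 1 (name), field 3 (UID) and field 7 (shell).
result=$(printf '%s\n' "$passwd_sample" \
    | grep -E '^(root|centos|user1):' \
    | cut -d: -f1,3,7)

echo "$result"
```

On a real system the same pipeline would read /etc/passwd instead of the sample variable.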
6. Find the lines in the /etc/rc.d/init.d/functions file where a word (which may contain underscores) is followed by a pair of parentheses
(Screenshot 2-2.JPG: http://s3.51cto.com/wyfs02/M01/72/86/wKiom1XlZhOw826UAANyM58zbfA780.jpg)
Note: I am not sure whether the pattern for this exercise needs to be anchored at the beginning of the line.
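A rough sketch for exercise 6 follows. It uses a made-up fragment in place of /etc/rc.d/init.d/functions, and the function names in it are hypothetical. The pattern is left unanchored here; adding ^ would restrict the match to names that start in the first column, so whether to anchor depends on whether indented or inline occurrences should count.

```shell
# Hypothetical shell fragment standing in for the functions file.
functions_sample='start_service() {
    echo "starting"
}
stop() {
    echo "stopping"
}'

# One or more letters or underscores, directly followed by "()".
# In an extended regular expression the parentheses must be escaped
# to be taken literally.
names=$(printf '%s\n' "$functions_sample" | grep -E -o '[[:alpha:]_]+\(\)')

echo "$names"
```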
7. Use echo to output a path, then use egrep to extract its base name; further, use egrep to extract its directory name
(Screenshot 2-3.JPG: http://s3.51cto.com/wyfs02/M00/72/83/wKioL1XlcB-AbFAFAADKJhi4-pw173.jpg)
(Screenshot 2-4.JPG: http://s3.51cto.com/wyfs02/M01/72/87/wKiom1XlbzrxfySDAACQYhNr6WY056.jpg)
Note: I was not entirely clear on this one and referred to other classmates' answers
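The following is one way exercise 7 could be approached, not necessarily the one in the screenshots. It assumes a path without a trailing slash; the path value and variable names are chosen for illustration.

```shell
# Example path (assumed to have no trailing slash).
path='/etc/rc.d/init.d/functions'

# Base name: the longest run of non-slash characters at the end of the line
base=$(echo "$path" | grep -E -o '[^/]+$')

# Directory part: everything from the start up to and including the last "/"
dir=$(echo "$path" | grep -E -o '^.*/')

echo "$base"
echo "$dir"
```

Because .* is greedy, ^.*/ runs up to the last slash, so the directory part keeps its trailing "/".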
10. Find the numbers between 1 and 255 in the output of the ifconfig command
(Screenshot 2-5.JPG: http://s3.51cto.com/wyfs02/M02/72/84/wKioL1Xldo3Ry2M9AAHPPpwXnso654.jpg)
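A range such as 1-255 cannot be written as a single bracket expression, so it is usually split by the number of digits. The sketch below is one such split, assuming GNU grep for the \< and \> word anchors; the sample line only imitates ifconfig output and is made up.

```shell
# Hypothetical line imitating part of ifconfig output.
ifconfig_sample='inet addr:192.168.1.10 Mask:255.255.255.0 mtu:1500 errors:0 frame:256'

# Alternatives, from left to right:
#   [1-9]          1-9
#   [1-9][0-9]     10-99
#   1[0-9][0-9]    100-199
#   2[0-4][0-9]    200-249
#   25[0-5]        250-255
# The word anchors keep the pattern from matching part of a longer number.
numbers=$(printf '%s\n' "$ifconfig_sample" \
    | grep -E -o '\<([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\>')

# Join the matches onto one line for display.
numbers_flat=$(echo $numbers)
echo "$numbers_flat"
```

The values 0, 256, and 1500 are left out because none of the alternatives can cover them as whole words.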
grep and regular expressions