The Three Musketeers of Shell Regular Expressions: awk


awk command

awk, like sed, is a stream editor that works on a document one line at a time. awk is more powerful than sed and can do everything sed can do. The tool is actually quite complex, and whole books are devoted to describing its use.


1 awk Command Form

awk [-F|-f|-v] 'BEGIN{} /pattern/ {command1; command2} END{}' file

[-F|-f|-v] Options: -F specifies the field delimiter, -f runs a script file, -v defines a variable (var=value)

' ' Quotes around the code block

BEGIN Initialization block, mainly for setting global variables and the FS delimiter

// Pattern to match, which can be a string or a regular expression

{} Command block, containing one or more commands

; Multiple commands are separated by semicolons

END Final block, mainly for final calculations or for printing summary information
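
The command form above can be sketched with a small self-contained run. The sample records and the variable names `sample` and `result` are invented for illustration and are not part of the original article.

```shell
# Hypothetical sample data: three colon-separated records.
sample='alice:10
bob:20
carol:30'

# BEGIN runs once before any input is read, the /pattern/ {action} pair
# runs for every line that matches, and END runs once after the last line.
result=$(printf '%s\n' "$sample" | awk -F ':' '
BEGIN { print "start" }
/o/   { count++; print $1 }
END   { print "matched", count }
')

printf '%s\n' "$result"
```

Only "bob" and "carol" contain the letter o, so the action runs twice and END reports that count.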


2 Special Variables and Operators

$0 The whole current line

$1 The first field of each line

NF Number of fields in the current line

NR Record (line) number; keeps increasing across multiple files

FNR Like NR, but it restarts from 1 for each file instead of increasing across files

\t Tab

\n Line break

FS Input field separator, usually set in BEGIN

RS Input record separator, a newline by default (that is, the text is read one line at a time)

~ Matches; a pattern comparison, looser than ==

!~ Does not match; also a pattern comparison

== Equals; the whole value must be equal (exact comparison)

!= Does not equal (exact comparison)

&& Logical AND

|| Logical OR

+ In a pattern, matches one or more of the preceding item

/[0-9][0-9]+/ Two or more digits

/[0-9][0-9]*/ One or more digits

FILENAME Name of the current file

OFS Output field separator, a space by default; it can be changed to a tab, etc.

ORS Output record separator, a newline by default, so each result is printed on its own line

-F '[:#/]' Defines three separators (:, # and /)
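
A few of the variables listed above can be seen together in a short sketch. The two temporary files and the names `nr_fnr` and `two_digits` are assumptions made only for this illustration.

```shell
# Two throwaway files created only for this illustration.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\nd\n' > "$tmpdir/two.txt"

# NR keeps counting across files, while FNR restarts at 1 for each file.
# Setting OFS in BEGIN changes the separator that print puts between items.
nr_fnr=$(cd "$tmpdir" && awk 'BEGIN { OFS = "\t" } { print FILENAME, NR, FNR, $0 }' one.txt two.txt)
printf '%s\n' "$nr_fnr"

# /[0-9][0-9]+/ needs at least two consecutive digits, so "x7" is skipped.
two_digits=$(printf 'x7\ny42\nz123\n' | awk '/[0-9][0-9]+/')
printf '%s\n' "$two_digits"

rm -rf "$tmpdir"
```

For the second file the NR column continues with 3 and 4 while the FNR column starts again at 1.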


3 Examples Explained


Extract a field from a document

[root@localhost ~]# head -n2 /etc/passwd | awk -F ':' '{print $1}'
root
bin

To explain: the -F option specifies the delimiter; if -F is not given, fields are separated by spaces or tabs. print is the action that prints a field. $1 is the first field, $2 is the second field, and so on; the whole line is represented by $0.

[root@localhost ~]# head -n2 test.txt | awk -F ':' '{print $0}'
rto:x:0:0:/rto:/bin/bash
operator:x:11:0:operator:/roto:/sbin/nologin

Note the format of awk: -F is followed by the delimiter in single quotes, and the print action must be enclosed in {}, otherwise awk reports an error. print can also print custom text, but the custom text must be enclosed in double quotation marks.

[root@localhost ~]# head -n2 test.txt | awk -F ':' '{print $1"#"$2"#"$3"#"$4}'
rto#x#0#0
operator#x#11#0


Match character or string

[root@localhost ~]# awk '/oo/' test.txt
operator:x:11:0:operator:/rooto:/sbin/nologin
roooto:x:0:0:/rooooto:/bin/bash

[root@localhost ~]# awk -F ':' '$1 ~ /oo/' test.txt
roooto:x:0:0:/rooooto:/bin/bash

You can also match against a single field; here '~' means "matches".

[root@localhost ~]# awk -F ':' '/root/ {print $1,$3} /test/ {print $1,$3}' /etc/passwd
root 0
operator 11
test 511
test1 512

awk can also match several patterns in one run. The example above first matches root and then matches test, and it prints only the chosen fields of the matching lines.
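
A minimal sketch of how several pattern-action pairs behave, using invented passwd-style records (the data and the variable names `records` and `multi` are assumptions for illustration):

```shell
# Hypothetical passwd-style records, invented for this sketch.
records='root:x:0:0
test:x:511:511
roottest:x:600:600'

# Each pattern-action pair is checked against every line independently,
# so "roottest" matches both patterns and is printed twice.
multi=$(printf '%s\n' "$records" | awk -F ':' '/root/ {print $1, $3} /test/ {print $1, $3}')
printf '%s\n' "$multi"
```

The patterns are tested against the whole line, which is also why the operator line appears in the /etc/passwd output above: its home directory contains "root".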


Conditional operator

[root@localhost ~]# awk -F ':' '$3=="0"' /etc/passwd
root:x:0:0:root:/root:/bin/bash

awk can test conditions with comparison operators. '==' means "equals" and can be understood as an exact match; there are also '>', '>=', '<', '<=' and '!='. Note that when comparing with a number, if the number is written in double quotes, awk treats it as a string rather than a number; without the double quotes it is treated as a number.

[root@localhost ~]# awk -F ':' '$3>="500"' /etc/passwd
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
vcsa:x:69:69:virtual console memory owner:/dev:/sbin/nologin
haldaemon:x:68:68:HAL daemon:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
user11:x:510:502:user11,user11's office,12345678,123456789:/home/user11:/sbin/nologin
test:x:511:511::/home/test:/bin/bash
test1:x:512:511::/home/test1:/bin/bash

The example above was meant to print the lines whose UID is greater than or equal to 500, but the result is not what we expected: because "500" is in double quotes, awk compares $3 with it as a string, character by character, so "6" and "99" count as greater than "500". Writing the number without quotes ($3>=500) gives a numeric comparison.
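
The difference between the quoted and the unquoted comparison can be checked on a few invented records. This is a minimal sketch; the sample data and the names `uids`, `as_string` and `as_number` are assumptions for illustration.

```shell
# Hypothetical records: name, password placeholder, UID.
uids='root:x:0
halt:x:7
nobody:x:99
test:x:511'

# Quoted "500" is a string constant, so $3 is compared character by character:
# "7" and "99" sort after "500" because their first character is greater than "5".
as_string=$(printf '%s\n' "$uids" | awk -F ':' '$3 >= "500" {print $1}')

# Unquoted 500 is a number, so the comparison is numeric.
as_number=$(printf '%s\n' "$uids" | awk -F ':' '$3 >= 500 {print $1}')

printf 'string comparison:\n%s\n' "$as_string"
printf 'numeric comparison:\n%s\n' "$as_number"
```

The string form selects halt, nobody and test, while the numeric form selects only test.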

[root@localhost ~]# awk -F ':' '$7!="/sbin/nologin"' /etc/passwd
root:x:0:0:root:/root:/bin/bash
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
test:x:511:511::/home/test:/bin/bash
test1:x:512:511::/home/test1:/bin/bash

!= means "not equal". Besides comparing a field with a string, you can also compare two fields with each other.

[root@localhost ~]# awk -F ':' '$3<$4' /etc/passwd
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
gopher:x:13:30:gopher:/var/gopher:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin

In addition, you can use && and || to mean "and" and "or".

[root@localhost ~]# awk -F ':' '$3>"5" && $3<"7"' /etc/passwd
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
vcsa:x:69:69:virtual console memory owner:/dev:/sbin/nologin
haldaemon:x:68:68:HAL daemon:/:/sbin/nologin
user11:x:510:502:user11,user11's office,12345678,123456789:/home/user11:/sbin/nologin
test:x:511:511::/home/test:/bin/bash
test1:x:512:511::/home/test1:/bin/bash

[root@localhost ~]# awk -F ':' '$3>"5" || $7=="/bin/bash"' /etc/passwd
root:x:0:0:root:/root:/bin/bash
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
vcsa:x:69:69:virtual console memory owner:/dev:/sbin/nologin
haldaemon:x:68:68:HAL daemon:/:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
user11:x:510:502:user11,user11's office,12345678,123456789:/home/user11:/sbin/nologin
test:x:511:511::/home/test:/bin/bash
test1:x:512:511::/home/test1:/bin/bash


Variables commonly used by awk

NF: the number of fields the line is split into by the separator

NR: the line number

[root@localhost ~]# head -n3 /etc/passwd | awk -F ':' '{print NF}'
7
7
7
[root@localhost ~]# head -n3 /etc/passwd | awk -F ':' '{print $NF}'
/bin/bash
/sbin/nologin
/sbin/nologin

NF is the number of fields, while $NF is the value of the last field, and NR is the line number.
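
A short sketch that prints these variables side by side for lines with different field counts. The sample lines and the names `lines` and `summary` are invented for illustration.

```shell
# Hypothetical input with 3, 2 and 4 colon-separated fields.
lines='a:b:c
d:e
f:g:h:i'

# NR is the line number, NF the field count, $NF the last field,
# and $(NF-1) the field just before the last one.
summary=$(printf '%s\n' "$lines" | awk -F ':' '{print NR, NF, $NF, $(NF-1)}')
printf '%s\n' "$summary"
```

Because NF is recomputed for every line, $NF always refers to the last field even when the lines have different lengths.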

[root@localhost ~]# head -n3 /etc/passwd | awk -F ':' '{print NR}'
1
2
3

We can use line numbers as a criterion:

[root@localhost ~]# awk 'NR>20' /etc/passwd
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
abrt:x:173:173::/etc/abrt:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
user11:x:510:502:user11,user11's office,12345678,123456789:/home/user11:/sbin/nologin
test:x:511:511::/home/test:/bin/bash
test1:x:512:511::/home/test1:/bin/bash

It can also be combined with field matching:

[root@localhost ~]# awk -F ':' 'NR>20 && $1 ~ /ssh/' /etc/passwd
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin


awk can change the value of a field

[root@localhost ~]# head -n 3 /etc/passwd | awk -F ':' '$1="root"'
root x 0 0 root /root /bin/bash
root x 1 1 bin /bin /sbin/nologin
root x 2 2 daemon /sbin /sbin/nologin
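
The colons disappear in the output above because assigning to a field makes awk rebuild the line using OFS, which is a space by default. A minimal sketch, assuming an invented input line and the variable names `line`, `default_ofs` and `colon_ofs`:

```shell
# One hypothetical passwd-style line.
line='bin:x:1:1:bin:/bin:/sbin/nologin'

# Default OFS is a space, so the rebuilt record loses its colons.
default_ofs=$(printf '%s\n' "$line" | awk -F ':' '{$1 = "root"; print}')

# Setting OFS to ":" in BEGIN keeps the original delimiter in the output.
colon_ofs=$(printf '%s\n' "$line" | awk -F ':' 'BEGIN {OFS = ":"} {$1 = "root"; print}')

printf '%s\n%s\n' "$default_ofs" "$colon_ofs"
```

Setting OFS to match the input delimiter is one way to edit a field while keeping the file format intact.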

awk can also perform arithmetic on the values of individual fields

[root@localhost ~]# head -n2 /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
[root@localhost ~]# head -n2 /etc/passwd | awk -F ':' '{$7=$3+$4}'
[root@localhost ~]# head -n2 /etc/passwd | awk -F ':' '{$7=$3+$4; print $0}'
root x 0 0 root /root 0
bin x 1 1 bin /bin 2

Of course, you can also calculate the sum of a field

[root@localhost ~]# awk -F ':' '{(tot=tot+$3)}; END {print tot}' /etc/passwd
2891

Note the END here: its block runs only after all lines have been processed. An if test can also be used inside an action:

[root@localhost ~]# awk -F ':' '{if ($1=="root") print $0}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
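
The two techniques above can be tried on a small invented data set. This is only a sketch; the records and the names `accounts`, `with_if`, `with_pattern` and `total` are assumptions for illustration.

```shell
# Hypothetical records: name, password placeholder, UID.
accounts='root:x:0
bin:x:1
daemon:x:2'

# An if test inside the action and a bare pattern select the same lines.
with_if=$(printf '%s\n' "$accounts" | awk -F ':' '{ if ($1 == "root") print $0 }')
with_pattern=$(printf '%s\n' "$accounts" | awk -F ':' '$1 == "root"')

# tot accumulates on every line; END prints it once after the last line.
total=$(printf '%s\n' "$accounts" | awk -F ':' '{ tot += $3 } END { print tot }')

printf '%s\n%s\n%s\n' "$with_if" "$with_pattern" "$total"
```

The pattern form is shorter for simple filters, while if is useful when the action needs several branches.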


Daily application

Application 1

awk -F: '{print NF}' helloworld.sh    # output the number of fields on each line of the file

awk -F: '{print $1,$2,$3,$4,$5}' helloworld.sh    # output the first 5 fields

awk -F: '{print $1,$2,$3,$4,$5}' OFS='\t' helloworld.sh    # output the first 5 fields, separated by tabs

awk -F: '{print NR,$1,$2,$3,$4,$5}' OFS='\t' helloworld.sh    # output the line number and the first 5 fields, separated by tabs

Application 2

awk -F '[:#]' '{print NF}' helloworld.sh    # specify two separators, : and #, and output the number of fields on each line

awk -F '[:#]' '{print $1,$2,$3,$4,$5,$6,$7}' OFS='\t' helloworld.sh    # output multiple fields, separated by tabs

Application 3

awk -F '[:#/]' '{print NF}' helloworld.sh    # specify three separators and output the number of fields on each line

awk -F '[:#/]' '{print $1,$2,$3,$4,$5,$6}' helloworld.sh    # output multiple fields (space-separated by default; add OFS='\t' for tabs)
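
How the separator set changes the field count can be sketched on a single invented line (the text in `entry` and the other variable names are assumptions, not the contents of helloworld.sh):

```shell
# One hypothetical line containing all three separator characters.
entry='user:name#home/dir'

# The more characters the bracket expression lists, the more fields result.
by_colon=$(printf '%s\n' "$entry" | awk -F ':' '{print NF}')
by_two=$(printf '%s\n' "$entry" | awk -F '[:#]' '{print NF}')
by_three=$(printf '%s\n' "$entry" | awk -F '[:#/]' '{print NF}')

# OFS given as a command-line assignment applies before the input is read.
fields=$(printf '%s\n' "$entry" | awk -F '[:#/]' '{print $1, $2, $3, $4}' OFS='\t')

printf '%s %s %s\n%s\n' "$by_colon" "$by_two" "$by_three" "$fields"
```

The field counts come out as 2, 3 and 4, and the last command prints the four fields separated by tabs.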

Application 4

Calculate the total size of the regular files in the /home directory, in KB; int() truncates the result to an integer

ls -l | awk 'BEGIN{sum=0} !/^d/ {sum+=$5} END{print "Total size is:", sum/1024, "KB"}'

ls -l | awk 'BEGIN{sum=0} !/^d/ {sum+=$5} END{print "Total size is:", int(sum/1024), "KB"}'

Application 5

Count the number of connections reported by netstat -anp whose state is LISTEN or CONNECTED

netstat -anp | awk '$6~/LISTEN|CONNECTED/ {sum[$6]++} END{for (i in sum) printf "%-10s %-6s %-3s \n", i, " ", sum[i]}'
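
The counting idiom used here (an array indexed by the state) can be tried without a live netstat. The lines below are invented and only shaped like netstat output; `conns` and `counts` are hypothetical names.

```shell
# Invented lines shaped like netstat output; column 6 holds the state.
conns='tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 10.0.0.1:22 10.0.0.2:5000 ESTABLISHED
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN
unix 3 [ ] STREAM CONNECTED 12345'

# sum[$6]++ uses the state as an array key and counts the lines per key.
# for (i in sum) visits keys in an unspecified order, so the result is sorted.
counts=$(printf '%s\n' "$conns" | awk '
$6 ~ /LISTEN|CONNECTED/ { sum[$6]++ }
END { for (i in sum) printf "%-10s %d\n", i, sum[i] }
' | sort)

printf '%s\n' "$counts"
```

The ESTABLISHED line is ignored because it does not match the pattern; the other states are counted once per line.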

Application 6

Count the number of regular files owned by each user in the /home directory

ls -l | awk 'NR!=1 && !/^d/ {sum[$3]++} END{for (i in sum) printf "%-6s %-5s %-3s \n", i, " ", sum[i]}'

mysql 199
root 568

Calculate the total size of the regular files owned by each user in the /home directory

ls -l | awk 'NR!=1 && !/^d/ {sum[$3]+=$5} END{for (i in sum) printf "%-6s %-5s %-3s %-2s \n", i, " ", sum[i]/1024/1024, "MB"}'
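
Both per-user statistics can be sketched in one pass over invented `ls -l`-style lines. The listing, the owners, the sizes and the names `listing` and `per_user` are assumptions for illustration only.

```shell
# Invented lines shaped like `ls -l` output: column 3 is the owner,
# column 5 is the size in bytes. The first line is the "total" header.
listing='total 12
-rw-r--r-- 1 root  root  1048576 Jan  1 00:00 a.log
drwxr-xr-x 2 root  root     4096 Jan  1 00:00 dir
-rw-r--r-- 1 mysql mysql 2097152 Jan  1 00:00 b.db
-rw-r--r-- 1 root  root  3145728 Jan  1 00:00 c.log'

# NR != 1 skips the header and !/^d/ skips directories; two arrays keyed by
# owner collect the file count and the byte total.
per_user=$(printf '%s\n' "$listing" | awk '
NR != 1 && !/^d/ { files[$3]++; bytes[$3] += $5 }
END { for (u in files) printf "%s %d %d MB\n", u, files[u], bytes[u] / 1024 / 1024 }
' | sort)

printf '%s\n' "$per_user"
```

Each output line shows the owner, the number of regular files and their combined size in MB.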




This article is from the "Practical Linux Knowledge and Skills Sharing" blog; please keep this source link: http://superleedo.blog.51cto.com/12164670/1888014
