Introduction
Awk is a powerful text analysis tool. Where grep is for searching and sed is for editing, awk is especially strong at analyzing data and generating reports. Put simply, awk reads a file line by line, splits each line into fields using whitespace as the default delimiter, and then processes the resulting fields.
There are three versions of awk: awk, nawk, and gawk. Unless stated otherwise, awk here generally refers to gawk, the GNU version of awk.
Awk takes its name from the first letters of the surnames of its creators, Alfred Aho, Peter Weinberger, and Brian Kernighan. Awk really is a language of its own: the AWK programming language, which its three creators formally defined as a "pattern scanning and processing language." It lets you write short programs that read input files, sort data, manipulate data, perform calculations on input, and generate reports, among many other things.
Awk is efficient, its code is concise, and it is very strong at processing formatted text. Awk can do almost everything grep and sed can do, and often does it better.
How to use
awk '{pattern + action}' {filenames}
Although the operations can be complex, the syntax is always the same: pattern is what awk looks for in the data, and action is a series of commands executed when a match is found. Curly braces ({}) do not always have to appear in a program, but they are used to group a series of instructions under a particular pattern. A pattern is typically a regular expression, enclosed in slashes.
The most basic function of the awk language is to browse and extract information from a file or string based on specified rules. Once awk has extracted the information, further text operations can be performed on it. A complete awk script is typically used to format the information in a text file.
Typically, awk processes a file one line at a time. For each line it receives, awk executes the corresponding commands.
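As a minimal sketch of the pattern + action form described above, the following runs awk on three made-up lines instead of a real file. The sample text and the variable name matched are assumptions for illustration only.

```shell
# Hypothetical sample input: three lines of made-up text.
sample='alpha 1
beta 2
alphabet 3'

# The pattern /alpha/ selects lines containing "alpha";
# the action {print $2} prints the second field of each selected line.
matched=$(printf '%s\n' "$sample" | awk '/alpha/ {print $2}')

printf '%s\n' "$matched"
```

The line "beta 2" does not match the pattern, so its action never runs and only the second fields of the other two lines are printed.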
Getting started example
Suppose the output of last -n 5 is as follows:
# last -n 5 <== show only the first five lines
root pts/1 192.168.1.100 Tue Feb 10 11:21 still logged in
root pts/1 192.168.1.100 Tue Feb 10 00:46 - 02:28 (01:41)
root pts/1 192.168.1.100 Mon Feb 9 11:41 - 18:30 (06:48)
dmtsai pts/1 192.168.1.100 Mon Feb 9 11:41 - 11:41 (00:00)
root tty1 Fri Sep 5 14:09 - 14:10 (00:01)
To show only the accounts of the five most recent logins:
# last -n 5 | awk '{print $1}'
The awk workflow is: read one record, terminated by a '\n' newline, then split the record into fields by the specified field delimiter. $0 represents the whole record (all fields), $1 the first field, and $n the nth field. The default field delimiter is a space or a tab, so $1 is the login user, $3 is the login user's IP, and so on.
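The default field splitting described above can be sketched as follows. The record is a made-up line that mixes a tab and several spaces between fields, and the variable name user_and_ip is hypothetical.

```shell
# Hypothetical record: a tab after the user name, three spaces before the IP.
user_and_ip=$(printf 'root\tpts/1   192.168.1.100\n' | awk '{print $1 " " $3}')

# With the default delimiter, any run of spaces and tabs separates fields,
# so $1 is the user and $3 is the IP.
printf '%s\n' "$user_and_ip"
```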
To show only the accounts in /etc/passwd:
# cat /etc/passwd | awk -F ':' '{print $1}'
root
daemon
bin
sys
This is an awk + action example: the action {print $1} is executed for every line.
-F specifies the field delimiter, here ':'.
To show only the accounts in /etc/passwd and each account's shell, separated by a tab:
# cat /etc/passwd | awk -F ':' '{print $1"\t"$7}'
root	/bin/bash
daemon	/bin/sh
bin	/bin/sh
sys	/bin/sh
To show only the accounts in /etc/passwd and each account's shell, separated by a comma, with the column header name,shell added above all rows and "blue,/bin/nosh" added as the last line:
cat /etc/passwd | awk -F ':' 'BEGIN {print "name,shell"} {print $1","$7} END {print "blue,/bin/nosh"}'
name,shell
root,/bin/bash
daemon,/bin/sh
bin,/bin/sh
sys,/bin/sh
....
blue,/bin/nosh
The awk workflow here is: first execute BEGIN, then read the file. Read one record, terminated by a \n newline, split the record into fields by the specified field delimiter ($0 represents all fields, $1 the first field, $n the nth field), and then execute the action that corresponds to the pattern. Then read the second record, and so on, until all records have been read. Finally, execute the END block.
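The BEGIN, per-record, END order can be sketched on a small sample. The two passwd-style lines, the counter n, and the variable name report are assumptions made so the example does not depend on the contents of a real /etc/passwd.

```shell
# Hypothetical passwd-style sample data.
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh'

# BEGIN runs once before any input is read, the middle block runs once
# per record, and END runs once after the last record. n counts records.
report=$(printf '%s\n' "$sample" | awk -F ':' '
BEGIN { print "name,shell" }
      { n++; print $1 "," $7 }
END   { print "records: " n }')

printf '%s\n' "$report"
```

The header line appears first and the record count last, whatever the input contains in between.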
Search /etc/passwd for all lines containing the keyword root:
# awk -F: '/root/' /etc/passwd
root:x:0:0:root:/root:/bin/bash
This is an example of using a pattern: only lines that match the pattern (here, root) execute the action. No action is specified, so the default is to print each matching line.
The search supports regular expressions. For example, to find lines that begin with root: awk -F: '/^root/' /etc/passwd
Search /etc/passwd for all lines containing the keyword root and show the corresponding shell:
# awk -F: '/root/{print $7}' /etc/passwd
/bin/bash
Here the action {print $7} is specified.
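The difference between an unanchored pattern and one anchored with ^ can be sketched on sample data. The three passwd-style lines and the variable names anywhere and at_start are hypothetical.

```shell
# Hypothetical passwd-style sample; "operator" has /root as its home directory.
sample='root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
daemon:x:1:1:daemon:/usr/sbin:/bin/sh'

# /root/ matches the text anywhere in the line, so two lines qualify.
anywhere=$(printf '%s\n' "$sample" | awk -F: '/root/ {print $1}')

# /^root/ matches only at the start of the line.
at_start=$(printf '%s\n' "$sample" | awk -F: '/^root/ {print $1}')

printf '%s\n' "$anywhere" "$at_start"
```

Here /root/ selects both root and operator, while /^root/ selects only root.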
Log statistics
Find the IPs that call a given address most often in the Tomcat access log, and list the top 100 in descending order:
cat access_log.txt | grep '/loan/show-loan-detial-loanid-4948914564' | awk '{a[$1]+=1;} END {for (i in a) {print a[i] " " i;}}' | sort -nr | awk '{print $2}' | head -n 100
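The same count, sort, and trim pipeline can be sketched on a few made-up log lines. The log format (client IP as the first field), the path /loan/show, and the variable name top_ips are assumptions for illustration, not the real Tomcat log layout.

```shell
# Hypothetical access-log lines; the first field is assumed to be the client IP.
log='10.0.0.1 GET /loan/show
10.0.0.2 GET /loan/show
10.0.0.1 GET /loan/show
10.0.0.3 GET /other
10.0.0.1 GET /loan/show
10.0.0.2 GET /loan/show'

# 1. keep only requests for the path of interest
# 2. count requests per IP in an awk array, print "count ip"
# 3. sort numerically, largest count first
# 4. drop the count and keep the two busiest IPs
top_ips=$(printf '%s\n' "$log" |
  grep '/loan/show' |
  awk '{a[$1] += 1} END {for (i in a) print a[i] " " i}' |
  sort -nr |
  awk '{print $2}' |
  head -n 2)

printf '%s\n' "$top_ips"
```

The order in which for (i in a) visits the array is unspecified, which is why the sort step is needed before taking the top entries.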
Count the occurrences of each IP in the first column of a file:
cat test
123.122.123.12 12121212
121.2332.121.11 232323
255.255.255.255 21321
123.122.123.12 12121212
123.122.123.12 1212121er2
123.122.123.12 12121212eer
123.122.123.12 12121212ere
255.255.255.255 21321
121.2332.121.11 232323
255.255.255.255 21321
Command
awk '{name[$1]++}; END {for (count in name) print count, name[count]}' test | sort
Output:
121.2332.121.11 2
123.122.123.12 5
255.255.255.255 3
Sort by column two in descending order
awk '{name[$1]++}; END {for (count in name) print count, name[count]}' test | sort -k 2 -rn
Output:
123.122.123.12 5
255.255.255.255 3
121.2332.121.11 2
Note: -k specifies the sort key column.
-r sorts in descending order.
-n sorts the field by arithmetic value. A numeric field may contain leading spaces, an optional minus sign, decimal digits, thousands separators, and an optional radix character (decimal point). Numerically sorting a field that contains non-numeric characters can give unpredictable results.
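Why -n matters can be sketched by sorting the same made-up counts with and without it. The sample data and the variable names numeric and lexical are hypothetical; LC_ALL=C is set on the second command only to make the character-by-character comparison predictable.

```shell
# Hypothetical "name count" pairs.
counts='a 9
b 10
c 2'

# -k 2 sorts on the second column, -n compares it as a number, -r reverses.
numeric=$(printf '%s\n' "$counts" | sort -k 2 -rn)

# Without -n the column is compared as text, so "10" sorts below "2" and "9".
lexical=$(printf '%s\n' "$counts" | LC_ALL=C sort -k 2 -r)

printf '%s\n' "$numeric"
printf '%s\n' "$lexical"
```

The numeric sort puts 10 first, while the text sort puts it last.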
Alternatively:
awk '{print $1}' test | sort | uniq -c
Output:
2 121.2332.121.11
5 123.122.123.12
3 255.255.255.255
To put the IP first:
awk '{print $1}' test | sort | uniq -c | awk '{print $2,$1}'
Output:
121.2332.121.11 2
123.122.123.12 5
255.255.255.255 3
When the log has the following format:
14:09:47,812directgetserveripshdjshd.mp4| from:123.111.176.187| ipid:6| serviceid:6| roomid:6| Type:1|return:1
Count the number of occurrences of each IP:
awk -F "[:|]" '{name[$5]++}; END {for (count in name) print count, name[count]}' testd | sort -k 2 -rn
Output:
123.233.176.133 2
111.234.136.134 2
123.111.176.183 1
Defining multiple separators in awk
The awk command-line option -F accepts a bracket expression that lists several separators. For example, -F "[:|]" tells awk that both | and : are field separators, and -F "[:/]" does the same for : and /.
# cat Te
weblogic:x:502:600:home/weblogic:bin/bash
# cat Te | awk -F "[:/]" '{print $7}'
bin
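Splitting on more than one separator can be sketched with the log line format shown earlier. The variable names line and ip are hypothetical.

```shell
# One log line in the format shown above.
line='14:09:47,812directgetserveripshdjshd.mp4| from:123.111.176.187| ipid:6| serviceid:6| roomid:6| Type:1|return:1'

# With -F '[:|]' every ":" and every "|" ends a field, so the fields are
# "14", "09", "47,812...mp4", " from", and then the IP as field 5.
ip=$(printf '%s\n' "$line" | awk -F '[:|]' '{print $5}')

printf '%s\n' "$ip"
```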
This article is from the "Good Big Knife" blog, please make sure to keep this source http://53cto.blog.51cto.com/9899631/1758618