Awk Daily Instances

Source: Internet
Author: User


Awk is a powerful text analysis tool. Compared with the searching of grep and the editing of sed, awk is especially strong at analyzing data and generating reports. Put simply, awk reads a file line by line, splits each line into fields using whitespace as the default delimiter, and then performs various kinds of analysis on the resulting fields.

Awk has three different versions: awk, nawk, and gawk. Unless stated otherwise, awk generally refers to gawk, the GNU version of awk.

Awk takes its name from the first letters of the surnames of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. Awk is in fact a language of its own, the AWK programming language, which its three creators formally defined as a "pattern scanning and processing language." It lets you write short programs that read input files, sort data, manipulate data, perform calculations on the input, and generate reports, among many other things.

Awk is efficient, its code is concise, and it is very good at processing formatted text. Basically, awk can do all the work that grep and sed can do, and often does it better.

How to use
awk '{pattern + action}' {filenames}

Although the operations can be complex, the syntax is always the same: pattern describes what awk looks for in the data, and action is a series of commands that are executed when a match is found. Curly braces ({}) do not always need to appear in a program, but they are used to group a series of instructions under a particular pattern. A pattern is typically a regular expression, enclosed in slashes.
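The three rule forms described above (pattern only, action only, pattern plus action) can be sketched as follows. This is a minimal illustration, not part of the original examples: the sample data and the shell variable names are made up.

```shell
# Hypothetical sample input: three lines of "name score".
data='alice 90
bob 72
carol 85'

# Pattern only: matching lines are printed unchanged (the default action).
pattern_only=$(printf '%s\n' "$data" | awk '/bob/')

# Action only: the action runs for every input line.
action_only=$(printf '%s\n' "$data" | awk '{print $1}')

# Pattern and action: the action runs only for lines that match.
both=$(printf '%s\n' "$data" | awk '/^a/ {print $2}')

printf '%s\n' "$pattern_only" "$action_only" "$both"
```

The first command prints the whole "bob" line, the second prints every name, and the third prints only the score on the line that starts with "a".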

The most basic function of the awk language is to browse and extract information from a file or string according to specified rules. Once awk has extracted the information, further text operations can be carried out. A complete awk script is typically used to format the information in a text file.

Typically, awk processes a file one line at a time: it receives each line of the file and then executes the corresponding commands to process the text.

Getting started examples

Suppose the output of last -n 5 is as follows

# last -n 5    <== show only the first five lines
root     pts/1                   Tue Feb 10 11:21   still logged in
root     pts/1                   Tue Feb 10 00:46 - 02:28  (01:41)
root     pts/1                   Mon Feb  9 11:41 - 18:30  (06:48)
dmtsai   pts/1                   Mon Feb  9 11:41 - 11:41  (00:00)
root     tty1                    Fri Sep  5 14:09 - 14:10  (00:01)

To show only the accounts of the 5 most recent logins

# last -n 5 | awk '{print $1}'

The awk workflow is as follows: read one record, delimited by the newline character '\n'; split the record into fields using the specified field separator; and fill in the fields. $0 represents the whole record, $1 the first field, and $n the nth field. The default field separator is a space or a tab, so $1 is the login user, $3 is the login user's IP address (the host column that last normally prints), and so on.
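The field variables described above can be tried on a single record. This is a sketch under stated assumptions: the record is a made-up line in the style of last output, the address in it is invented, and NF (awk's built-in field count) is an addition not mentioned in the text.

```shell
# Hypothetical record in the style of `last` output; the address is made up.
record='root pts/1 192.0.2.10 Tue Feb 10 11:21 still logged in'

whole=$(printf '%s\n' "$record" | awk '{print $0}')   # the entire record
user=$(printf '%s\n' "$record" | awk '{print $1}')    # first field
host=$(printf '%s\n' "$record" | awk '{print $3}')    # third field
count=$(printf '%s\n' "$record" | awk '{print NF}')   # number of fields

printf '%s\n' "$whole" "$user" "$host" "$count"
```

Because the default separator is any run of spaces or tabs, the spacing between columns does not affect the field numbering.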

To show only the accounts in /etc/passwd

# cat /etc/passwd | awk -F ':' '{print $1}'
root
daemon
bin
sys

This is an example of awk with only an action: the action {print $1} is executed for every line.

-F specifies the field separator as ':'.

To show only the accounts in /etc/passwd and each account's shell, with the account and the shell separated by a tab

# cat /etc/passwd | awk -F ':' '{print $1"\t"$7}'
root    /bin/bash
daemon  /bin/sh
bin     /bin/sh
sys     /bin/sh

To show only the accounts in /etc/passwd and each account's shell, separated by a comma, with the column header "name,shell" added before all rows and "blue,/bin/nosh" added as the last line


cat /etc/passwd | awk -F ':' 'BEGIN {print "name,shell"} {print $1","$7} END {print "blue,/bin/nosh"}'
name,shell
root,/bin/bash
daemon,/bin/sh
bin,/bin/sh
sys,/bin/sh
...
blue,/bin/nosh


The awk workflow here is: first execute BEGIN; then read the file one record at a time, each delimited by the newline character '\n'; split the record into fields using the specified field separator and fill in the fields ($0 represents the whole record, $1 the first field, $n the nth field); then execute the action that corresponds to the pattern. Awk then reads the next record, and so on until all records have been read, and finally executes the END block.
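The order of execution described above can be observed with a small sketch. The two passwd-style lines are made-up sample data rather than the real /etc/passwd, and NR (awk's built-in record counter) is used here only for illustration.

```shell
# Hypothetical passwd-style lines; not read from the real /etc/passwd.
passwd_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh'

report=$(printf '%s\n' "$passwd_sample" | awk -F ':' '
    BEGIN { print "start" }           # runs once, before any input is read
          { print NR ": " $1 }        # runs once for each record
    END   { print "records: " NR }    # runs once, after the last record
')

printf '%s\n' "$report"
```

The "start" line appears before any account name, and the record total appears only after every line has been processed.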

Search /etc/passwd for all lines that contain the keyword root

# awk -F: '/root/' /etc/passwd
root:x:0:0:root:/root:/bin/bash

This is an example that uses only a pattern: the action is executed for each line that matches the pattern (here, root). Because no action is specified, the default is to print the whole matching line.

The search supports regular expressions. For example, to find lines that start with root: awk -F: '/^root/' /etc/passwd
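The difference between an unanchored and an anchored pattern can be sketched as follows. The passwd-style lines are made-up sample data, chosen so that "root" appears at the start of one line and only in the middle of another.

```shell
# Hypothetical passwd-style lines: "root" starts the first line
# and appears only mid-line in the second.
lines='root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
daemon:x:1:1:daemon:/usr/sbin:/bin/sh'

# /root/ matches "root" anywhere in the line.
anywhere=$(printf '%s\n' "$lines" | awk -F ':' '/root/ {print $1}')

# /^root/ matches only lines that begin with "root".
at_start=$(printf '%s\n' "$lines" | awk -F ':' '/^root/ {print $1}')

printf '%s\n' "$anywhere" "$at_start"
```

The unanchored pattern also selects the operator account because its home directory is /root, while the anchored pattern selects only the root account.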

Search /etc/passwd for all lines that contain the keyword root and show the corresponding shell

# awk -F: '/root/{print $7}' /etc/passwd
/bin/bash

Here the action {print $7} is specified.

Log statistics

Find the IPs that call a given address most often in the Tomcat access log, and list the top 100 in descending order

cat access_log.txt | grep '/loan/show-loan-detial-loanid-4948914564' | awk '{a[$1]+=1;} END {for (i in a) {print a[i] " " i;}}' | sort -nr | awk '{print $2}' | head -100
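The counting pipeline above can be tried on a small, self-contained sample. This is a sketch only: the log lines, the addresses, and the /loan/a path are invented, and the real Tomcat access log format may place the client address in a different column.

```shell
# Hypothetical access-log lines: client address first, then the request.
log='192.0.2.1 GET /loan/a
192.0.2.2 GET /loan/a
192.0.2.1 GET /loan/a
192.0.2.3 GET /other
192.0.2.1 GET /loan/a
192.0.2.2 GET /loan/a'

# Keep matching requests, count hits per address, sort by count
# (highest first), and keep the top two.
top=$(printf '%s\n' "$log" |
    grep '/loan/a' |
    awk '{hits[$1]++} END {for (ip in hits) print hits[ip], ip}' |
    sort -nr |
    head -2)

printf '%s\n' "$top"
```

The order in which awk walks the array in the END block is unspecified, which is why the result is passed through sort before head.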

Count the number of occurrences of each IP in the first column of a file

cat test
12121212

121.2332.121.11 232323 21321 12121212 1212121er2 12121212eer 12121212ere 21321

121.2332.121.11 232323 21321


awk '{name[$1]++}; END {for (count in name) print count, name[count]}' test | sort


121.2332.121.11 2 5 3

Sort by the second column in descending order

awk '{name[$1]++}; END {for (count in name) print count, name[count]}' test | sort -k 2 -rn

Output: 5 3

121.2332.121.11 2

Note: -k specifies the sort key column.

-r sorts in descending order.

-n sorts the field by arithmetic value. Numeric fields can contain leading spaces, an optional minus sign, decimal digits, thousands separators, and an optional radix character. Numeric sorting of fields that contain any non-numeric characters can give unpredictable results.
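The effect of -n can be seen by sorting the same made-up data with and without it. This is a minimal sketch: the sample lines are invented, and LC_ALL=C is set on the first command only to make the character-by-character ordering predictable.

```shell
# Hypothetical "key count" lines.
counts='a 9
b 10
c 2'

# Without -n the second column is compared as text, so "10" sorts below "2".
lexical=$(printf '%s\n' "$counts" | LC_ALL=C sort -k 2 -r)

# With -n the second column is compared as a number.
numeric=$(printf '%s\n' "$counts" | sort -k 2 -rn)

printf '%s\n' "$lexical" "$numeric"
```

Only the numeric sort places the line with 10 first, which is the behavior wanted when ranking by occurrence count.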

Alternatively, you can use

awk '{print $1}' test | sort | uniq -c


2 121.2332.121.11



If you want the IP in front

awk '{print $1}' test | sort | uniq -c | awk '{print $2,$1}'


121.2332.121.11 2 5 3
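The two counting approaches shown above, an awk array and sort piped into uniq -c, can be compared on a small made-up column. This sketch uses invented keys rather than the contents of the test file.

```shell
# Hypothetical first-column values.
col='b
a
b'

# sort | uniq -c prints "count key"; the final awk swaps the columns.
via_uniq=$(printf '%s\n' "$col" | sort | uniq -c | awk '{print $2, $1}')

# An awk array counts in one pass; sort makes the output order stable.
via_awk=$(printf '%s\n' "$col" | awk '{n[$1]++} END {for (k in n) print k, n[k]}' | sort)

printf '%s\n' "$via_uniq" "$via_awk"
```

Both forms produce the same "key count" lines; the awk array avoids sorting the full input first, which can matter for large files.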

When the log has the following format

14:09:47,812directgetserveripshdjshd.mp4| from:| ipid:6| serviceid:6| roomid:6| Type:1|return:1

Count the number of occurrences of each IP

awk -F "[:|]" '{name[$5]++}; END {for (count in name) print count, name[count]}' testd | sort -k 2 -rn

Output: 2 2 1

Defining multiple separators in awk

The awk command-line option -F "[:|]" tells awk that both | and : are field separators.
# cat Te
# cat Te | awk -F "[:|]" '{print $7}'
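How the fields are numbered with two separators can be sketched on one line. The line below is a made-up sample modelled on the log format shown earlier; the address after "from:" and the text before it are assumptions for illustration.

```shell
# Hypothetical log line; the address after "from:" is made up.
line='14:09:47,812 direct| from:192.0.2.7| ipid:6| serviceid:6| roomid:6| type:1|return:1'

# Both ":" and "|" end a field, so the three parts of the timestamp
# are fields 1 to 3, " from" is field 4, and the address is field 5.
ip=$(printf '%s\n' "$line" | awk -F '[:|]' '{print $5}')
ipid=$(printf '%s\n' "$line" | awk -F '[:|]' '{print $7}')

printf '%s\n' "$ip" "$ipid"
```

Because the colons inside the timestamp also act as separators, the field numbers are higher than a count of the "|" characters alone would suggest.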

This article is from the "Good Big Knife" blog, please make sure to keep this source
