Linux awk Tool in Detail

Source: Internet
Author: User
Brief Introduction

awk is a powerful text analysis tool. Compared with grep (searching) and sed (editing), awk is particularly strong at analyzing data and generating reports. Put simply, awk reads a file line by line, splits each line into fields using whitespace as the default delimiter, and then performs various kinds of analysis on those fields.

awk has three different versions: awk, nawk, and gawk. Unless stated otherwise, awk here refers to gawk, the GNU version of awk.

awk takes its name from the first letters of the surnames of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. awk is in fact a language in its own right, the AWK programming language, which its three creators formally defined as a "pattern scanning and processing language." It lets you write short programs that read input files, sort data, process data, perform calculations on input, and generate reports, among many other things.

How to use

awk 'pattern {action}' {filenames}

Although the operations can be complex, the syntax is always the same: pattern is what awk looks for in the data, and action is a series of commands executed when a match is found. The curly braces ({}) do not always have to appear in a program, but they are used to group the series of instructions that belong to a particular pattern. A pattern is typically a regular expression enclosed in slashes.
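The pattern and action parts can be tried out separately with a small sketch. The three-line sample data and the variable names below are hypothetical and are not taken from the article. When only a pattern is given, awk prints each matching line unchanged; when only an action is given, it runs on every line.

```shell
# Hypothetical sample data used only for illustration.
data='alpha 1
beta 2
gamma 3'

# Pattern only: matching lines are printed unchanged (the default action).
pattern_only=$(printf '%s\n' "$data" | awk '/^b/')

# Action only: the action runs on every input line.
action_only=$(printf '%s\n' "$data" | awk '{print $1}')

# Pattern and action: the action runs only on matching lines.
both=$(printf '%s\n' "$data" | awk '/^g/ {print $2}')

printf '%s\n' "$pattern_only" "$action_only" "$both"
```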

The most basic function of the awk language is to browse and extract information from a file or string according to specified rules; only after awk has extracted the information can other text operations be performed. A complete awk script is typically used to format the information in a text file. awk normally treats one line of a file as its unit of processing: it receives each line of the file in turn and then executes the corresponding commands on it.

Invoke awk

There are three ways of invoking awk.

1. Command-line mode

awk [-F field-separator] 'commands' input-file(s)

Here commands are the actual awk commands, and [-F field-separator] is optional. input-file(s) are the files to be processed.

In awk, each item on a line, as delimited by the field separator, is called a field. If the -F option is not given, the default field separator is whitespace.
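The effect of the field separator can be checked with a short sketch. The sample line is a typical /etc/passwd entry; NF is awk's built-in count of fields in the current line, and the variable names are only illustrative.

```shell
line='root:x:0:0:root:/root:/bin/bash'

# With -F ':' the line is split into seven fields.
with_sep=$(printf '%s\n' "$line" | awk -F ':' '{print NF, $1}')

# Without -F the line contains no whitespace, so it is a single field.
without_sep=$(printf '%s\n' "$line" | awk '{print NF, $1}')

# By default, runs of spaces and tabs count as one separator.
blank_fields=$(printf 'a   b\tc\n' | awk '{print NF}')

printf '%s\n' "$with_sep" "$without_sep" "$blank_fields"
```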

2. Shell script mode

Put all of the awk commands into a file and make the awk program executable, with the awk command interpreter named on the first line of the script; the program is then invoked by typing the script name.

This corresponds to the first line of a shell script: #!/bin/sh

which can be replaced by: #!/bin/awk -f
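A minimal sketch of the script mode follows. It assumes that a temporary file created by mktemp may be executed and that awk can be located with command -v; the one-line program and the sample input are hypothetical. The location of awk differs between systems, so the sketch looks it up instead of hard-coding a path.

```shell
# Create a temporary script file (assumption: the temp directory allows execution).
script=$(mktemp)

# First line: the awk interpreter with -f; second line: the awk program itself.
printf '#!%s -f\n{ print $1 }\n' "$(command -v awk)" > "$script"
chmod +x "$script"

# Run the script by name, feeding it two sample lines on standard input.
first_fields=$(printf 'root pts/1\ndmtsai pts/1\n' | "$script")
rm -f "$script"

printf '%s\n' "$first_fields"
```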

3. Put all of the awk commands into a separate file, and then call it:

awk -f awk-script-file input-file(s)

Here the -f option loads the awk script from awk-script-file, and input-file(s) is the same as above.

Getting Started example

The examples described below are called primarily by using the command line.

Suppose the output of last -n 5 is as follows:

[root@www ~]# last -n 5 <== show only the first five lines

root pts/1 192.168.1.100 Tue Feb 11:21 still logged in

root pts/1 192.168.1.100 Tue Feb 10 00:46 - 02:28 (01:41)

root pts/1 192.168.1.100 Mon Feb 9 11:41 - 18:30 (06:48)

dmtsai pts/1 192.168.1.100 Mon Feb 9 11:41 - 11:41 (00:00)

root tty1 Fri Sep 5 14:09 - 14:10 (00:01)


1. To display only the accounts of the five most recent logins:

# last -n 5 | awk '{print $1}'

root

root

root

dmtsai

root

The awk workflow is as follows: it reads one record, delimited by the '\n' newline character, splits the record into fields by the specified field separator, and fills in the fields. $0 represents the whole record (all fields), $1 the first field, and $n the nth field. The default field separator is a space or a [tab], so $1 is the logged-in user, $3 is the logged-in user's IP, and so on. (A field is easy to understand: it can be thought of as a column.)
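The field variables can be inspected on a single record. The sketch below reuses one line in the style of the last output shown earlier; the variable names, and the use of -v to pass a field number n into awk, are illustrative choices and not part of the article.

```shell
record='root pts/1 192.168.1.100 Tue Feb 10 00:46 - 02:28 (01:41)'

whole=$(printf '%s\n' "$record" | awk '{print $0}')        # the entire record
user=$(printf '%s\n' "$record" | awk '{print $1}')         # first field
ip=$(printf '%s\n' "$record" | awk '{print $3}')           # third field
nth=$(printf '%s\n' "$record" | awk -v n=2 '{print $n}')   # field number held in n

printf '%s\n' "$whole" "$user" "$ip" "$nth"
```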


2. To show only the accounts in /etc/passwd:

# cat /etc/passwd | awk -F ':' '{print $1}'

root

daemon

bin

sys

This is an example of awk + action, where the action {print $1} is executed for every line. -F specifies ':' as the field separator.


3. To show only the accounts in /etc/passwd and their corresponding shells, separated by a tab:

# cat /etc/passwd | awk -F ':' '{print $1"\t"$7}'

root	/bin/bash

daemon	/bin/sh

bin	/bin/sh

sys	/bin/sh


4. To show only the accounts in /etc/passwd and their corresponding shells, separated by a comma, with the column header "name,shell" added before all rows and "blue,/bin/nosh" added as the last line:

cat /etc/passwd | awk -F ':' 'BEGIN {print "name,shell"} {print $1","$7} END {print "blue,/bin/nosh"}'

name,shell

root,/bin/bash

daemon,/bin/sh

bin,/bin/sh

sys,/bin/sh

....

blue,/bin/nosh

The awk workflow here is as follows: awk first executes the BEGIN block, then reads the file. It reads one record, delimited by the '\n' newline character, splits the record into fields by the specified field separator, and fills in the fields ($0 represents all fields, $1 the first field, $n the nth field); it then executes the action associated with the pattern. Next it reads the second record, and so on. Once all records have been read, the END block is executed.
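The order of execution can be made visible with a small sketch. The two input lines are sample /etc/passwd-style entries and the printed labels are hypothetical; NR is awk's built-in count of records read so far, so in the END block it holds the total.

```shell
out=$(printf 'root:x:0:0:root:/root:/bin/bash\nbin:x:2:2:bin:/bin:/bin/sh\n' |
  awk -F ':' '
    BEGIN { print "begin" }                       # runs once, before any input
          { print NR ": " $1 }                    # runs once per record
    END   { print "end after " NR " records" }    # runs once, after all input
  ')

printf '%s\n' "$out"
```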


5. Search /etc/passwd for all lines that contain the keyword root:

# awk -F: '/root/' /etc/passwd

root:x:0:0:root:/root:/bin/bash

This is an example of pattern usage: only lines that match the pattern (here, root) execute the action. No action is specified here, so by default each matching line is printed.

The search supports regular expressions. For example, to find lines that start with root: awk -F: '/^root/' /etc/passwd
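The difference between an unanchored and an anchored pattern shows up when a line contains root somewhere other than at the start. The three sample entries below are hypothetical /etc/passwd-style lines chosen for that purpose.

```shell
# Hypothetical sample entries; the second one mentions /root but starts with "operator".
passwd_sample='root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
daemon:x:1:1:daemon:/usr/sbin:/bin/sh'

# /root/ matches anywhere in the line.
anywhere=$(printf '%s\n' "$passwd_sample" | awk -F: '/root/ {print $1}')

# /^root/ matches only at the beginning of the line.
anchored=$(printf '%s\n' "$passwd_sample" | awk -F: '/^root/ {print $1}')

printf '%s\n' "$anywhere" "$anchored"
```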


6. Search /etc/passwd for all lines that contain the keyword root, and show the corresponding shell:

# awk -F: '/root/{print $7}' /etc/passwd

/bin/bash

Here the action {print $7} is specified.

awk built-in variables

awk has many built-in variables that hold environment information, and these variables can be changed. Some of the most commonly used ones are listed below.

ARGC: Number of command-line arguments

ARGV: Array of command-line arguments

ENVIRON: Array that gives access to the system environment variables

FILENAME: Name of the file awk is currently reading

FNR: Number of records read so far in the current file

FS: Input field separator, equivalent to the -F command-line option

NF: Number of fields in the current record

NR: Number of records read so far

OFS: Output field separator

ORS: Output record separator

RS: Input record separator

In addition, the $0 variable refers to the entire record; $1 represents the first field of the current line, $2 the second field of the current line, and so on.
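Several of these variables can be seen working together in a short sketch. The input lines and the choice of "-" as the output separator are hypothetical; FS is set in a BEGIN block here instead of with -F, which has the same effect.

```shell
out=$(printf 'root:x:0:0\nbin:x:2:2\n' |
  awk 'BEGIN { FS = ":"; OFS = "-" }    # split input on ":", join output with "-"
             { print NR, NF, $1, $2 }   # record number, field count, first two fields
      ')

printf '%s\n' "$out"
```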

1. Print statistics for /etc/passwd: the file name, the line number of each line, the number of columns in each line, and the full content of the line:

# awk -F ':' '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF ",linecontent:" $0}' /etc/passwd

filename:/etc/passwd,linenumber:1,columns:7,linecontent:root:x:0:0:root:/root:/bin/bash

filename:/etc/passwd,linenumber:2,columns:7,linecontent:daemon:x:1:1:daemon:/usr/sbin:/bin/sh

filename:/etc/passwd,linenumber:3,columns:7,linecontent:bin:x:2:2:bin:/bin:/bin/sh

filename:/etc/passwd,linenumber:4,columns:7,linecontent:sys:x:3:3:sys:/dev:/bin/sh

2. Using printf instead of print makes the code more concise and easier to read:

awk -F ':' '{printf("filename:%10s,linenumber:%s,columns:%s,linecontent:%s\n",FILENAME,NR,NF,$0)}' /etc/passwd

print and printf

awk provides two output functions: print and printf.

The arguments to print can be variables, numeric values, or strings. Strings must be enclosed in double quotes, and arguments are separated by commas. Without commas, the arguments are concatenated with nothing between them. A comma between arguments is output as the output field separator, which is a space by default.
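The effect of the comma is easiest to see side by side. The sketch below uses a hypothetical two-field input line and also shows a printf call for comparison; the format string "%-6s|%s\n" is only an illustration of C-style formatting.

```shell
with_comma=$(echo 'root x' | awk '{print $1, $2}')                  # fields joined by OFS
without_comma=$(echo 'root x' | awk '{print $1 $2}')                # plain concatenation
custom_ofs=$(echo 'root x' | awk 'BEGIN {OFS=","} {print $1, $2}')  # OFS changed to ","
formatted=$(echo 'root x' | awk '{printf("%-6s|%s\n", $1, $2)}')    # left-justified, width 6

printf '%s\n' "$with_comma" "$without_comma" "$custom_ofs" "$formatted"
```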

The printf function is basically the same as printf in the C language and can format strings. When the output is complex, printf works better and the code is easier to understand.

awk Programming

Variables and Assignments

In addition to its built-in variables, awk lets you define your own variables.

1. The following counts the number of accounts in /etc/passwd:

awk '{count++; print $0;} END {print "user count is", count}' /etc/passwd

root:x:0:0:root:/root:/bin/bash

......

user count is 40

count is a user-defined variable. The earlier action {} blocks contained only a single print, but print is just one statement; an action {} can contain multiple statements, separated by semicolons.


2. count is not initialized above. Although it defaults to 0, the proper practice is to initialize it to 0:

awk 'BEGIN {count=0; print "[start]user count is", count} {count=count+1; print $0} END {print "[end]user count is", count}' /etc/passwd

[start]user count is 0

root:x:0:0:root:/root:/bin/bash

...

[end]user count is 40
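One practical reason to initialize a counter shows up when the input is empty. An awk variable that has never been assigned acts as 0 in arithmetic but as an empty string when printed, so the two sketches below, which read from /dev/null as a stand-in for an empty file, produce different results. The variable names are hypothetical.

```shell
# Never assigned: with no input lines, count is printed as an empty string.
uninitialized=$(awk '{count++} END {print count}' /dev/null)

# Initialized in BEGIN: the same empty input now reports 0.
initialized=$(awk 'BEGIN {count=0} {count++} END {print count}' /dev/null)

# With input present, both forms count the lines in the same way.
counted=$(printf 'a\nb\nc\n' | awk '{count++} END {print count}')

printf '[%s] [%s] [%s]\n' "$uninitialized" "$initialized" "$counted"
```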


3. Sum the number of bytes occupied by the files in a directory:

ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END {print "[end]size is", size}'

[end]size is 8657198

To display the result in MB:

ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END {print "[end]size is", size/1024/1024, "M"}'

[end]size is 8.25889 M
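The same accumulation can be tried on fixed input so that the result is reproducible. The listing below is a hypothetical ls -l style sample, not real output. As a variation on the commands above, this sketch adds the pattern /^-/ so that only regular files are summed; the commands above add up the fifth column of every line, including the entry for a subdirectory itself.

```shell
# Hypothetical ls -l style listing used as fixed input.
listing='total 12
-rw-r--r-- 1 root root 1048576 Feb 10 11:21 a.log
-rw-r--r-- 1 root root 524288 Feb 10 11:22 b.log
drwxr-xr-x 2 root root 4096 Feb 10 11:23 sub'

# Sum column 5 for lines that start with "-" (regular files only).
bytes=$(printf '%s\n' "$listing" | awk 'BEGIN {size=0} /^-/ {size += $5} END {print size}')
mb=$(printf '%s\n' "$listing" | awk 'BEGIN {size=0} /^-/ {size += $5} END {print size/1024/1024, "M"}')

printf '%s\n' "$bytes" "$mb"
```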

Note that the total does not include the contents of subdirectories.

Conditional Statement

The conditional statements in awk are borrowed from the C language and take the following forms:

if (expression) {

    statement;

    statement;

    ... ...

}

if (expression) {

    statement;

} else {

    statement2;

}

if (expression) {

    statement1;
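The if and if/else forms can be exercised with a short sketch. The three input lines are hypothetical /etc/passwd-style entries, and the test on the third field (the UID) is only an illustration of a condition.

```shell
out=$(printf 'root:x:0:0\ndaemon:x:1:1\nblue:x:1000:1000\n' |
  awk -F ':' '{
    if ($3 == 0) {
      # Taken when the UID field is numerically zero.
      print $1, "is the superuser"
    } else {
      # Taken for every other UID.
      print $1, "has UID", $3
    }
  }')

printf '%s\n' "$out"
```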
