How to use the awk command of the shell under Linux

Source: Internet
Author: User

awk is a very powerful text-analysis tool on Linux. In short, awk reads a file line by line, splits each line into fields using whitespace as the default separator, and then runs whatever processing you specify on those fields.

Basic usage of awk

The basic form of an awk command is as follows

awk 'pattern { action }' filenames

where pattern is what awk looks for in the data, and action is a series of commands executed when a match is found. Curly braces ({}) do not have to appear in every program; they group the statements that belong to a given pattern. A pattern is most often a regular expression enclosed in slashes.

In practice, the following form is the most common

awk [-F field-separator] 'commands' input-file(s)

Here commands is the actual awk program and [-F field-separator] is optional. input-file(s) is the file or files to be processed. In awk, each item in a line, delimited by the field separator, is called a field. If -F is not given, the default field separator is whitespace (spaces and tabs).
awk works through a file one line at a time: it reads each line of the input in turn and runs the given commands on it.
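
As a minimal sketch of the field splitting described above, the following feeds made-up lines to awk through printf. The sample text and the shell variable names are assumptions for illustration only. The first command relies on the default whitespace separator and the second sets ':' with -F.

```shell
# Default separator: runs of spaces and tabs split the line into fields.
second_word=$(printf 'alpha   beta\tgamma\n' | awk '{print $2}')
echo "$second_word"

# Explicit separator: -F ':' splits on colons instead.
second_field=$(printf 'alpha:beta:gamma\n' | awk -F ':' '{print $2}')
echo "$second_field"
```

Both commands print beta, because in each case it is the second field under the separator in effect.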

Example of using awk

Take the /etc/passwd file as an example. Running cat /etc/passwd gives output like the following (only the first 4 lines are shown)

# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh

1. awk + action usage

Use awk to extract the account names, which gives the following output

# cat /etc/passwd | awk -F ':' '{print $1}'
root
daemon
bin
sys

What this command does: awk reads one record at a time, with records separated by the newline character '\n' (that is, it reads line by line). Each record is split into fields by the field separator (-F sets it to ':' here), and then the command (print $1) is executed. $0 stands for the whole record, $1 for the first field, and $n for the nth field. The default field separator is a space or a tab. With ':' as the separator on /etc/passwd, $1 is the account name, and so on for the following fields.

To print both the account name and the account's shell from /etc/passwd, separated by a comma, use the following command

# cat /etc/passwd | awk -F ':' '{print $1","$7}'
root,/bin/bash
daemon,/bin/sh
bin,/bin/sh
sys,/bin/sh
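
The field variables used above can be tried without touching the real /etc/passwd. The sketch below uses a single made-up record (the account "alice" and its paths are hypothetical) to show $0 next to individual fields.

```shell
# A made-up record in /etc/passwd format (hypothetical account "alice").
line='alice:x:1000:1000:Alice:/home/alice:/bin/zsh'

# $0 is the whole record; $1 and $7 are the first and seventh fields.
whole=$(printf '%s\n' "$line" | awk -F ':' '{print $0}')
name_shell=$(printf '%s\n' "$line" | awk -F ':' '{print $1 "," $7}')
echo "$whole"
echo "$name_shell"
```

The first echo reproduces the record unchanged, and the second prints the account name and shell joined by a comma.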

2. awk + pattern usage

Search /etc/passwd for all lines containing the keyword root

# awk -F: '/root/' /etc/passwd
root:x:0:0:root:/root:/bin/bash

This is an example of pattern usage: the action is executed only for lines that match the pattern (root here). No action is specified, so by default the whole line is printed.
A matching pattern is usually written as /pattern/, i.e.

awk '/pattern/'

The search supports regular expressions. For example, to find lines that start with root: awk -F: '/^root/' /etc/passwd
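
To make the difference between /root/ and /^root/ visible, here is a small sketch on sample data. The "operator" line is invented so that one line contains root without starting with it. The variable names are assumptions.

```shell
# Sample data in /etc/passwd format; the "operator" line is made up.
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
operator:x:11:0:operator:/root:/sbin/nologin'

# /root/ matches anywhere in the line; /^root/ matches only at the start.
anywhere=$(printf '%s\n' "$sample" | awk '/root/')
at_start=$(printf '%s\n' "$sample" | awk '/^root/')
echo "$anywhere"
echo "---"
echo "$at_start"
```

The unanchored pattern selects both the root and operator lines, while the anchored one selects only the root line.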

3. awk + pattern + action usage

Search /etc/passwd for all lines containing the keyword root and display the corresponding shell

# awk -F: '/root/{print $7}' /etc/passwd
/bin/bash

Here the action {print $7} is specified.
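
A pattern and an action can be combined with any regular expression, not only a plain keyword. As a sketch on made-up sample data, the following prints the account name for every line whose shell is exactly /bin/sh. The slashes inside the pattern are escaped because the pattern itself is delimited by slashes.

```shell
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh'

# The action runs only for lines that end in "/bin/sh".
sh_users=$(printf '%s\n' "$sample" | awk -F: '/\/bin\/sh$/ {print $1}')
echo "$sh_users"
```

This prints daemon and bin. The root line is skipped because /bin/bash does not end in /bin/sh.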

Extended usage of awk

1. awk built-in variables

ARGC      Number of command-line arguments
ARGV      Array of command-line arguments
ENVIRON   Array of system environment variables
FILENAME  Name of the file awk is currently reading
FNR       Number of records read so far in the current file
FS        Input field separator, equivalent to the -F command-line option
NF        Number of fields in the current record
NR        Number of records read so far
OFS       Output field separator
ORS       Output record separator
RS        Input record separator
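
A few of these variables can be observed directly. The sketch below uses made-up input. It prints NR and NF for each record, then sets FS and OFS in a BEGIN block to show that FS behaves like -F and that OFS is what separates comma-separated print arguments.

```shell
# NR is the running record number, NF the number of fields in the record.
numbered=$(printf 'a b c\nd e\n' | awk '{print NR, NF}')
echo "$numbered"

# Setting FS in BEGIN has the same effect as -F; OFS replaces the default
# space that print puts between comma-separated arguments.
joined=$(printf 'root:x:0\n' | awk 'BEGIN {FS=":"; OFS="-"} {print $1, $3}')
echo "$joined"
```

The first command prints "1 3" and "2 2", and the second prints "root-0".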

Here are some simple uses:

1. Print the second line of a file

awk 'NR==2'

2. Print the second through fourth lines of a file

awk 'NR==2,NR==4'

3. Delete all blank lines

awk NF

4. Print the last line of a file

awk 'END {print}'
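
The four one-liners above can be checked against a small input. This sketch uses five made-up lines, the third of which is blank. The last command assumes an awk that keeps the final record available in END, which current common implementations do.

```shell
# Five input lines; the third one is empty.
input=$(printf 'one\ntwo\n\nfour\nfive\n')

second=$(printf '%s\n' "$input" | awk 'NR==2')          # line 2 only
range=$(printf '%s\n' "$input" | awk 'NR==2,NR==4')     # lines 2 through 4
nonblank=$(printf '%s\n' "$input" | awk NF)             # NF is 0 on blank lines
last=$(printf '%s\n' "$input" | awk 'END {print}')      # last line read

echo "$second"
echo "$range"
echo "$nonblank"
echo "$last"
```

awk NF works because a bare expression is a pattern, and a line with zero fields makes NF evaluate to false, so that line is not printed.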

2. print and printf

awk provides two output statements: print and printf.

The arguments to print can be variables, numbers, or strings. Strings must be enclosed in double quotes, and arguments are separated by commas. Without commas, the arguments are concatenated with nothing between them. The comma stands for the output field separator, which is a space by default.

printf works much like printf in C and formats its output according to a format string. When the output is complex, printf is the better choice and makes the code easier to understand.
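
The difference between the comma, plain concatenation, and printf can be seen in a short sketch. The strings and the format below are made up for illustration.

```shell
# Comma: arguments are joined with the output field separator (a space).
with_comma=$(awk 'BEGIN {print "a", "b"}')
# No comma: the strings are concatenated directly.
no_comma=$(awk 'BEGIN {print "a" "b"}')
# printf: C-style format string; the newline must be written explicitly.
formatted=$(printf 'root:x:0\n' | awk -F: '{printf "%-8s uid=%03d\n", $1, $3}')

echo "$with_comma"
echo "$no_comma"
echo "$formatted"
```

Here %-8s left-aligns the name in an 8-character column and %03d pads the number with zeros to three digits.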

awk programming

Variables and assignment

In addition to the built-in variables, awk lets you define your own variables.

The following counts the accounts in /etc/passwd

awk '{count++; print $0;} END{print "user count is", count}' /etc/passwd
root:x:0:0:root:/root:/bin/bash
......
user count is 40

count is a user-defined variable. The earlier action {} blocks contained only a single print, but print is just one statement; an action {} can contain several statements, separated by semicolons (;).

count is not initialized here. Although an uninitialized variable defaults to 0, it is better practice to initialize it to 0 explicitly:

awk 'BEGIN {count=0; print "[start]user count is", count} {count=count+1; print $0} END{print "[end]user count is", count}' /etc/passwd
[start]user count is 0
root:x:0:0:root:/root:/bin/bash
...
[end]user count is 40
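
The same counting pattern can be run on a fixed sample so that the result is predictable. In this sketch the three records are made up, and the program is spread over several lines only for readability.

```shell
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh'

# BEGIN runs once before any input, the middle block once per record,
# and END once after the last record.
report=$(printf '%s\n' "$sample" | awk '
  BEGIN { count = 0 }
  { count++ }
  END { print "user count is", count }')
echo "$report"
```

With three input records the report reads "user count is 3".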

Total the size in bytes of the files in a directory

ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END{print "[end]size is", size}'
[end]size is 8657198

To display the result in M:

ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END{print "[end]size is", size/1024/1024, "M"}'
[end]size is 8.25889 M

Note that the total does not include the contents of subdirectories.
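
Because the output of ls -l depends on the directory, the following sketch sums field 5 of a made-up listing instead. The file names, owners, and sizes are assumptions. The "total" line has no fifth field, so it adds 0 to the sum.

```shell
# Made-up `ls -l` style listing; field 5 is the size in bytes.
listing='total 12
-rw-r--r-- 1 user user 2048 Jan  1 00:00 a.txt
-rw-r--r-- 1 user user 1024 Jan  1 00:00 b.txt
drwxr-xr-x 2 user user 4096 Jan  1 00:00 subdir'

# 2048 + 1024 + 4096 = 7168 bytes, which is 7 K.
total_k=$(printf '%s\n' "$listing" | awk '{size += $5} END {print size/1024, "K"}')
echo "$total_k"
```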

Conditional statements

The conditional statements in awk are borrowed from C and take the following forms:

if (expression) {
    statement;
    statement;
    ... ...
}

if (expression) {
    statement1;
} else {
    statement2;
}

if (expression) {
    statement1;
} else if (expression1) {
    statement2;
} else {
    statement3;
}

Total the size in bytes of the files in a directory, excluding entries whose size is exactly 4096 (typically directories):

ls -l | awk 'BEGIN {size=0; print "[start]size is", size} {if ($5!=4096) {size=size+$5;}} END{print "[end]size is", size/1024/1024, "M"}'
[end]size is 8.22339 M
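
As a sketch of the full if / else if / else form, the following labels each entry of a made-up listing by size. The listing, the 2048-byte threshold, and the labels are all assumptions chosen for illustration. NR > 1 skips the "total" line, and $NF is the last field, here the file name.

```shell
listing='total 12
-rw-r--r-- 1 user user 2048 Jan  1 00:00 a.txt
-rw-r--r-- 1 user user 1024 Jan  1 00:00 b.txt
drwxr-xr-x 2 user user 4096 Jan  1 00:00 subdir'

summary=$(printf '%s\n' "$listing" | awk '
  NR > 1 {
    if ($5 == 4096) {
      kind = "skipped"
    } else if ($5 >= 2048) {
      kind = "large"
    } else {
      kind = "small"
    }
    print $NF, kind
  }')
echo "$summary"
```

This prints "a.txt large", "b.txt small", and "subdir skipped", one per line.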

Loop statements

The loop statements in awk are also borrowed from C. awk supports while, do/while, for, break, and continue, with the same semantics as in C.
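
A short sketch of two of these loops, run on a made-up record of four numbers: a for loop that adds up every field, and a while loop that uses break to stop at the first field greater than 2.

```shell
# for: add up every field of the record.
total=$(printf '1 2 3 4\n' | awk '{
  s = 0
  for (i = 1; i <= NF; i++) s += $i
  print s
}')
echo "$total"

# while with break: find the first field greater than 2.
first_big=$(printf '1 2 3 4\n' | awk '{
  i = 1
  while (i <= NF) {
    if ($i > 2) break
    i++
  }
  print $i
}')
echo "$first_big"
```

The first command prints 10 and the second prints 3.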

Arrays

Because array subscripts in awk can be either numbers or strings, a subscript is often called a key. Keys and values are stored internally in a hash table. Since a hash table is unordered, the contents of an array may not be displayed in the order you expect. Like variables, arrays are created automatically on first use, and awk determines automatically whether they hold numbers or strings. Arrays in awk are typically used to collect information from records, for example to compute totals, count words, or track how many times a pattern is matched.

Display the accounts in /etc/passwd

awk -F ':' 'BEGIN {count=0;} {name[count] = $1; count++;} END{for (i = 0; i < NR; i++) print i, name[i]}' /etc/passwd
0 root
1 daemon
2 bin
3 sys
4 sync
5 games
......

This uses a for loop to traverse the array.
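
String subscripts make arrays convenient for counting. The following sketch, on made-up sample data, counts how many accounts use each shell. It uses the for (key in array) form, which the example above does not. That form visits keys in an unspecified order, so the output is piped through sort to make it stable.

```shell
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh'

# shells[] is keyed by the shell path, a string subscript.
counts=$(printf '%s\n' "$sample" | awk -F: '
  { shells[$7]++ }
  END { for (s in shells) print s, shells[s] }' | sort)
echo "$counts"
```

For this sample the result is "/bin/bash 1" followed by "/bin/sh 2".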
