Linux awk command details, awk command details

Source: Internet
Author: User

awk is an excellent tool for processing numeric and formatted text data. Whereas sed works on a line as a whole, awk splits each line into several "fields" for processing. It runs efficiently, needs very little code, and is powerful on formatted text. Here is an example:
Compute the average of the floating-point numbers in the first column of file a. awk can do this in a single command.
$ cat a
1.021 33
1#.ll 44
2.53 6
ss 7

awk 'BEGIN{total=0;len=0} {if($1~/^[0-9]+\.[0-9]*/){total+=$1; len++}} END{print total/len}' a
(Analysis: $1~/^[0-9]+\.[0-9]*/ tests whether $1 matches the regular expression between the slashes. If it does, $1 is added to total and len is incremented by 1. In the regular expression "^[0-9]+\.[0-9]*", "^[0-9]+" means the field starts with one or more digits, "\." is an escaped "." that matches a literal decimal point, and "[0-9]*" means zero or more digits.)
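
The averaging command can be tried end to end with a throwaway copy of the sample file. This is a minimal sketch, not the author's exact command: the temporary directory and the shell variable avg are illustrative, the regular expression is additionally anchored with "$", and the guard against an empty match set is an addition.

```shell
# Recreate the sample file a in a temporary directory (path is illustrative).
tmpdir=$(mktemp -d)
cat > "$tmpdir/a" <<'EOF'
1.021 33
1#.ll 44
2.53 6
ss 7
EOF

# Average the first-column values that look like floating-point numbers.
avg=$(awk '
    $1 ~ /^[0-9]+\.[0-9]*$/ { total += $1; len++ }   # count only matching rows
    END { if (len > 0) print total / len }            # avoid division by zero
' "$tmpdir/a")

echo "$avg"
rm -rf "$tmpdir"
```

Only 1.021 and 2.53 match the pattern, so the result is the mean of those two values.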

The general syntax of awk is:
awk [-option value] 'BEGIN{initialization} condition1{action1} condition2{action2} ... END{post-processing}' in_file
The statements in BEGIN run before the input file (in_file) is read, and those in END run after it has been read completely. They can be thought of as initialization and wrap-up.
(1) Options:
-F re: sets the field separator to re
-v var=$v: assigns the value of the shell variable v to the awk variable var. To assign several variables, use several -v options, one per variable.
E.g. to print lines num through num+num1 of file a:
awk -v num=$num -v num1=$num1 'NR==num,NR==num+num1{print}' a
-f progfile: makes awk read and run the program stored in the file progfile. The file must, of course, contain a valid awk program.
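
The -v and -f options can be combined, as the following sketch suggests. The file names range.awk and a, the awk variable names first and extra, and the generated sample lines are all assumptions made for illustration.

```shell
tmpdir=$(mktemp -d)
# Ten numbered lines as sample input.
awk 'BEGIN { for (i = 1; i <= 10; i++) print "line " i }' > "$tmpdir/a"

# A one-rule awk program stored in a file, to be run with -f.
cat > "$tmpdir/range.awk" <<'EOF'
# Print records first through first+extra (both are set with -v).
NR == first, NR == first + extra { print }
EOF

num=3
num1=2
# Quote the shell variables so that each value stays a single argument.
picked=$(awk -v first="$num" -v extra="$num1" -f "$tmpdir/range.awk" "$tmpdir/a")

echo "$picked"
rm -rf "$tmpdir"
```

With num=3 and num1=2 the range pattern selects lines 3 through 5.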

(2) awk built-in variables:
ARGC: number of command-line arguments
ARGV: array of command-line arguments
ARGIND: index in ARGV of the file currently being processed (a gawk extension)
E.g. given two files, a and b:
awk '{if(ARGIND==1){print "processing file a"} if(ARGIND==2){print "processing file b"}}' a b
Files are processed in order: a is scanned first, then b.
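
Because ARGIND exists only in gawk, a similar file index can be kept by hand in other awk implementations. The sketch below is one possible approach, not part of the original article; the counter name idx and the sample files are hypothetical, and it assumes that no input file is empty.

```shell
tmpdir=$(mktemp -d)
printf 'a1\na2\n' > "$tmpdir/a"
printf 'b1\n'     > "$tmpdir/b"

# FNR restarts at 1 for every input file, so counting those restarts
# gives a file index similar to gawk's ARGIND.
# Assumption: no input file is empty (an empty file would not be counted).
report=$(awk '
    FNR == 1 { idx++ }
    { print "file " idx ": " $0 }
' "$tmpdir/a" "$tmpdir/b")

echo "$report"
rm -rf "$tmpdir"
```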

NR: number of records read so far
FNR: number of records read so far from the current file
The preceding example can also be written as:
awk 'NR==FNR{print "processing file a"} NR>FNR{print "processing file b"}' a b
The input files are a and b. Because a is scanned first, NR==FNR holds while a is being scanned. When awk moves on to b, FNR restarts from 1 while NR keeps counting from the number of lines in a, so NR>FNR.
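
A common use of the NR==FNR test is to load the first file into an array and then filter the second file against it. The following is a sketch of that idiom under assumed data: the files allowed and scores and their contents are hypothetical, and the first file is assumed to be non-empty.

```shell
tmpdir=$(mktemp -d)
printf 'alice\ncarol\n'               > "$tmpdir/allowed"
printf 'alice 10\nbob 20\ncarol 30\n' > "$tmpdir/scores"

# While the first file is read, NR == FNR holds: remember its names, then skip.
# For the second file NR > FNR, so only the second rule runs.
# Assumption: the first file is not empty.
matched=$(awk '
    NR == FNR { allowed[$1] = 1; next }
    $1 in allowed { print $1, $2 }
' "$tmpdir/allowed" "$tmpdir/scores")

echo "$matched"
rm -rf "$tmpdir"
```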

E.g. to display lines 10 through 15 of file a:
awk 'NR==10,NR==15{print}' a

FS: input field separator (default: a space, which splits on runs of spaces and tabs); setting it is equivalent to using the -F option
awk -F':' '{print}' a and awk 'BEGIN{FS=":"}{print}' a are equivalent

OFS: output field separator (default: a space)
awk -F':' 'BEGIN{OFS=";"}{print $1,$2,$3}' b
If cat b shows
1:2:3
4:5:6
then with OFS set to ";" the output is
1;2;3
4;5;6
(Note: awk refers to the first, second, and third fields after splitting as $1, $2, $3, and so on. $0 is the entire record, usually a whole line.)
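
OFS takes effect only when print lists fields separated by commas, or when awk rebuilds $0 after a field assignment. The short sketch below illustrates the difference; the shell variable names are chosen for illustration only.

```shell
line='1:2:3'

# print with commas inserts OFS between the listed fields.
with_commas=$(echo "$line" | awk -F':' 'BEGIN { OFS = ";" } { print $1, $2, $3 }')

# print with no arguments prints $0, which still holds the original text.
unchanged=$(echo "$line" | awk -F':' 'BEGIN { OFS = ";" } { print }')

# Assigning to any field makes awk rebuild $0 using OFS.
rebuilt=$(echo "$line" | awk -F':' 'BEGIN { OFS = ";" } { $1 = $1; print }')

echo "$with_commas / $unchanged / $rebuilt"
```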

NF: number of fields in the current record
The output of awk -F':' '{print NF}' b is
3
3
This shows that each line of b is split by the separator ":" into three fields.
NF can be used to select only the lines that have the expected number of fields, so that malformed lines can be handled separately.
awk -F':' '{if(NF==3)print}' b
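
One way to handle the malformed lines rather than just drop them is to write them to a separate report. This is a sketch under assumptions: the sample data, the awk variable badfile, and the log file name bad.log are hypothetical.

```shell
tmpdir=$(mktemp -d)
printf '1:2:3\n4:5\n6:7:8\n' > "$tmpdir/b"

# Lines with exactly three fields go to standard output; anything else
# is reported to a separate file with its line number and field count.
good=$(awk -F':' -v badfile="$tmpdir/bad.log" '
    NF == 3 { print; next }
    { print "line " NR " has " NF " fields" > badfile }
' "$tmpdir/b")
bad=$(cat "$tmpdir/bad.log")

echo "$good"
echo "$bad"
rm -rf "$tmpdir"
```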

RS: input record separator. The default is "\n".
By default, awk treats each line as one record. If RS is set, awk splits the input into records at RS instead.
For example, if cat c shows
hello world;I want to go mongoming tomorrow;hiahia
then running awk 'BEGIN{RS=";"}{print}' c prints
hello world
I want to go mongoming tomorrow
hiahia
(The last record still contains the file's trailing newline, so an empty line follows hiahia.)
Used together, RS and FS let awk handle more kinds of input, for example records that span several lines. Suppose cat d shows
1 2
3 4 5

6 7
8 9 10
11 12

Hello
Records are separated by blank lines, and the fields within each record are separated by newlines. This is also easy to write in awk.
The output of awk 'BEGIN{FS="\n";RS=""}{print NF}' d is
2
3
1
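
The same settings can also be used to flatten each multi-line record onto one line. The sketch below reuses the contents of file d; the " | " output separator and the shell variable joined are assumptions for illustration.

```shell
tmpdir=$(mktemp -d)
printf '1 2\n3 4 5\n\n6 7\n8 9 10\n11 12\n\nhello\n' > "$tmpdir/d"

# RS = "" selects paragraph mode: blank lines separate records.
# FS = "\n" makes every line of a record one field.
# Assigning $1 = $1 rebuilds the record with OFS between the fields.
joined=$(awk '
    BEGIN { RS = ""; FS = "\n"; OFS = " | " }
    { $1 = $1; print NF ": " $0 }
' "$tmpdir/d")

echo "$joined"
rm -rf "$tmpdir"
```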

ORS: output record separator. The default is a newline. It is the string printed after each print statement.
The output of awk 'BEGIN{FS="\n";RS="";ORS=";"}{print NF}' d is
2;3;1;
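
Since ORS is written after every print, the last record is also followed by the separator. If that trailing separator is unwanted, one possible workaround (a sketch, not from the original article) is to print the separator before every record except the first:

```shell
# With ORS=";" every print ends with ";", including the last one.
with_ors=$(printf '2\n3\n1\n' | awk 'BEGIN { ORS = ";" } { print }')

# Printing the separator before every record except the first avoids
# the trailing ";", and END adds the final newline.
joined=$(printf '2\n3\n1\n' | awk '
    { printf "%s%s", (NR > 1 ? ";" : ""), $0 }
    END { print "" }
')

echo "$with_ors"
echo "$joined"
```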

(3) Reading shell variables in awk
Use the -v option to pass a shell variable to awk.
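$ b=1
$ cat f
apple

$ awk -v var=$b '{print var, $var}' f
1 apple
Is there a way to pass awk variables back to the shell? As I understand it, when the shell calls awk it forks a child process, and a child process cannot set variables in its parent. The output has to be captured instead, through redirection (including pipes) or command substitution:
$ a=$(awk '{print $b, '$b'}' f)
$ echo $a
apple 1
(Here '$b' sits outside the single quotes, so the shell replaces it with 1. The first $b is inside the awk program, where b is an unset awk variable, so $b is the same as $0.)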
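
Both directions can be sketched in one small script. The ENVIRON array is standard awk; the variable names LIMIT, label, and stats and the sample numbers are hypothetical, and splitting the captured text with set -- is only one of several ways to get several values back into the shell.

```shell
tmpdir=$(mktemp -d)
printf '3\n4\n5\n' > "$tmpdir/nums"

# Shell to awk: -v assigns before the program starts; ENVIRON reads
# exported environment variables without any quoting tricks.
LIMIT=4
export LIMIT
above=$(awk -v label="big" '$1 >= ENVIRON["LIMIT"] + 0 { print label, $1 }' "$tmpdir/nums")

# awk to shell: print the values and let the shell split the captured text.
stats=$(awk '{ sum += $1; n++ } END { print sum, n }' "$tmpdir/nums")
set -- $stats        # word splitting is intended here
sum=$1
count=$2

echo "$above"
echo "sum=$sum count=$count"
rm -rf "$tmpdir"
```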

(4) Output redirection

Output redirection in awk is similar to shell redirection. The name of the target file must be enclosed in double quotes.
$ awk '$4>=70{print $1,$2 > "destfile"}' filename
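
The target file name does not have to be a constant; it can be built from the data. The sketch below splits a hypothetical sales file into one output file per region. The file names, the awk variables dir and out, and the data are assumptions.

```shell
tmpdir=$(mktemp -d)
printf 'east 10\nwest 20\neast 30\n' > "$tmpdir/sales"

# Within one awk run, ">" truncates a file only when it is first opened;
# later prints to the same name are added after the earlier ones.
# Building the name in a variable first avoids precedence surprises.
awk -v dir="$tmpdir" '{
    out = dir "/" $1 ".txt"
    print $2 > out
    # If out were closed after each print, ">>" would be needed instead,
    # because ">" truncates again every time the file is reopened.
}' "$tmpdir/sales"

east=$(cat "$tmpdir/east.txt")
west=$(cat "$tmpdir/west.txt")
echo "east: $east"
echo "west: $west"
rm -rf "$tmpdir"
```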

(5) Calling shell commands from awk:

1) Using pipes
Pipes in awk work much like pipes in the shell and use the "|" symbol. Some older awk implementations allow only one pipe to be open at a time, so with them a pipe must be closed before another can be opened. The shell command must be enclosed in double quotes. A pipe stays open until it is closed explicitly or awk exits, so if you plan to read from or write to the same file or pipe again within the program, close it first. Because an open pipe persists until awk exits, statements in the END block are affected by it too (the pipe can be closed in the first line of END).
Pipes can be used in awk in two directions:
awk output | shell input
shell output | awk input

With awk output | shell input, the shell command receives and processes awk's output. awk's output is buffered in the pipe, and a command such as sort can produce its result only after it has received all of its input. The command is started only once, and its result appears when the awk program ends or when the pipe is closed (which must be done explicitly).
$ awk '/west/{count++} {printf "%s %s\t%-15s\n", $3, $4, $1 | "sort -k2"} END{close("sort -k2"); printf "The number of sales persons in the western "; printf "region is " count ".\n"}' datafile
(Explanation: /west/{count++} tests each line against "west"; on a match, count is incremented.)
printf formats the output and sends it to the pipe. All of the output is collected and sent to the sort command. The pipe must be closed with exactly the same command string that opened it ("sort -k2", the modern spelling of the obsolete "sort +1"); otherwise sort would not finish until awk exits, and the output of the statements in the END block could appear before the sorted lines. The sort command is executed only once.
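
A reduced version of this pattern can be run as follows. It is a sketch on hypothetical data (region, first name, last name) that sorts with a plain sort and prints a shorter summary line than the command above.

```shell
tmpdir=$(mktemp -d)
# Hypothetical data: region, first name, last name.
cat > "$tmpdir/datafile" <<'EOF'
western Sharon Gray
eastern Lewis Dalsass
western Ann Stephens
EOF

report=$(awk '
    /west/ { count++ }
    { print $3, $2 | "sort" }              # every record goes to one sort process
    END {
        close("sort")                      # end the pipe and wait for sort to finish
        print "western sales persons: " count
    }
' "$tmpdir/datafile")

echo "$report"
rm -rf "$tmpdir"
```

Because the pipe is closed at the start of END, the sorted names are written before the summary line.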

With shell output | awk input, awk reads the command's output through the getline function. The output of the shell command is buffered in the pipe and then passed to awk. If there are several lines of data, getline has to be called several times, once per line.
$ awk 'BEGIN{while(("ls" | getline d) > 0) print d}' f
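
The getline loop can be made a little more robust by keeping the command in a variable and closing it afterwards. In this sketch the directory contents (alpha, beta) and the awk variable names dir, cmd, and name are hypothetical.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/alpha" "$tmpdir/beta"

listing=$(awk -v dir="$tmpdir" '
BEGIN {
    cmd = "ls \"" dir "\""                 # the command string also identifies the pipe
    while ((cmd | getline name) > 0) {     # > 0: a line was read; 0: end; -1: error
        n++
        print n ": " name
    }
    close(cmd)                             # lets the same command be run again later
}')

echo "$listing"
rm -rf "$tmpdir"
```

The parentheses around cmd | getline name keep the comparison with 0 from being parsed as part of the getline expression.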

 

(6) Using awk's match function to extract a matching string:

 

File content:

<key>hostname</key>
<value>0.0.0.0</value>

The goal is to extract 0.0.0.0.

Command:

cat filename | grep -A 1 "hostname" | awk 'match($0, "<value>(.*)</value>", a){print a[1]}'

grep -A n also prints the n lines that follow each matching line. The three-argument form of match, which stores the captured groups in the array a, is a gawk extension.
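
For awk implementations without the three-argument match, the same value can be extracted with the standard two-argument match and substr, and without grep. This is a sketch of one alternative, not the article's method; the file name conf.xml, the extra port entry, and the flag variable want are hypothetical.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/conf.xml" <<'EOF'
<key>port</key>
<value>8080</value>
<key>hostname</key>
<value>0.0.0.0</value>
EOF

host=$(awk '
    /<key>hostname<\/key>/ { want = 1; next }        # the value is on the next line
    want && match($0, /<value>.*<\/value>/) {
        # RSTART and RLENGTH describe the whole match; drop the
        # 7-character <value> and the 8-character </value>.
        print substr($0, RSTART + 7, RLENGTH - 15)
        want = 0
    }
' "$tmpdir/conf.xml")

echo "$host"
rm -rf "$tmpdir"
```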
