Usage of the Linux awk command

Source: Internet
Author: User

Let's start with an example:
Compute the average of the values in the first column of file a that are floating-point numbers. awk can do this in a single command.
$ cat a
1.021 33
1#.ll 44
2.53 6
SS 7

awk 'BEGIN{total = 0; len = 0} {if ($1 ~ /^[0-9]+\.[0-9]*/) {total += $1; len++}} END{print total/len}' a
(Analysis: $1 ~ /^[0-9]+\.[0-9]*/ tests whether the first field matches the regular expression between the slashes; if it does, $1 is added to total and len is incremented by 1. In the regular expression ^[0-9]+\.[0-9]*, "^[0-9]+" means the field begins with one or more digits, "\." is an escaped "." that matches a literal decimal point, and "[0-9]*" matches zero or more digits.)
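
The one-liner above can be tried end to end with the following sketch. It recreates the sample file in a scratch directory; the names tmp and avg are illustrative and not part of the original, and the guard against dividing by zero when no line matches is an addition.

```shell
#!/bin/sh
# Recreate the sample file from the example in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '1.021 33' '1#.ll 44' '2.53 6' 'SS 7' > "$tmp/a"

# Average the first-column values that look like floating-point numbers.
avg=$(awk '
    $1 ~ /^[0-9]+\.[0-9]*/ { total += $1; len++ }
    END { if (len > 0) print total / len; else print "no match" }
' "$tmp/a")

echo "$avg"    # 1.7755, the mean of 1.021 and 2.53
rm -rf "$tmp"
```

Only the first and third lines match, so the result is (1.021 + 2.53) / 2.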

The general syntax of awk is:
awk [-option value] 'BEGIN{initialization} condition 1{action 1} condition 2{action 2} ... END{post-processing}' in_file
Where: the statements in BEGIN run before the input file (in_file) is read and the statements in END run after it has been read; they can be thought of as initialization and cleanup.
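
As a minimal sketch of this structure, the following program uses all three parts. The sample names and scores are made up for illustration, as is the variable name report.

```shell
#!/bin/sh
# BEGIN runs once before any input, END once after all input;
# the pattern-action pairs in between run for every record.
report=$(printf '%s\n' 'alice 90' 'bob 55' 'carol 72' | awk '
    BEGIN    { print "passed:" }           # initialization
    $2 >= 60 { print "  " $1; n++ }        # condition { action }
    END      { print "total " (n + 0) }    # post-processing
')
echo "$report"
```

Writing (n + 0) forces a numeric 0 to be printed even if no record ever matched.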

(1) Option description:
-F re: sets the field separator to the regular expression re
-v var=$v: assigns the value of the shell variable v to the awk variable var; to assign several variables, write -v several times, one -v per assignment
E.g. to print the lines of file a from line num to line num+num1:
awk -v num=$num -v num1=$num1 'NR==num,NR==num+num1{print}' a
-f progfile: makes awk read and execute the program file progfile, which must of course conform to awk syntax.
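
The -v and -f options can be exercised together as in this sketch. The file contents, the values of num and num1, and the program file name range.awk are assumptions chosen for the illustration.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' one two three four five six > "$tmp/a"

# Shell variables handed to awk with one -v per assignment.
num=2
num1=2
range=$(awk -v num="$num" -v num1="$num1" 'NR==num, NR==num+num1 {print}' "$tmp/a")
echo "$range"        # lines 2 through 4

# The same program stored in a file and run with -f.
printf '%s\n' 'NR==num, NR==num+num1 {print}' > "$tmp/range.awk"
from_file=$(awk -v num="$num" -v num1="$num1" -f "$tmp/range.awk" "$tmp/a")
echo "$from_file"

rm -rf "$tmp"
```

Both invocations print the same three lines, since only the way the program text is supplied differs.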

(2) awk built-in variables:
ARGC: number of command-line arguments
ARGV: array of command-line arguments
ARGIND: index in ARGV of the file currently being processed (a gawk extension)
E.g. given two files a and b:
awk '{if (ARGIND==1) {print "processing file a"} if (ARGIND==2) {print "processing file b"}}' a b
Files are processed in order: awk scans file a first, then file b.

NR: number of records read so far
FNR: number of records read so far in the current file
The example above can also be written like this:
awk 'NR==FNR{print "processing file a"} NR>FNR{print "processing file b"}' a b
The input files are a and b. awk scans a first, so NR==FNR holds throughout a. When it moves on to b, FNR restarts from 1 while NR keeps counting from the number of lines in a, so NR > FNR.
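
To watch the two counters diverge, a small sketch like this prints both for every record. The file contents are invented for the illustration.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' a1 a2 > "$tmp/a"
printf '%s\n' b1 b2 b3 > "$tmp/b"

# NR counts records across all files; FNR restarts at 1 for each file.
out=$(awk '
    NR == FNR { print "a: NR=" NR " FNR=" FNR; next }   # only true in the first file
              { print "b: NR=" NR " FNR=" FNR }
' "$tmp/a" "$tmp/b")
echo "$out"
rm -rf "$tmp"
```

Note that the NR==FNR test assumes the first file is not empty; with an empty first file the condition would hold for the second file instead.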

E.g. to show lines 10 through 15 of the file:
awk 'NR==10,NR==15{print}' a

FS: input field separator (default: a space, which splits on runs of spaces and tabs), equivalent to the -F option
awk -F ':' '{print}' a and awk 'BEGIN{FS=":"}{print}' a are the same

OFS: output field separator (default: a space)
awk -F ':' 'BEGIN{OFS=";"} {print $1,$2,$3}' b
If cat b shows
1:2:3
4:5:6
then setting OFS to ";" produces
1;2;3
4;5;6
(Small note: awk refers to the 1st, 2nd, 3rd ... fields of a split record as $1, $2, $3 ..., and $0 represents the entire record, usually a whole line.)
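
One detail worth seeing in action is that OFS only applies when print joins separate fields or when the record is rebuilt. The sketch below, using the same sample data, compares the three cases; the variable names are illustrative.

```shell
#!/bin/sh
# Listing fields with commas makes print join them with OFS.
joined=$(printf '1:2:3\n4:5:6\n' | awk -F ':' 'BEGIN{OFS=";"} {print $1,$2,$3}')
echo "$joined"

# A bare print outputs $0 unchanged; assigning to a field rebuilds $0 with OFS.
untouched=$(printf '1:2:3\n' | awk -F ':' 'BEGIN{OFS=";"} {print}')
rebuilt=$(printf '1:2:3\n' | awk -F ':' 'BEGIN{OFS=";"} {$1=$1; print}')
echo "$untouched"    # 1:2:3
echo "$rebuilt"      # 1;2;3
```

The assignment $1=$1 changes nothing in the field itself; it is a common way to force awk to reassemble the record with the new separator.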

NF: number of fields in the current record
The output of awk -F ':' '{print NF}' b is
3
3
which shows that each line of b is split into 3 fields by the separator ":"
You can use NF to print only the lines that have the required number of fields, which filters out malformed rows:
awk -F ':' '{if (NF == 3) print}' b
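
A possible way to apply this in practice is to keep the well-formed rows and report the rest. The input below is made up, and the wording of the report is an arbitrary choice.

```shell
#!/bin/sh
# Split well-formed rows (exactly 3 fields) from malformed ones.
input='1:2:3
4:5
6:7:8:9
a:b:c'

good=$(printf '%s\n' "$input" | awk -F ':' 'NF == 3')
bad=$(printf '%s\n' "$input" | awk -F ':' 'NF != 3 {print "line " NR ": " NF " fields"}')
echo "$good"
echo "$bad"
```

A pattern with no action, such as NF == 3, prints the matching record, so it is equivalent to the if form shown above.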

RS: input record separator, the default is \n
By default awk treats each line as a record; if RS is set, awk splits the input into records at RS instead.
For example, if cat c shows
Hello world;I want to go swimming tomorrow;hiahia
then the result of running awk 'BEGIN{RS = ";"} {print}' c is
Hello world
I want to go swimming tomorrow
hiahia
Used together, RS and FS let awk handle more input layouts, such as records that span several lines. For example, suppose cat d shows
1 2
3 4 5

6 7
8 9 10
11 12

Hello
Records are separated by blank lines and fields within a record by newlines, which awk also handles easily.
awk 'BEGIN{FS = "\n"; RS = ""} {print NF}' d outputs
2
3
1
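
The blank-line-separated layout can be reproduced with the sketch below, which also prints the first line of each record to show that a field is a whole line here. The summary format is an illustrative choice.

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Records are separated by blank lines; each line of a record is one field.
printf '1 2\n3 4 5\n\n6 7\n8 9 10\n11 12\n\nHello\n' > "$tmp/d"

summary=$(awk 'BEGIN{FS = "\n"; RS = ""} {print NR ": " NF " lines, first=" $1}' "$tmp/d")
echo "$summary"
rm -rf "$tmp"
```

Setting RS to the empty string selects awk's paragraph mode, in which one or more blank lines end a record.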

ORS: output record separator, a newline by default; it is the text written after each print statement
awk 'BEGIN{FS = "\n"; RS = ""; ORS = ";"} {print NF}' d outputs
2;3;1;
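
Because ORS is written after every print, including the last one, joining lines this way leaves a trailing separator. The sketch below shows that behaviour and one possible way around it with printf; the input is invented.

```shell
#!/bin/sh
# ORS is written after every print, including the last one.
with_ors=$(printf 'a\nb\nc\n' | awk 'BEGIN{ORS=";"} {print}')
echo "$with_ors"    # a;b;c;

# Writing the separator before every record except the first avoids the trailing one.
joined=$(printf 'a\nb\nc\n' | awk '{printf "%s%s", (NR > 1 ? ";" : ""), $0} END{print ""}')
echo "$joined"      # a;b;c
```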

(3) Reading shell variables in awk
This can be done with the -v option:
$ b=1
$ cat f
apple

$ awk -v var=$b '{print var, $var}' f
1 apple
As for passing awk variables back to the shell, my understanding is this: when the shell calls awk it forks a child process, and a child process cannot pass variables to its parent except through redirection (including pipes), for example by capturing awk's output:
$ a=$(awk '{print $b, '$b'}' f)
$ echo $a
apple 1
(The first $b is inside the single quotes, so awk sees its own unset variable b and prints $0; the second is outside them, so the shell expands it to 1.)
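
Both directions can be sketched as follows. The variable names into_awk and count are illustrative, and the second command simply returns a line count to the shell through command substitution.

```shell
#!/bin/sh
tmp=$(mktemp -d)
echo 'apple' > "$tmp/f"
b=1

# Shell -> awk: -v copies the shell value into an awk variable.
into_awk=$(awk -v var="$b" '{print var, $var}' "$tmp/f")
echo "$into_awk"    # 1 apple

# awk -> shell: the parent shell can only capture what awk prints.
count=$(awk 'END{print NR}' "$tmp/f")
echo "$count"       # 1
rm -rf "$tmp"
```

Using -v keeps the awk program fully inside single quotes, which is usually easier to read than splicing shell expansions into the program text.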

(4) Output redirect

awk's output redirection is similar to the shell's. The destination file name must be enclosed in double quotes.
$ awk '$ >=70 {print $1,$2 > "destfile"}' filename
$ awk '$ >=70 {print $1,$2 >> "destfile"}' filename
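
A runnable sketch of both operators follows. It assumes the value being tested is a score in field 3 and uses made-up data; the destination path is passed in with -v (the variable name dest is hypothetical) so that the file lands in a scratch directory.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' 'alice math 90' 'bob math 55' 'carol art 72' > "$tmp/scores"

# Assumption: the score is in field 3. ">" truncates the file when this awk run
# first opens it, then keeps writing to it for the rest of the run.
awk -v dest="$tmp/destfile" '$3 >= 70 {print $1, $2 > dest}' "$tmp/scores"
first=$(cat "$tmp/destfile")

# ">>" appends to whatever the file already contained.
awk -v dest="$tmp/destfile" '$3 < 70 {print $1, $2 >> dest}' "$tmp/scores"
second=$(cat "$tmp/destfile")

echo "$first"
echo "---"
echo "$second"
rm -rf "$tmp"
```

Unlike in the shell, ">" inside awk does not truncate the file on every print, only when the file is first opened.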

(5) Invoke the shell command in awk:

1) Using pipes
Pipes in awk are similar to shell pipes and use the "|" symbol. The shell command must be enclosed in double quotes. Once a pipe is opened it stays open until it is closed or awk exits, so statements in the END block are still affected by it. If you intend to read from or write to the same file or pipe again in your awk program, close it first with close() (for example on the first line of END). It is also good practice to close one pipe before opening another.
There are two ways to use a pipe in awk:
awk output | shell input
shell output | awk input

With awk output | shell input, the shell command receives awk's output and processes it. Note that awk's output is first buffered in the pipe; the shell command processes it only once, either when the awk program ends or when the pipe is closed (which must be done explicitly).
$ awk '/west/{count++} {printf "%s%s\t\t%-15s\n", $3,$4,$1 | "sort +1"} END{close("sort +1"); printf "The number of sales persons in the western "; printf "region is " count "."}' datafile
(Explanation: /west/{count++} tests whether the line matches "west" and, if so, increments count.)
The printf function formats the output and sends it to the pipe. All of the output is collected and then passed to the sort command together. The pipe must be closed with exactly the same command string that opened it ("sort +1"); otherwise sort does not run until awk exits, and the output of the END block may come out ahead of the sorted lines. The sort command runs only once here. (sort +1 is the historical syntax for sorting from the second field onward; current sort implementations spell it sort -k2.)
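
The same pattern can be sketched with the current sort syntax. The sales data below is hypothetical (region, first name, last name), and sort -k2 is used in place of the historical sort +1.

```shell
#!/bin/sh
# Hypothetical sales data: region, first name, last name.
data='western Sharon Gray
eastern Paul Adams
western Ann Brown'

report=$(printf '%s\n' "$data" | awk '
    /west/ { count++ }
    { printf "%s %s\t%s\n", $2, $3, $1 | "sort -k2" }
    END {
        close("sort -k2")    # sort runs to completion and prints here
        print "western sales persons: " count
    }
')
echo "$report"
```

Because the pipe is closed at the top of END, the sorted lines come first and the summary line comes last.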

With shell output | awk input, awk can only receive the input through the getline function. The output of the shell command is buffered in the pipe and passed to awk; if it contains more than one line, getline must be called several times to read it all.
$ awk 'BEGIN{while (("ls" | getline d) > 0) print d}' f
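
A slightly fuller sketch of reading command output with getline is shown below. The directory and file names are created only for the illustration, the awk variable names dir, cmd and name are hypothetical, and the scratch path is assumed to contain no spaces.

```shell
#!/bin/sh
tmp=$(mktemp -d)
touch "$tmp/one.txt" "$tmp/two.txt"

# Read the output of a shell command line by line inside awk.
listing=$(awk -v dir="$tmp" 'BEGIN {
    cmd = "ls " dir
    while ((cmd | getline name) > 0) {   # > 0: a line was read; 0: end; -1: error
        n++
        print n ": " name
    }
    close(cmd)
}')
echo "$listing"
rm -rf "$tmp"
```

Comparing the result of getline with > 0 matters: a plain while (cmd | getline name) would loop forever if the command failed and getline returned -1.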
