Awk Introduction

Source: Internet
Author: User

0. awk exists in three main variants: awk, nawk, and gawk. Unless stated otherwise, awk here generally means gawk.

1. The most basic function of the awk language is to extract information from files or strings according to specified rules, or to output data according to specified rules. A complete awk script is typically used to format the information in a text file.

2. Three ways to call awk

1) awk [option] 'awk_script' input_file1 [input_file2 ...]

Common awk options include:

① -F fs: use fs as the field separator for input records. If this option is omitted, awk uses the default value of its FS variable, which splits on whitespace (spaces and tabs).

② -f filename: read awk_script from the file filename.

③ -v var=value: set a variable for awk_script.

2) Put awk_script into a script file with #!/bin/awk -f as the first line, give the script execute permission, and then call it by typing its name in the shell.

3) Put the whole awk_script into a separate script file, and then call: awk -f awk_script_file input_file(s)
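As a minimal sketch of the calling methods above, the following runs the same one-line script first inline and then from a script file. The file names people.txt and names.awk, the variable label, and the sample data are hypothetical and exist only for this illustration.

```shell
# Scratch directory with hypothetical sample data.
tmp=$(mktemp -d)
printf 'alice:30\nbob:25\n' > "$tmp/people.txt"

# Method 1: the script is given inline; -F sets the separator, -v sets a variable.
inline=$(awk -F: -v label="name" '{print label "=" $1}' "$tmp/people.txt")

# Method 3: the same script is stored in a file and loaded with -f.
printf '%s\n' '{print label "=" $1}' > "$tmp/names.awk"
from_file=$(awk -F: -v label="name" -f "$tmp/names.awk" "$tmp/people.txt")

printf '%s\n' "$inline" "$from_file"
rm -rf "$tmp"
```

Both calls should print the same two lines. Method 2 behaves the same way once the script file starts with a #! line that points at the local awk binary.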

3. How awk runs

1) Composition of awk_script:

① An awk_script consists of one or more awk_cmd. Two awk_cmd are generally separated by a newline.

② An awk_cmd consists of two parts: awk_pattern {actions}

③ An awk_script can be written across multiple lines. Make sure that the entire awk_script is enclosed in single quotes.

2) The general form of an awk command:

awk 'BEGIN {actions}
awk_pattern1 {actions}
............
awk_patternN {actions}
END {actions}
' inputfile

BEGIN {actions} and END {actions} are optional.

3) The awk running process:

① If a BEGIN block exists, awk executes its actions first.

② awk reads one line from the input file; this line is called an input record. (If the input file is omitted, awk reads from standard input.)

③ awk splits the record into fields, putting the 1st field into the variable $1, the 2nd field into $2, and so on. $0 holds the entire record. The field separator is taken from awk's FS variable, which can be set with the -F option.

④ awk compares the current input record against the awk_pattern of each awk_cmd in turn. If the record matches, the corresponding actions are executed; if not, they are skipped. This continues until every awk_cmd has been tried.

⑤ Once the record has been compared against every awk_cmd, awk reads the next line of input and repeats steps ③ and ④ until it reaches the end of the file.

⑥ After awk has read all input lines, if an END block exists, awk executes its actions.

4) input_file can be a list of more than one file; awk processes the files in the list in order.

5) The awk_pattern of an awk_cmd can be omitted. In that case the actions are executed for every input record, with no matching needed. The actions of an awk_cmd can also be omitted; the default action is then to print the current input record (print $0). The awk_pattern and the actions of one awk_cmd cannot both be omitted.

6) The BEGIN and END blocks do not have to be placed at the beginning and end of awk_script. An awk_script may contain only a BEGIN block or only an END block. If awk_script contains only BEGIN {actions}, awk does not read input_file.

7) awk reads the input data into memory and operates on that in-memory copy; it never modifies the content of the input file.

8) By default awk writes to standard output. To send the output to a file, use redirection.
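The running process described above can be traced with a small sketch. The three input lines are made-up sample data; the script combines a BEGIN block, a rule with both pattern and action, a rule with a pattern only, and an END block.

```shell
out=$(printf 'a 1\nb 2\nc 3\n' | awk '
BEGIN   { print "start" }          # runs once, before any input is read
$2 >= 2 { print "match:", $1 }     # runs for records whose 2nd field is >= 2
/^a/                               # no action, so the default print $0 applies
END     { print "records:", NR }   # runs once, after the last record
')
printf '%s\n' "$out"
```

Each record is tested against both middle rules in order, so the output interleaves the results of the two rules line by line rather than grouping them by rule.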

4. awk_pattern

The awk_pattern determines when the actions are triggered. An awk_pattern can be of the following types:

1) A regular expression used as awk_pattern: /regexp/

① Characters commonly used in awk regular expressions:

\ ^ $ . [] | () *    // common regexp metacharacters

+: matches the preceding character one or more times. awk supports it as a metacharacter; the basic regular expressions of grep and sed do not.

?: matches the preceding character zero or one time. awk supports it as a metacharacter; the basic regular expressions of grep and sed do not.

② Example (matches lines that contain an amount such as $0.99):

awk '/ *\$0\.[0-9][0-9].*/' input_file
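A short sketch of regular-expression patterns in use, assuming made-up input lines. The first command uses a pattern of the form shown in the example above; the other two show + and ? on their own.

```shell
# Only the line containing a literal "$0." followed by two digits is printed.
price=$(printf 'apple $0.99\npear $1.50\nplum 0.75\n' | awk '/ *\$0\.[0-9][0-9].*/')

# + requires at least one "a" before the "b".
plus=$(printf 'ab\nb\naab\n' | awk '/^a+b$/')

# ? makes the "u" optional, but allows it at most once.
opt=$(printf 'color\ncolour\ncolouur\n' | awk '/^colou?r$/')

printf '%s\n' "$price" "$plus" "$opt"
```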

2) A Boolean expression used as awk_pattern. When the expression is true, the corresponding actions are executed.

① Variables (such as the field variables $1 and $2) and /regexp/ can be used in the expression.

② Operators in Boolean expressions:

Relational operators: < > <= >= == !=

Matching operators: value ~ /regexp/ returns true if value matches /regexp/

value !~ /regexp/ returns true if value does not match /regexp/

Example: awk '$2 > 10 {print "OK"}' input_file

awk '$3 ~ /^d/ {print "OK"}' input_file

③ && (and) and || (or) can join two /regexp/ or Boolean expressions into a compound expression. ! (not) can be used in front of a Boolean expression or a /regexp/.

Example: awk '($1 < 10) && ($2 > 10) {print "OK"}' input_file

awk '/^d/ || /x$/ {print "OK"}' input_file

④ Other expressions can also be used as awk_pattern, such as assignment expressions.

Example: awk '(tot += $6); END {print "total points: " tot}' input_file    // the semicolon cannot be omitted

awk 'tot += $6 {print $0} END {print "total points: " tot}' input_file    // equivalent to the above
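The Boolean patterns above can be tried on a few made-up records. This is only a sketch: the three-column data and the variable names are hypothetical, and the running total is kept in an action rather than in a pattern so that nothing extra is printed.

```shell
data='5 20 dog
12 3 cat
8 15 duck'

# Both conditions must hold.
both=$(printf '%s\n' "$data" | awk '($1 < 10) && ($2 > 10) {print $3}')

# Either regular expression may match the record.
either=$(printf '%s\n' "$data" | awk '/^1/ || /k$/ {print $3}')

# !~ selects records whose 3rd field does NOT match.
negated=$(printf '%s\n' "$data" | awk '$3 !~ /^d/ {print $3}')

# Accumulate the 2nd field and report it once in END.
total=$(printf '%s\n' "$data" | awk '{tot += $2} END {print "total points:", tot}')

printf '%s\n' "$both" "$either" "$negated" "$total"
```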

Examples of awk usage:

Variable name    Meaning
ARGC             Number of command-line arguments
ARGV             Array of command-line arguments
FILENAME         Name of the current input file
FNR              Record number within the current file
FS               Input field separator; the default is a space
RS               Input record separator
NF               Number of fields in the current record
NR               Number of records read so far
OFS              Output field separator
ORS              Output record separator
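To see several of these variables change from record to record, here is a small sketch that reads two hypothetical files (one.txt and two.txt, created in a scratch directory). Note how NR keeps counting across files while FNR restarts at 1.

```shell
tmp=$(mktemp -d)
printf 'a b c\nd e\n' > "$tmp/one.txt"
printf 'f g h i\n'    > "$tmp/two.txt"

# Print file name, overall record number, per-file record number,
# field count, and last field, joined by the output separator OFS.
vars=$(cd "$tmp" && awk 'BEGIN {OFS = "|"} {print FILENAME, NR, FNR, NF, $NF}' one.txt two.txt)

printf '%s\n' "$vars"
rm -rf "$tmp"
```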

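1. awk '/101/' file displays the rows of file that match 101.

awk '/101/,/105/' file displays the rows from a row matching 101 through the next row matching 105.

awk '$1 == 5' file

awk '$1 == "CT"' file    (the string must be enclosed in double quotation marks)

awk '$1 * $2 > 100' file

awk '$2 > 5 && $2 <= 15' file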
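A brief sketch of the range and comparison patterns from this item, run on made-up two-column data. It also shows why the comparison must be written with == (a single = would assign to the field instead of testing it).

```shell
data='100 a
101 b
103 c
105 d
106 e'

# Range pattern: from the record matching 101 through the record matching 105.
range=$(printf '%s\n' "$data" | awk '/101/,/105/')

# Numeric comparison on the 1st field.
num=$(printf '%s\n' "$data" | awk '$1 == 105')

# String comparison: the literal needs double quotes.
str=$(printf '%s\n' "$data" | awk '$2 == "c"')

printf '%s\n' "$range" "$num" "$str"
```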

2. awk '{print NR, NF, $1, $NF}' file displays, for each row, the current record number, the number of fields, and the first and last fields.

awk '/101/ {print $1, $2 + 10}' file displays the first field and the second field plus 10 for each matching row of file.

awk '/101/ {print $1 $2}' file displays the first and second fields of each matching row of file, with no separator between the two fields.

3. df | awk '$4 > 1000000' takes its input through a pipe and displays the rows whose 4th field meets the condition.

4. awk -F "|" '{print $1}' file splits fields on the new separator "|".

awk 'BEGIN {FS = "[: \t|]"}
{print $1, $2, $3}' file changes the input separator by setting FS = "[: \t|]".

Sep="|"

awk -F "$Sep" '{print $1}' file uses the value of the shell variable Sep as the separator.

awk -F '[ :\t|]' '{print $1}' file uses a regular expression as the separator. Here space, :, tab, and | all act as separators.

awk -F '[][]' '{print $1}' file uses a regular expression as the separator; this one matches [ and ].
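The separator forms above can be compared side by side. The following is a sketch on made-up one-line inputs; the shell variable Sep is taken from the text, the other variable names are hypothetical.

```shell
# A single character given to -F is used literally.
pipe=$(printf 'a|b|c\n' | awk -F "|" '{print $2}')

# A bracket expression makes several characters act as separators at once.
class=$(printf 'a:b|c d\n' | awk -F '[ :\t|]' '{print $3}')

# Putting ] first inside the brackets lets [ and ] themselves be separators.
inner=$(printf 'x[y]z\n' | awk -F '[][]' '{print $2}')

# The separator can also come from a shell variable.
Sep="|"
from_var=$(printf 'a|b|c\n' | awk -F "$Sep" '{print $3}')

printf '%s\n' "$pipe" "$class" "$inner" "$from_var"
```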

5. awk -f awkfile file processes file with the rules in awkfile, applied in order.

cat awkfile

/101/ {print "\047 hello! \047"}    -- prints ' hello! ' for matching rows. \047 represents a single quote.

{print $1, $2}    -- no pattern, so the first two fields of every row are printed.

6. awk '$1 ~ /101/ {print $1}' file displays the first field of each row (record) of file whose first field matches 101.

7. awk 'BEGIN {OFS = "%"}
{print $1, $2}' file changes the output format by setting the output separator (OFS = "%").
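A small sketch of how OFS interacts with print, using a made-up one-line input. The comma in print is what inserts OFS; without it the fields are simply concatenated.

```shell
# Comma between the fields: OFS is inserted.
joined=$(printf 'a b c\n' | awk 'BEGIN {OFS = "%"} {print $1, $2}')

# No comma: plain concatenation, OFS is not used.
glued=$(printf 'a b c\n' | awk 'BEGIN {OFS = "%"} {print $1 $2}')

# Assigning to any field rebuilds $0 with OFS between all fields.
rebuilt=$(printf 'a b c\n' | awk 'BEGIN {OFS = "-"} {$1 = $1; print}')

printf '%s\n' "$joined" "$glued" "$rebuilt"
```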

8. awk 'BEGIN {max = 100; print "max = " max}
{max = ($1 > max ? $1 : max); print $1, "Now max is " max}' file finds the maximum value of the first field of file. BEGIN marks the operations performed before any row is processed.

(expression1 ? expression2 : expression3) is equivalent to:

if (expression1)
    expression2
else
    expression3

awk '{print ($1 > 4 ? "high " $1 : "low " $1)}' file
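The conditional expression can be checked on a few made-up numbers. In this sketch max starts out unset (it is not initialized to 100 as in the text), so the result is the largest value actually seen in the input.

```shell
# Keep the larger of the current field and the running maximum.
max=$(printf '3\n9\n5\n' | awk '{max = ($1 > max ? $1 : max)} END {print "max:", max}')

# Choose a label per record with the same operator.
labels=$(printf '3\n9\n' | awk '{print ($1 > 4 ? "high " $1 : "low " $1)}')

printf '%s\n' "$max" "$labels"
```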

9. awk '$1 * $2 > 100 {print $1}' file displays the first field of each row (record) of file where the product of the first and second fields is greater than 100.

10. awk '$1 == "Chi" {$3 = "China"; print}' file finds the matching rows, replaces the 3rd field, and then displays the row (record).

awk '{$7 %= 3; print $7}' file divides the 7th field by 3, assigns the remainder to the 7th field, and then prints it.

11. awk '/Tom/ {wage = $2 + $3; print wage}' file finds the matching rows, assigns a value to the variable wage, and prints it.

12. awk '/Tom/ {count++}
END {print "Tom was found " count " times"}' file    END marks the processing done after all input rows have been processed.
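Items 10 to 12 can be exercised with a few made-up records. This is a sketch only: the data is hypothetical, and the remainder example uses the 2nd field instead of the 7th to keep the input short.

```shell
# Replace the 3rd field of matching records, then print the rebuilt record.
replaced=$(printf 'Chi x y\nUsa x y\n' | awk '$1 == "Chi" {$3 = "China"; print}')

# Store the remainder of a division back into the field.
remainder=$(printf '10 7\n' | awk '{$2 %= 3; print}')

# Compute a value in the action and print it.
wage=$(printf 'Tom 100 25\nAnn 90 10\n' | awk '/Tom/ {wage = $2 + $3; print wage}')

# Count matches and report once in END.
count=$(printf 'Tom 1\nAnn 2\nTom 3\n' | awk '/Tom/ {count++} END {print "Tom was found " count " times"}')

printf '%s\n' "$replaced" "$remainder" "$wage" "$count"
```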

13. awk '{gsub(/\$/, ""); gsub(/,/, ""); cost += $4}
END {print "The total is $" cost > "filename"}' file    The gsub function replaces $ and , with an empty string; the total is then written to the file filename.

1 2 3 $1,200.00

1 2 3 $2,300.00

1 2 3 $4,000.00
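Applied to the three sample rows above, the gsub-and-sum step can be sketched as follows. For the sake of the illustration the result is written to standard output with printf and two decimals rather than redirected to a file.

```shell
total=$(printf '1 2 3 $1,200.00\n1 2 3 $2,300.00\n1 2 3 $4,000.00\n' |
    awk '{
        gsub(/\$/, "")    # strip the dollar sign from the record
        gsub(/,/, "")     # strip the thousands separator
        cost += $4        # fields are re-split after gsub changes $0
    }
    END {printf "The total is $%.2f\n", cost}')

printf '%s\n' "$total"
```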

awk '{gsub(/\$/, ""); gsub(/,/, "");
if ($4 > 1000 && $4 < 2000) c1 += $4;
else if ($4 > 2000 && $4 < 3000) c2 += $4;
else if ($4 > 3000 && $4 < 4000) c3 += $4;
else c4 += $4}
END {printf "c1=[%d];c2=[%d];c3=[%d];c4=[%d]\n", c1, c2, c3, c4}' file

Use if and else if to build conditional statements.

awk '{gsub(/\$/, ""); gsub(/,/, "");
if ($4 > 3000 && $4 < 4000) exit;
else c4 += $4}
END {printf "c1=[%d];c2=[%d];c3=[%d];c4=[%d]\n", c1, c2, c3, c4}' file

exit stops processing when the condition is met, but the END block is still executed.

awk '{gsub(/\$/, ""); gsub(/,/, "");
if ($4 > 3000) next;
else c4 += $4}
END {printf "c4=[%d]\n", c4}' file

next skips the current row when the condition is met and moves on to the next row.
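The three control-flow variants can be compared on the sample rows from item 13. This is a sketch under one stated change: the exit condition is $4 > 2000 (a hypothetical threshold) so that it actually triggers on this data.

```shell
data='1 2 3 $1,200.00
1 2 3 $2,300.00
1 2 3 $4,000.00'

# if / else if: sort each amount into a bucket.
buckets=$(printf '%s\n' "$data" | awk '{gsub(/\$/, ""); gsub(/,/, "")
    if ($4 > 1000 && $4 < 2000) c1 += $4
    else if ($4 > 2000 && $4 < 3000) c2 += $4
    else if ($4 > 3000 && $4 < 4000) c3 += $4
    else c4 += $4}
    END {printf "c1=[%d];c2=[%d];c3=[%d];c4=[%d]\n", c1, c2, c3, c4}')

# next: skip rows above 3000 and keep going with the following row.
skipped=$(printf '%s\n' "$data" | awk '{gsub(/\$/, ""); gsub(/,/, "")
    if ($4 > 3000) next
    c4 += $4}
    END {printf "c4=[%d]\n", c4}')

# exit: stop reading at the first row above 2000; END still runs.
stopped=$(printf '%s\n' "$data" | awk '{gsub(/\$/, ""); gsub(/,/, "")
    if ($4 > 2000) exit
    c4 += $4}
    END {printf "c4=[%d]\n", c4}')

printf '%s\n' "$buckets" "$skipped" "$stopped"
```

The amount 4000 lands in c4 because the third range excludes its upper bound.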


14. awk '{print FILENAME, $0}' file1 file2 file3 > fileall writes the entire contents of file1, file2, and file3 to fileall, with each line prefixed by the name of the file it came from.

15. awk '$1 != previous {close(previous); previous = $1}
{print substr($0, index($0, " ") + 1) > $1}' fileall splits the merged file back into three files whose contents match the originals.
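The merge-and-split round trip of items 14 and 15 can be sketched in a scratch directory. The file contents and the split subdirectory are hypothetical; only two input files are used to keep the example short.

```shell
tmp=$(mktemp -d)
(
    cd "$tmp" || exit 1
    printf 'x 1\ny 2\n' > file1
    printf 'z 3\n'      > file2

    # Merge: prefix every line with the name of its source file.
    awk '{print FILENAME, $0}' file1 file2 > fileall

    # Split: write each line, minus the prefix, to the file named in field 1.
    mkdir split && cd split || exit 1
    awk '$1 != previous {close(previous); previous = $1}
         {print substr($0, index($0, " ") + 1) > $1}' ../fileall
)
merged=$(cat "$tmp/fileall")
restored1=$(cat "$tmp/split/file1")
restored2=$(cat "$tmp/split/file2")
rm -rf "$tmp"

printf '%s\n' "$merged"
```

Calling close() whenever the file name changes keeps the number of simultaneously open output files small, which matters when the merged file covers many source files.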

16. awk 'BEGIN {"date" | getline d; print d}' sends the output of date to getline through a pipe, assigns it to the variable d, and then prints it.

17. awk 'BEGIN {system("echo \"Input your name:\\c\""); getline d; print "\nYour name is", d, "\b!\n"}'

Uses getline to read a name from the keyboard and display it.

awk 'BEGIN {FS = ":"; while ((getline < "/etc/passwd") > 0) {if ($1 ~ "050[0-9]_") print $1}}'

Prints the user names in /etc/passwd that contain 050x_.
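Both getline forms can be tried without touching the real /etc/passwd. In this sketch the command is echo instead of date (so the output is predictable), the passwd-style file is made-up sample data, and its path is passed in through a hypothetical variable f.

```shell
tmp=$(mktemp -d)
printf 'root:x:0:0\n0501_bob:x:1001:1001\n0502_amy:x:1002:1002\n' > "$tmp/passwd"

# Pipe form: run a command and read its first output line into d.
greeting=$(awk 'BEGIN {"echo hello" | getline d; close("echo hello"); print d}')

# File form: read records from the file named in f until getline stops returning 1.
users=$(awk -v f="$tmp/passwd" 'BEGIN {
    FS = ":"
    while ((getline < f) > 0)      # 1 per record, 0 at end of file, -1 on error
        if ($1 ~ /^050[0-9]_/) print $1
    close(f)
}')

printf '%s\n' "$greeting" "$users"
rm -rf "$tmp"
```

The parentheses around getline < f make the comparison with 0 unambiguous; without them the expression can be parsed as reading from a file whose name is the result of a comparison.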

18. awk '{i = 1; while (i < NF) {print NF, $i; i++}}' file loops with the while statement. (With i < NF the last field is not visited; use i <= NF to include it.)

awk '{for (i = 1; i < NF; i++) {print NF, $i}}' file loops with the for statement.

type file | awk -F "/" '
{for (i = 1; i < NF; i++)
{if (i == NF - 1) {printf "%s", $i}
else {printf "%s/", $i}}}' shows the path of a file (everything before the final /).

Display dates with for and if:

awk 'BEGIN {
    for (j = 1; j <= 12; j++)
    {flag = 0;
        printf "\n%d month\n", j;
        for (i = 1; i <= 31; i++)
        {
            if (j == 2 && i > 28) flag = 1;
            if ((j == 4 || j == 6 || j == 9 || j == 11) && i > 30) flag = 1;
            if (flag == 0) {printf "%02d%02d ", j, i}
        }
    }
}'
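The loop forms in item 18 can be sketched on small made-up inputs. The path /usr/local/bin/awk is only an example string fed in with printf, standing in for the output of type.

```shell
# for loop over every field (note i <= NF), printing index:value pairs.
fields=$(printf 'a b c\n' | awk '{
    for (i = 1; i <= NF; i++)
        printf "%s%d:%s", (i > 1 ? " " : ""), i, $i
    print ""
}')

# while loop that advances i to the last field.
last=$(printf 'a b c\n' | awk '{i = 1; while (i < NF) i++; print $i}')

# Split on "/" and print every component except the last one.
dir=$(printf '/usr/local/bin/awk\n' | awk -F "/" '{
    for (i = 1; i < NF; i++)
        printf "%s%s", $i, (i == NF - 1 ? "\n" : "/")
}')

printf '%s\n' "$fields" "$last" "$dir"
```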

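19. To use a shell variable inside awk_script, close the single quotation marks around the script, let the shell expand the variable, and then reopen them. A variable name that stays inside the script's double quotation marks is just a string to awk.

Flag=abcd

awk '{print "'$Flag'"}' returns abcd (the surrounding double quotation marks make awk treat the expanded value as a string)

awk '{print "$Flag"}' returns $Flag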
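The quoting behavior can be checked directly. The sketch below also shows the -v option from section 2 as an alternative way to hand a shell value to awk; the awk variable name flag is hypothetical.

```shell
Flag=abcd

# Quote splicing: the shell expands $Flag between the two single-quoted parts.
spliced=$(echo x | awk '{print "'"$Flag"'"}')

# Inside the single quotes the shell does not expand anything,
# so awk sees the string "$Flag" literally.
literal=$(echo x | awk '{print "$Flag"}')

# -v copies the shell value into an awk variable before the script runs.
passed=$(echo x | awk -v flag="$Flag" '{print flag}')

printf '%s\n' "$spliced" "$literal" "$passed"
```

Passing the value with -v avoids nested quoting and keeps shell text from being interpreted as awk code, although -v does process backslash escapes in the value.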


Http://apps.hi.baidu.com/share/detail/6533091
