Introduction:
grep, sed, and awk are the three mainstream text-processing tools, and each has its own strengths and weaknesses. This article covers only awk.
awk is an excellent text-processing tool and one of the most powerful data-processing engines available on Linux or in any other environment. It is a programming and data-manipulation language whose name is formed from the first letters of the surnames of its creators, Alfred Aho, Peter Weinberger, and Brian Kernighan; how much you get out of it depends mainly on how well you know it. awk provides pattern matching, flow control, mathematical operators, process control statements, and built-in variables and functions, which gives it nearly all the features of a complete language. In fact, awk does have its own language, the awk programming language, which its three creators formally defined as a "pattern scanning and processing language". It lets you write short programs that read input files, sort data, process data, perform calculations on the input, and generate reports, among many other tasks.
I. Basic syntax
awk [options] 'script' file ...
awk [options] 'pattern {action}' file ...
Output format: print item1, item2, ...
(1) Items are separated by commas; in the output they are separated by a space (the default output field separator). Items written side by side without a comma are printed with nothing between them.
(2) Each output item can be a string, a field of the current record, a variable, or an awk expression; the value is implicitly converted to a string and then output.
(3) If the items after print are omitted, the statement is equivalent to print $0, which outputs the whole line.
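The three print rules above can be tried with a short sketch. The sample line "alice 30 paris" is made up for illustration.

```shell
# Commas between items: the items are joined with a space by default.
with_commas=$(printf 'alice 30 paris\n' | awk '{print $1, $3}')
# No comma: the items are concatenated with nothing in between.
no_commas=$(printf 'alice 30 paris\n' | awk '{print $1 $3}')
# A bare print behaves like print $0 and outputs the whole line.
whole_line=$(printf 'alice 30 paris\n' | awk '{print}')
printf '%s\n%s\n%s\n' "$with_commas" "$no_commas" "$whole_line"
```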
========================================================== ======================================
II. awk variables
awk has built-in variables and user-defined variables.
(1) Built-in variables
Variable | Meaning
FS       | Input field separator (whitespace by default)
RS       | Input record (line) separator (newline by default)
OFS      | Output field separator (a space by default)
ORS      | Output record (line) separator (newline by default)
NF       | Number of fields in the current line
NR       | Number of lines read so far, counted across all files
FNR      | Number of lines read so far, counted separately for each file
ARGV     | Array that holds the words of the awk command line
ARGC     | Number of awk command-line arguments stored in ARGV
FILENAME | Name of the file that awk is currently processing
The following examples show how they differ:
(A) Use a different field separator
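A minimal sketch of changing the separators. The input line imitates an /etc/passwd entry and is an assumption made for illustration.

```shell
# -F: sets FS to a colon, so $1 is the user name and $7 is the shell.
fs_out=$(printf 'root:x:0:0:root:/root:/bin/bash\n' | awk -F: '{print $1, $7}')
# Setting OFS as well changes the separator placed between printed items.
ofs_out=$(printf 'root:x:0:0:root:/root:/bin/bash\n' | awk -F: -v OFS='#' '{print $1, $7}')
echo "$fs_out"    # prints: root /bin/bash
echo "$ofs_out"   # prints: root#/bin/bash
```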
(B) Count the lines of all files together (NR)
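One way to see NR counting across files, sketched with two temporary files whose names and contents are made up here:

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\nd\ne\n' > "$tmpdir/two.txt"
# NR keeps counting from file to file, so in END it holds the total line count.
total=$(awk 'END {print NR}' "$tmpdir/one.txt" "$tmpdir/two.txt")
echo "$total"     # prints: 5
rm -r "$tmpdir"
```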
(C) Count the lines of each file separately (FNR)
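Printing NR and FNR side by side makes the difference visible. This is a sketch with made-up temporary files:

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\nd\ne\n' > "$tmpdir/two.txt"
# First column: NR (never resets). Second column: FNR (restarts at 1 for each file).
per_file=$(awk '{print NR, FNR, $0}' "$tmpdir/one.txt" "$tmpdir/two.txt")
echo "$per_file"
rm -r "$tmpdir"
```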
(D) Use the ARGV array to show the words of the command line
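A small sketch of reading ARGV. The operands first.txt and second.txt are hypothetical names; a program made up only of BEGIN reads no input, so the files do not need to exist.

```shell
# ARGV[0] is the command name; its exact value depends on how awk was invoked.
awk 'BEGIN {print ARGV[0]}'
# The operands follow in order. Options and the program text are not stored in ARGV.
argv_out=$(awk 'BEGIN {print ARGV[1]; print ARGV[2]}' first.txt second.txt)
echo "$argv_out"
```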
(E) Show the number of command-line arguments (ARGC)
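ARGC can be checked the same way. In this sketch the two operand names are again hypothetical and are never opened:

```shell
# ARGC counts the command name plus the two operands, so it is 3 here.
argc_out=$(awk 'BEGIN {print ARGC}' first.txt second.txt)
echo "$argc_out"  # prints: 3
```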
(F) Display the name of the file that awk is currently processing (FILENAME)
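A sketch that prints each file name once, using FNR == 1 to detect the first line of every file. The temporary files are made up for illustration.

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\n' > "$tmpdir/two.txt"
# FILENAME holds the operand exactly as it was written on the command line.
names=$(cd "$tmpdir" && awk 'FNR == 1 {print FILENAME}' one.txt two.txt)
echo "$names"
rm -r "$tmpdir"
```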
========================================================== ======================================
(2) User-defined variables
-v var_name=value // variable names are case-sensitive
A. A variable can be defined inside the program.
B. A variable can be defined on the command line with the -v option.
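Both ways of defining a variable, plus the case-sensitivity rule, in one short sketch. The variable name greeting is chosen here only for illustration.

```shell
# Defined on the command line with -v; the value is already available in BEGIN.
from_option=$(awk -v greeting='hello awk' 'BEGIN {print greeting}')
# Defined inside the program.
from_program=$(awk 'BEGIN {greeting = "hello awk"; print greeting}')
# Greeting (capital G) is a different variable and is still empty.
other_case=$(awk -v greeting='hello awk' 'BEGIN {print "[" Greeting "]"}')
printf '%s\n%s\n%s\n' "$from_option" "$from_program" "$other_case"
```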
III. Using awk printf
Format: printf format, item1, item2, ...
A format string is required.
printf does not add a newline automatically; use \n for a line break.
The format string specifies the output format of each item.
(1) Each format specifier starts with %, followed by a single character:
Specifier | Meaning
%c        | Display the character for an ASCII code
%d, %i    | Decimal integer
%e        | Scientific notation
%f        | Floating-point number
%g        | Scientific notation or floating-point format, whichever is shorter
%s        | String
%u        | Unsigned integer
%%        | The % character itself
Convert an ASCII code to its character (%c)
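A minimal sketch of %c. The values 65 and "hello" are arbitrary examples.

```shell
# A number is treated as a character code: 65 is the letter A in ASCII.
char_out=$(awk 'BEGIN {printf "%c\n", 65}')
# A string argument yields its first character.
first_char=$(awk 'BEGIN {printf "%c\n", "hello"}')
echo "$char_out $first_char"   # prints: A h
```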
Print as a decimal integer (%d)
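A short sketch of %d with an arbitrary sample value. Note that the fractional part is dropped rather than rounded.

```shell
# %d keeps only the integer part of 42.9.
int_out=$(awk 'BEGIN {printf "%d\n", 42.9}')
echo "$int_out"   # prints: 42
```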
Convert a number to scientific notation (%e)
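A sketch of %e with an arbitrary sample number. By default six digits follow the decimal point.

```shell
sci_out=$(awk 'BEGIN {printf "%e\n", 12345.678}')
echo "$sci_out"   # prints: 1.234568e+04
```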
Print a string (%s)
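A sketch of %s with field widths. The input line is made up; %-10s left-aligns in 10 columns and %5s right-aligns in 5 columns.

```shell
str_out=$(printf 'root:x:0\n' | awk -F: '{printf "%-10s|%5s|\n", $1, $3}')
echo "$str_out"   # prints: root      |    0|
```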
Print as a floating-point number (%f)
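A sketch of %f with an arbitrary value. Plain %f prints six decimals, and a precision such as %.2f limits the number of decimals.

```shell
float_out=$(awk 'BEGIN {printf "%f %.2f\n", 3.14159, 3.14159}')
echo "$float_out" # prints: 3.141590 3.14
```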
The rest will not be demonstrated one by one.
========================================================== ==================
(2) awk output redirection
>, >>, |
Special file names:
/dev/stdin: standard input
/dev/stdout: standard output
/dev/stderr: standard error output
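The redirections listed above can be sketched as follows. The output file name and the sample lines are assumptions made for illustration.

```shell
tmpdir=$(mktemp -d)
# > empties the file when awk first opens it, then keeps writing to it.
printf 'one\ntwo\n' | awk -v out="$tmpdir/out.txt" '{print $0 > out}'
# >> appends to whatever the file already contains.
printf 'three\n' | awk -v out="$tmpdir/out.txt" '{print $0 >> out}'
file_content=$(cat "$tmpdir/out.txt")
# | sends the output to a command; here sort orders the lines.
sorted=$(printf 'b\na\n' | awk '{print $0 | "sort"}')
# Writing to /dev/stderr keeps diagnostics out of standard output.
stdout_only=$(printf 'x\n' | awk '{print "warning: " $0 > "/dev/stderr"; print $0}' 2>/dev/null)
printf '%s\n%s\n%s\n' "$file_content" "$sorted" "$stdout_only"
rm -r "$tmpdir"
```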
========================================================== ======================
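(3) awk operators
Arithmetic operators: +, -, *, /, ^, **, %; unary plus and minus: +x, -x
String operator: concatenation (there is no operator symbol; the strings are simply written next to each other)
Assignment operators: =, +=, -=, *=, /=, %=, ^=, **=, ++, --
If a regexp pattern itself begins with an equal sign, write it as /[=]/ so that it is not read as the /= operator.
Comparison operators: <, <=, >, >=, ==, !=, ~, !~
Logical operators: &&, ||, !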
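A few of these operators in action, as a sketch with arbitrary values. Comparison, match, and logical expressions evaluate to 1 (true) or 0 (false).

```shell
ops=$(awk 'BEGIN {
    print 7 % 3, 2 ^ 10          # modulo and exponentiation
    s = "awk" "-" "demo"         # concatenation of adjacent strings
    print s
    n = 5; n += 2; n++           # assignment and increment
    print n
    m = ("abc" ~ /^a/); nm = ("abc" !~ /^a/)
    print m, nm                  # regexp match and non-match
    both = (3 > 2 && 1 > 2); either = (3 > 2 || 1 > 2)
    print both, either           # logical AND and OR
}')
echo "$ops"
```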
(4) Conditional expressions
selector ? if-true-expression : if-false-expression
condition ? value when true : value when false
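A sketch of the conditional expression. The input imitates name:password:UID entries, and the threshold of 1000 for regular users is an assumption that holds on many, but not all, Linux systems.

```shell
cond_out=$(printf 'root:x:0\nalice:x:1000\n' | awk -F: '{
    kind = ($3 >= 1000) ? "regular user" : "system user"
    printf "%s: %s\n", $1, kind
}')
echo "$cond_out"
```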
IV. Function calls and patterns
function_name(argu1, argu2)
A pattern selects the lines that an action applies to. The pattern types are:
(1) Regexp: written as /pattern/; only the lines that match the pattern are processed.
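A minimal sketch of a regexp pattern. The input lines are made up and only imitate user:shell pairs.

```shell
# Only lines that end in "bash" reach the action.
regex_out=$(printf 'root:/bin/bash\ndaemon:/sbin/nologin\nalice:/bin/bash\n' | awk -F: '/bash$/ {print $1}')
echo "$regex_out"
```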
(2) Expression: an expression whose result is nonzero or a non-empty string counts as a match.
Only the lines that meet the condition are processed.
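A sketch of an expression pattern, using made-up name:UID lines:

```shell
# The action runs only when the comparison is true for the current line.
expr_out=$(printf 'root:0\ndaemon:2\nalice:1000\n' | awk -F: '$2 >= 1000 {print $1}')
echo "$expr_out"  # prints: alice
```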
(3) Ranges: written as pattern1, pattern2
Only the lines in the specified range are processed.
(4) BEGIN/END: special patterns. BEGIN runs once before any input is read, and END runs once after all input has been processed.
(5) Empty: the empty pattern matches every line of the input.
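Range, BEGIN/END, and empty patterns can be combined in one program. This is a sketch; the START and STOP markers and the counter names are invented for illustration.

```shell
report=$(printf 'a\nSTART\nb\nc\nSTOP\nd\n' | awk '
BEGIN           { print "header" }        # runs once, before the first line
/START/, /STOP/ { inside++ }              # range: a START line through the next STOP line
                { total++ }               # empty pattern: every line
END             { print inside, total }   # runs once, after the last line
')
echo "$report"
```

The range covers four lines (START, b, c, STOP) out of six.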
V. Control statements
(1) Format: if (condition) {then body} [else {else body}]
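A sketch of if/else on made-up name:UID lines:

```shell
if_out=$(printf 'root:0\nalice:1000\n' | awk -F: '{
    if ($2 == 0) {
        print $1, "is the superuser"
    } else {
        print $1, "is an ordinary user"
    }
}')
echo "$if_out"
```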
(2) Format: while (condition) {while body}
# awk '{i = 1; while (i <= NF) {if ($i >= 20000) print $i; i++}}' test.txt
(3) Format: do {do-while body} while (condition)
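A sketch of do-while that sums the integers from 1 to 100. The body always runs at least once because the condition is checked afterwards.

```shell
do_out=$(awk 'BEGIN {i = 1; do {sum += i; i++} while (i <= 100); print sum}')
echo "$do_out"    # prints: 5050
```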
(4) Format: for (variable assignment; condition; iteration process) {for body}
# awk -F: '{for (i = 1; i <= 3; i++) {if (length($i) >= 8) {print $i}}}' /etc/passwd
*** A for loop can also traverse array elements. Syntax: for (i in array) {for body}
# awk -F: '$NF !~ /^$/ {bash[$NF]++} END {for (a in bash) {printf "%-15s:%i\n", a, bash[a]}}' /etc/passwd
(5) switch (expression) {case value or /regexp/: statement1; ... default: statementN} (a gawk extension)
(6) break and continue: loop control
(7) next: stops processing the current line early and moves on to the next line.
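A sketch of next that skips comment lines and blank lines. The configuration-style input is made up.

```shell
# next abandons the current line, so the print rule never sees skipped lines.
next_out=$(printf '# comment\nvalue=1\n\nvalue=2\n' | awk '/^#/ || /^$/ {next} {print}')
echo "$next_out"
```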
VI. Arrays in awk
Associative arrays
array[index-expression]
index-expression: any string can be used as an index. If a referenced array element does not yet exist, awk automatically creates it and initializes it to an empty string. Therefore, to test whether an array contains an element, use the form: index in array.
a["first"] = "Hello awk"
print a["second"] // prints an empty line and creates a["second"]
To traverse each element in the array, use the following special structure:
for (var in array) {for body}
var iterates over the indexes of the array, not over its values.
state["LISTEN"]++ // adds 1 to the count stored under the index "LISTEN"
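Counting occurrences with an array can be sketched like this. The list of connection states is made-up input that stands in for real command output; the traversal order of for (s in state) is unspecified, so the result is piped through sort.

```shell
states=$(printf 'LISTEN\nESTABLISHED\nLISTEN\nTIME_WAIT\nLISTEN\n' |
    awk '{state[$1]++} END {for (s in state) print s, state[s]}' | sort)
echo "$states"
```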
Delete array elements: remove the index from the associative array.
delete array[index]
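A sketch of testing membership with in and removing an element with delete. The index names are chosen only for illustration.

```shell
delete_out=$(awk 'BEGIN {
    a["first"] = "Hello awk"
    # The in test does not create the element it asks about.
    if ("second" in a) { print "second exists" } else { print "second missing" }
    delete a["first"]
    if ("first" in a) { print "first exists" } else { print "first missing" }
}')
echo "$delete_out"
```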
VII. Built-in functions of awk
1. split(string, array [, fieldsep [, seps]])
Function: splits string into pieces using fieldsep as the delimiter and saves the pieces in array (the seps argument is a gawk extension).
2. length([string])
Function: returns the number of characters in string.
3. substr(string, start [, length])
Function: returns the substring of string that begins at position start and is length characters long; positions are counted from 1.
4. tolower(s)
Function: converts all letters in s to lowercase.
5. toupper(s)
Function: converts all letters in s to uppercase.
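The five functions can be tried together in a short sketch. The sample strings are arbitrary.

```shell
funcs=$(awk 'BEGIN {
    n = split("2014-08-24", parts, "-")   # returns the number of pieces; indexes start at 1
    print n, parts[1], parts[3]
    print length("Hello awk")
    print substr("Hello awk", 7, 3)
    print tolower("Hello AWK"), toupper("Hello awk")
}')
echo "$funcs"
```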
Format processor-awk