How awk Works
Step one: Execute the statements in the BEGIN{action ...} block.
Step two: Read a line from the file or from standard input (stdin) and execute the pattern{action ...} block on it. awk scans the input line by line, repeating this step from the first line to the last until the input is fully read.
Step three: On reaching the end of the input stream, execute the END{action ...} block.
The BEGIN block runs before awk starts reading lines from the input stream. It is optional and usually holds setup work such as initializing variables or printing the header of an output table.
The END block runs after awk has read every line from the input stream. It is also optional and usually holds work that depends on all lines, such as printing analysis results or a summary.
The pattern block holds the general commands and is the most important part, although it too is optional. If a pattern is given without an action, the default action is {print}, which prints each line read. awk runs this block once for every line it reads.
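The three steps above can be sketched as follows. This is a minimal illustration, not a definitive recipe; the input lines (a name and a score) are made up for the example.

```shell
# Hypothetical input: three lines of "name score".
result=$(printf 'alice 90\nbob 75\ncarol 82\n' | awk '
BEGIN { print "name score" }          # runs once, before any input is read
      { total += $2; print $1, $2 }   # runs once for every input line
END   { print "total", total }        # runs once, after the last line
')
printf '%s\n' "$result"
```

The header comes from BEGIN, the three data lines from the main block, and the final total line from END.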
awk generates reports and formats output.
It processes each field of every line it reads from a file and then displays the result.
It supports variables, conditionals, loops, and arrays.
CentOS uses gawk.
Basic syntax:
awk [options] 'program (awk language)' var=value file ...
awk [options] -f programfile (awk language file) var=value file ...
awk [options] 'BEGIN{action; ...} pattern{action; ...} END{action; ...}'
Note: The awk program is usually written inside single quotes (inside double quotes, the shell would interpret $ and other special characters first).
BEGIN{action; ...}: Pre-processing action, often used to print a table header. It can also do work that needs no input, such as printing a multiplication table or computing with variables.
pattern{action; ...}: The pattern is a filter condition or test that selects the lines the action applies to.
END{action; ...}: Runs at the end of execution, often used to print a summary.
action: Operates on the data of each line and is placed inside {}. The row and column delimiters can be set there.
Options:
-f /path/from/awk_script: Read the awk program from the given file
-F: Specify the input field separator used to split each line into fields
Example: $1, $2, ...: positional variables that refer to the first field, the second field, and so on
-v var=value: Define a custom variable or assign a built-in variable
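A short sketch of the three options, assuming a made-up colon-separated line in the style of /etc/passwd. The variable name label and the temporary program file are hypothetical and exist only for this illustration.

```shell
# Hypothetical sample line.
line='root:x:0:0:root:/root:/bin/bash'

# -F sets the input field separator, -v defines a variable before the program runs.
result=$(printf '%s\n' "$line" | awk -F: -v label="shell" '{ print label, $1, $7 }')
printf '%s\n' "$result"

# -f reads the awk program from a file instead of the command line.
prog=$(mktemp)
printf '%s\n' '{ print $1 }' > "$prog"
first=$(printf '%s\n' "$line" | awk -F: -f "$prog")
rm -f "$prog"
printf '%s\n' "$first"
```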
awk has four common separators. The row and column separators must be set explicitly when the defaults do not fit:
Input
Record (row) separator: one record corresponds to one row, and the record separator marks where each record ends
Field (column) separator: splits a record into fields (also called columns, domains, or attributes) and works on the same principle as the record separator
Output
Record separator
Field separator
awk accepts input and output redirection.
awk variables: built-in variables and custom variables
Built-in variables:
FS: Input field separator; defaults to whitespace
OFS: Output field separator; defaults to a single space
RS: Input record separator; defaults to a newline
ORS: Output record separator; defaults to a newline and can be set to another string to end each output record with it
NF: Number of fields in the current record
NR: Number of records read so far (the current record number)
FNR: Record number within the current file; the count restarts for each file
FILENAME: Name of the current input file
ARGC: Number of command-line arguments
ARGV: Array that holds the command-line arguments
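The separator and counter variables can be tried out with a sketch like the one below. The input data is made up, and the shell variable names fields and records are chosen only for this example.

```shell
# FS splits input fields on ":", OFS joins the printed items with "-".
fields=$(printf 'a:b:c\nd:e\n' | awk 'BEGIN { FS = ":"; OFS = "-" } { print NR, NF, $1, $NF }')
printf '%s\n' "$fields"

# RS splits input records on ",", ORS ends each output record with "|".
records=$(printf 'x,y,z' | awk 'BEGIN { RS = ","; ORS = "|" } { print }')
printf '%s\n' "$records"
```

In the first command, NR is the line number, NF the field count, and $NF the last field of each line.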
Custom variables:
① -v var=value: assign directly on the command line
② Define and assign the variable directly in the program
Summary:
Variables are referenced in awk by name, without a leading $.
An awk variable can take its value from a shell variable, for example by passing it with -v.
Assign a custom variable before using it.
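Both ways of creating a custom variable can be sketched together. The shell variable threshold, the awk variables min and status, and the sample data are all hypothetical.

```shell
threshold=80   # shell variable with a made-up value

result=$(printf 'alice 90\nbob 75\n' | awk -v min="$threshold" '{
    # min was passed in from the shell with -v; status is defined inside the program.
    status = ($2 >= min) ? "pass" : "fail"
    print $1, status          # variables are used by name, without $
}')
printf '%s\n' "$result"
```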
print command: prints output
Properties: ① Items are separated by commas
② An output item can be a string, a number, a field of the current record, a variable, or an awk expression
③ If the item list is omitted, print outputs the whole current record ($0)
printf command: formatted output
Properties: ① A format string must be specified
② printf does not append a newline; add \n to the format explicitly
③ The format string must contain one format specifier for each item that follows
① Format specifiers, matched one-to-one with the items:
%c: Displays the character for an ASCII code
%d, %i: Displays a decimal integer
%e, %E: Displays a value in scientific notation
%f: Displays a floating-point number
%g, %G: Displays a value in scientific notation or floating-point form, whichever is shorter
%s: Displays a string
%u: Displays an unsigned integer
%%: Displays a literal %
② Modifiers:
#[.#]: The first number sets the display width; the second sets the precision after the decimal point, e.g. %6.2f
-: Left-justify (the default is right-justified), e.g. %-15s
+: Show the sign of a number, e.g. %+d
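The specifiers and modifiers above can be combined in one format string. The following is a small sketch with made-up values; the | characters are there only to make the field widths visible.

```shell
result=$(awk 'BEGIN {
    # %-8s left-justifies in 8 columns, %5d right-justifies in 5,
    # %6.2f uses width 6 with 2 decimals, %+d forces a sign,
    # %c turns the number 65 into its character, %% prints a literal %.
    printf "%-8s|%5d|%6.2f|%+d|%c|%%\n", "disk", 42, 3.14159, 7, 65
}')
printf '%s\n' "$result"
```

Without the trailing \n in the format string, printf would not end the line.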
③ Arithmetic operators:
-x: Negate x
+x: Convert x to a number
④ String operator: no symbol; placing two values next to each other concatenates them
⑤ Assignment operators:
=, +=, -=, *=, /=, %=, ^=
⑥ Comparison operators:
==, !=, >, >=, <, <=
⑦ Pattern matching operators:
~: The left side matches the regular expression on the right
!~: The left side does not match
⑧ Logical operators:
&& (and), || (or), ! (not)
⑨ Regular expressions
⑩ Conditional expression (ternary): condition ? value-if-true : value-if-false
⑾ Function calls
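Several of these operators can be seen working together in a sketch like this one. The fruit data and the variable names size and total are invented for the example.

```shell
result=$(printf 'apple 3\nbanana 12\ncherry 7\n' | awk '
$1 ~ /an/ && $2 > 10 { print $1, "matches both" }    # regex match combined with a comparison
$1 !~ /a/            { print $1, "has no a" }        # negated regex match
                     { size = ($2 >= 7) ? "big" : "small"   # ternary expression
                       total += $2                          # compound assignment
                       print $1, size }
END                  { print "total", total }')
printf '%s\n' "$result"
```

Every rule is tried against every line, so a single line can produce more than one line of output.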
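Summary: A pattern is commonly used to filter matching lines before processing them.
If no pattern is specified, every line matches.
Address range: /pat1/,/pat2/ matches only the lines from a line matching pat1 through the next line matching pat2.
Relational expressions: the action runs when the expression is true.
In awk, a non-zero number or a non-empty string is true; zero or the empty string is false.
Return value: a comparison evaluates to 1 for true and 0 for false.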
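A brief sketch of a range pattern, a relational pattern, and the numeric value of a comparison. The marker words START and STOP and the shell variable names are assumptions made for this example.

```shell
input=$(printf 'one\nSTART\ntwo\nSTOP\nthree')

# Range pattern: prints from the START line through the STOP line.
range=$(printf '%s\n' "$input" | awk '/START/,/STOP/')
printf '%s\n' "$range"

# Relational pattern with no action: the default action prints the line.
tail_lines=$(printf '%s\n' "$input" | awk 'NR >= 4')
printf '%s\n' "$tail_lines"

# A comparison itself evaluates to 1 (true) or 0 (false).
truth=$(awk 'BEGIN { print (3 > 2), (2 > 3) }')
printf '%s\n' "$truth"
```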
--------------------------------------------------------------------------------------------------------------- ----------------------
Advanced awk usage: conditionals, loops, functions, and arrays in detail
if-else control statement:
Syntax: if (condition) {statements} [else {statements}]
if (condition 1) {statements} else if (condition 2) {statements} else {statements}
Usage scenario: test a condition on the whole line or on a field that awk has read
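A minimal sketch of an if / else if / else chain applied to the first field of each line. The score values and the labels are made up.

```shell
result=$(printf '95\n70\n40\n' | awk '{
    if ($1 >= 90)      print $1, "high"
    else if ($1 >= 60) print $1, "medium"
    else               print $1, "low"
}')
printf '%s\n' "$result"
```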
while loop: Enters the loop while the condition is true and exits when the condition becomes false
Syntax: while (condition) {statements; ...}
Usage scenarios: processing the fields of a line one at a time,
or processing the elements of an array one by one
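The field-by-field scenario can be sketched with a while loop that sums the fields of each line. The numbers are sample data and the variable names i and sum are arbitrary.

```shell
result=$(printf '1 2 3\n10 20\n' | awk '{
    i = 1; sum = 0
    while (i <= NF) {       # NF is the number of fields on the current line
        sum += $i           # $i is the i-th field
        i++
    }
    print "line", NR, "sum", sum
}')
printf '%s\n' "$result"
```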
do-while loop: Executes the loop body first and then tests the condition, so the body runs at least once
Syntax: do {statements; ...} while (condition)
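A small sketch showing that the do-while body runs once even when the condition is false from the start. The starting value 5 is arbitrary.

```shell
result=$(awk 'BEGIN {
    i = 5
    do {
        print "body ran with i =", i
        i++
    } while (i < 3)         # already false on the first test
}')
printf '%s\n' "$result"
```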
for loop:
Common usage:
Syntax: for (initialization; condition; update) {statements}
Special usage: traversing the elements of an array
Syntax: for (var in array) {for-body}
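Both forms of the for loop can be sketched briefly. The array name seen and the sample data are hypothetical. Because awk does not guarantee the order of a for-in traversal, the second command only counts the elements.

```shell
# Counting form: print the fields of a line in reverse order.
reversed=$(printf 'a b c\n' | awk '{
    for (i = NF; i >= 1; i--) printf "%s%s", $i, (i > 1 ? " " : "\n")
}')
printf '%s\n' "$reversed"

# for-in form: visit every index of an array.
count=$(awk 'BEGIN {
    seen["x"]; seen["y"]; seen["z"]
    for (k in seen) n++
    print n
}')
printf '%s\n' "$count"
```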
switch statement (a gawk extension):
Syntax: switch (expression) {case VALUE1 or /REGEXP/: statement; ...; default: statement}
The statements following a case run if the expression equals its VALUE1 or matches its regexp.
Loop control statements:
break: exit the enclosing loop
continue: skip the rest of the current iteration and start the next one
next: stop processing the current line early and move on to the next line (awk itself loops over the input lines)
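The three control statements can be compared in one sketch. The input is invented: a comment line followed by two lines of numbers.

```shell
result=$(printf '# comment\n1 2 -1 9\n3 0 4\n' | awk '
/^#/ { next }                       # skip comment lines entirely
{
    sum = 0
    for (i = 1; i <= NF; i++) {
        if ($i == 0) continue       # ignore zeros and go to the next field
        if ($i < 0)  break          # stop at the first negative field
        sum += $i
    }
    print sum
}')
printf '%s\n' "$result"
```

The first data line stops at -1, so only 1 and 2 are added. The second skips the 0 and adds 3 and 4.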
Arrays in detail:
Associative array: array[index]
The index can be any string; enclose a literal string in double quotation marks.
If an array element does not exist, awk creates it automatically when it is first referenced and initializes it to the empty string.
To test whether an element exists, use "index in array", which does not create the element. To visit every element, use for (var in array).
Arrays filled by split() are indexed starting from 1.
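A short sketch of these array behaviors, assuming made-up input of HTTP method names. The array names count and parts are chosen only for this example.

```shell
result=$(printf 'GET\nPOST\nGET\nGET\n' | awk '
{ count[$1]++ }                                # the element is created on first reference
END {
    print "GET", count["GET"]
    print "POST", count["POST"]
    print "PUT present:", ("PUT" in count)     # membership test, does not create the element
    n = split("a,b,c", parts, ",")
    print n, parts[1]                          # split() numbers its elements from 1
}')
printf '%s\n' "$result"
```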
Functions:
① Numeric processing: rand(): returns a random number between 0 and 1; srand(): sets the seed for rand()
② String processing:
length([s]): Returns the length of the string s, or of $0 if s is omitted
sub(regexp, replacement, [target]): Search and replace; replaces only the first match (target defaults to $0)
gsub(regexp, replacement, [target]): Search and replace; replaces every match (target defaults to $0)
split(s, array, [r]): Splits the string s using r as the delimiter and stores the pieces in array; the first index is 1, the second is 2, ...
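The built-in functions can be tried in a BEGIN block without any input. This is a sketch with made-up strings; the random value itself is not printed, only whether it falls in the expected range.

```shell
result=$(awk 'BEGIN {
    s = "foo bar foo"
    print length(s)                              # number of characters in s
    t = s; sub(/foo/, "X", t); print t           # only the first match is replaced
    u = s; n = gsub(/foo/, "X", u); print n, u   # gsub returns the number of replacements
    m = split("2024-01-15", d, "-"); print m, d[1], d[3]
    srand(); r = rand(); print (r >= 0 && r < 1) # rand() yields a value from 0 up to, but not including, 1
}')
printf '%s\n' "$result"
```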
③ Custom functions:
Syntax: a function is defined and then called with its arguments, in the same way as in C
function name(parameter list, e.g. name, pwd) {
function body
return value
}
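A minimal sketch of defining and calling a custom function. The function name max is a hypothetical helper, not an awk built-in, and the input numbers are sample data.

```shell
result=$(printf '3 9\n12 5\n' | awk '
function max(a, b) {            # hypothetical helper defined for this example
    return (a > b) ? a : b
}
{ print max($1 + 0, $2 + 0) }   # adding 0 forces a numeric comparison
')
printf '%s\n' "$result"
```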
Calling shell commands in awk
system(command)
A space (juxtaposition) is the string concatenation operator in awk. To use an awk variable in a system() command, put the fixed parts of the command in double quotes and leave the awk variables outside the quotes, separated by spaces.
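The quoting rule can be sketched as follows, assuming simple input words that are safe to hand to the shell. The command string is built by concatenating a quoted literal with an unquoted awk field.

```shell
result=$(printf 'alpha\nbeta\n' | awk '{
    cmd = "echo item-" $1       # literal text inside quotes, awk field outside
    system(cmd)                 # the shell runs: echo item-alpha, then echo item-beta
}')
printf '%s\n' "$result"
```

With untrusted input, the field would need to be validated or escaped before it reaches system().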
Passing parameters to an awk script
-v var=value passes a parameter to the awk script
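The sketch below contrasts -v with a var=value assignment placed among the file operands. The variable names a and b and the temporary file are hypothetical. A value given with -v is already set in BEGIN, while an operand assignment takes effect only when awk reaches it in the argument list.

```shell
tmp=$(mktemp)
printf 'x\n' > "$tmp"

result=$(awk -v a=1 '
BEGIN { print "begin:" a "," b }    # b has not been assigned yet
      { print "main:" a "," b }     # b was assigned before the file was read
' b=2 "$tmp")

rm -f "$tmp"
printf '%s\n' "$result"
```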
Using awk in shell programming