The awk command for Linux

Source: Internet
Author: User

awk is the third of the classic text-processing tools, after grep and sed


gawk - pattern scanning and processing language

Format:

gawk [options] 'program' file ...

Program: /pattern/ { action statements; ... }

Pattern section: determines when the action statements are triggered, that is, which lines or events they apply to;

BEGIN, END


Action statements: the specific processing applied to the data, usually placed in {}, with the whole program quoted in single quotation marks;

print, printf

awk relies on the concept of delimiters, because it uses them to split each line into the fields that the actions operate on;


Input delimiter:

When awk processes data, it splits each record into fields at a specific symbol called the input delimiter; the default input delimiter is whitespace (any run of spaces or tabs);

Output delimiter:

After awk has processed the data, the fields are written out separated by a specific symbol called the output delimiter; the default output delimiter is a single space;


Record:

A line of data terminated by a newline character is called a record; $0 usually holds the contents of the entire record;


Field:

Each fragment of a record after splitting on the delimiter is called a field. When processing data with awk, the built-in variables $1, $2, ..., $NF hold the individual fields;


The most complete working mode of awk: BEGIN{action statements}{action statements}END{action statements}

The BEGIN block is executed first, then the main block, and finally the END block. Note that the BEGIN block runs before any data is processed, so it does not touch the input; it is generally used to write a table header. Conversely, the END block runs after all data has been processed and is generally used to print totals. The BEGIN and END blocks may be omitted; the main block is normally present, although a program consisting of only a BEGIN block is also valid;
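The three-block working mode can be sketched as follows. This is only an illustration: the input is generated inline with printf, and the names and numbers in it are made up.

```shell
# BEGIN prints a header, the main block runs once per record,
# and END prints the record count after the last line.
printf 'alice 30\nbob 25\n' |
awk 'BEGIN { print "NAME AGE" }
     { print $1, $2 }
     END { print "total:", NR }'
```

The header appears before any input line is read, and "total: 2" appears only after both records have been processed.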


Common options:

-f program-file: loads the program from the specified file instead of taking the program text from the command line;

-F fs: specifies the input field separator; the default is whitespace;

-v var=val: declares a custom variable and assigns it a value;
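A minimal sketch of -F and -v used together. The colon-separated sample records imitate the layout of /etc/passwd but are made up, and the variable name min is only an illustrative choice.

```shell
# -F: sets the input field separator to ":".
# -v min=1000 defines the awk variable min before the program starts.
printf 'root:x:0:0\nalice:x:1000:1000\n' |
awk -F: -v min=1000 '$3 >= min { print $1 }'
```

Only the record whose third field is at least 1000 is printed, so the output is the single line "alice".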


Common Uses of awk:

1. Variables:

Built-in variables:

FS: input field separator; defaults to whitespace;

OFS: output field separator; defaults to a single space;

RS: input record separator; defaults to a newline;

Note: if a different input record separator is specified, a newline no longer ends a record, but with the default FS it still separates fields;

ORS: output record separator; defaults to a newline;

NF: number of fields in the current record;

NR: total number of input records read so far; if only one file is processed, NR can serve as the line number of each line of that file;

FNR: the record number within the current input file; it counts lines separately for each file, so it can show per-file line numbers;

FILENAME: the name of the file currently being processed;

ARGC: the number of command-line arguments, counting the awk command itself but not the options or the program text;

ARGV: array of the command-line arguments;

Custom variables:

-v var_name=value (variable names are case sensitive)
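The built-in variables can be observed with a small sketch like the one below. The sample data and the temporary file names (one.txt, two.txt) are assumptions made for the illustration.

```shell
# NR is the record number, NF the field count, $NF the last field;
# OFS replaces the default single space between printed items.
printf 'a b c\nd e\n' |
awk 'BEGIN { OFS = "-" } { print NR, NF, $NF }'

# With two input files, FNR restarts at 1 for each file while NR keeps counting.
dir=$(mktemp -d)
printf 'x\ny\n' > "$dir/one.txt"
printf 'z\n' > "$dir/two.txt"
awk '{ print FILENAME, FNR, NR }' "$dir/one.txt" "$dir/two.txt"
rm -r "$dir"
```

The first command prints "1-3-c" and "2-2-e"; the second shows FNR going 1, 2, 1 while NR goes 1, 2, 3.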


Common actions:

2. print: prints the current record, writing the result in a standard format;

Format:

print item1, item2, ...

Notes:

1) Items are separated with ",";

2) Each item can be a string, a number, a field of the current record, a variable, or an awk expression;

3) If the items are omitted, the default item is $0, that is, the whole line is printed
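A short sketch of the three notes above, using a made-up one-line input:

```shell
# Items separated by "," are joined with the output delimiter (a space by default);
# an item may be a string, a field, or an expression such as $2 + 1.
printf 'alice 30\n' | awk '{ print "name:", $1, "next year:", $2 + 1 }'

# With no items, print writes the whole record ($0).
printf 'alice 30\n' | awk '{ print }'
```

The first command prints "name: alice next year: 31" and the second prints the line unchanged.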

3. printf: format and print; writes the result in a specific format;

Format:

printf "FORMAT", item1, item2, ...

Notes:

1) An output format must be given;

2) printf does not add a line break by default; to end the line, the newline control sequence \n must be written explicitly;

3) FORMAT must contain one format specifier for each subsequent item.

Commonly used format specifiers:

%c: displays a single character (a numeric item is treated as a character code, as in the ASCII table);

%d, %i: displays a decimal integer;

%e, %E: displays a floating-point number in scientific notation;

%f, %F: displays a floating-point number in decimal notation;

%g, %G: displays a floating-point number in scientific or decimal notation, whichever is shorter;

%u: displays an unsigned decimal number;

%s: displays a string;

%x, %X: displays an unsigned hexadecimal number;

%%: displays a literal %;

Modifiers:

#[.#]: the first number controls the display width; the second number gives the precision after the decimal point;

such as: %5s, %8.3f

-: left-aligns the output; the default is right-aligned;

+: displays the sign of the number, positive or negative;
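The modifiers can be combined in one format string, as in this sketch; the input values are made up for the illustration.

```shell
# %-8s  : string, left-aligned in a column 8 characters wide
# %8.3f : floating-point number, width 8, 3 digits after the decimal point
# %+d   : integer with an explicit sign
# \n    : printf adds no line break on its own
printf 'alice 3.14159 7\n' |
awk '{ printf "%-8s|%8.3f|%+d\n", $1, $2, $3 }'
```

The "|" characters are there only to make the column widths visible in the output.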

4. Operators:

Arithmetic operators:

Binary operators:

x+y, x-y, x*y, x/y, x^y, x%y

Unary operators:

-x: negates the value;

+x: converts a string to a numeric value;

String operators:

There is no operator symbol; writing two strings next to each other concatenates them;

Assignment operators:

=, +=, -=, *=, /=, ^=, %=

++, --

Comparison operators:

==, !=, >, >=, <, <=

Pattern matching operators:

~: true if the string to the left of the operator is matched by the pattern on the right;

!~: true if the string to the left of the operator is not matched by the pattern on the right;

Logical operators:

&&, ||, !

Conditional expression:

selector ? if-true-expression : if-false-expression
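Several of the operators above appear together in the following sketch. The input and the variable names label and tag are assumptions made for the example.

```shell
printf 'alice 17\nbob 42\n' |
awk '{
    label = ($2 >= 18) ? "adult" : "minor"   # conditional expression
    tag = $1 "-" label                       # concatenation has no operator symbol
    if ($1 ~ /^a/) tag = tag "!"             # pattern match on the first field
    print tag, $2 % 5, 2 ^ 3                 # remainder and exponentiation
}'
```

This prints "alice-minor! 2 8" and "bob-adult 2 8".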

5. Pattern part:

1) Empty: the empty pattern; every line of the input is processed;

2) [!]/regexp/: processes only the lines that match [do not match] the pattern;

3) Relational expression:

$3>=1000

$NF ~ /bash/

4) Line ranges:

Logical combination of relational expressions: FNR>=10 && FNR<=20

/regexp1/,/regexp2/:

From the line matched by regexp1 through the line matched by regexp2, inclusive; every group of lines in the input that forms such a range is processed;

5) BEGIN/END patterns:

BEGIN{}

a block that executes exactly once, before the first line of input is read; typically used to print header information in a particular format;

END{}

a block that executes exactly once, after all input has been processed but before awk exits; typically used to print summary information;



Notes:

1) The usual order of the BEGIN block, the pattern block, and the END block is:

BEGIN{}pattern{}END{}

2) The BEGIN and END blocks are optional; a program normally contains at least one pattern block;
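The pattern types listed in this section can be tried out with small inline inputs such as these; all of the sample lines are made up.

```shell
# Range pattern: from the line matching /start/ through the line matching /stop/.
printf 'one\nstart\ntwo\nstop\nthree\n' | awk '/start/,/stop/ { print }'

# Relational expression on a field.
printf 'a 5\nb 1500\n' | awk '$2 >= 1000 { print $1 }'

# Negated regular expression: only lines that do not start with "#".
printf 'keep\n# skip\n' | awk '!/^#/ { print }'

# Line range selected by record number.
printf 'x\ny\nz\n' | awk 'NR >= 2 && NR <= 3 { print }'
```

The four commands print "start", "two", "stop"; then "b"; then "keep"; then "y" and "z".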

6. Control statements:

if (condition) statement [else statement]

while (condition) statement

do statement while (condition)

for (expr1; expr2; expr3) statement

for (var in array) statement

break

continue

exit [expression]

switch (expression) { case value|regex: statement ... [default: statement] }

next


1) if ... else:

Syntax:

if (condition) statement [else statement]

Usage scenario: making a conditional test on the whole line, or on a field, that awk has read;

Analyze the space utilization of individual file systems on disk:
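One possible sketch of such an analysis is shown here. It does not call df; instead it uses a few made-up lines in a df-like layout (file system, usage percentage, mount point), and the 80 percent threshold is an arbitrary assumption.

```shell
printf 'Filesystem Use%% Mounted\n/dev/sda1 91%% /\n/dev/sdb1 40%% /data\n' |
awk 'NR > 1 {
    use = $2 + 0                     # "91%" converts to the number 91
    if (use >= 80) {
        print $3, "is nearly full:", use "%"
    } else {
        print $3, "is fine:", use "%"
    }
}'
```

NR > 1 skips the header line; the same program could be fed real df output after adjusting the field numbers to that layout.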

2) while loop:

Syntax:

while (condition) statement

Usage scenarios:

A. Applying the same or a similar operation to several fields within a line;

B. Traversing the elements of an array;


Characteristics of the while loop: the loop is entered while the condition is true and exits as soon as the condition becomes false;
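Scenario A can be sketched by summing all fields of each line; the numbers in the input are made up.

```shell
printf '3 5 7\n10 20\n' |
awk '{
    i = 1; sum = 0
    while (i <= NF) {       # visit every field of the current record
        sum += $i
        i++
    }
    print sum
}'
```

This prints 15 for the first line and 30 for the second.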


3) do ... while statement:

Syntax:

do statement while (condition)


Meaning: the same as the while loop, except that the statement is executed at least once;
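The "at least once" behavior shows up in this small sketch, where the condition is false from the start:

```shell
awk 'BEGIN {
    i = 5
    do {
        print "i =", i     # runs once even though i < 3 is already false
        i++
    } while (i < 3)
}'
```

The body prints "i = 5" a single time; an equivalent while loop would print nothing.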


4) for loop:

Syntax:

for (expr1; expr2; expr3) statement

expr1: variable assignment; sets the initial value of the loop variable;

expr2: loop condition; tested before each iteration;

expr3: iteration step; how the value of the loop variable is updated;

for (var in array) statement
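As an illustration of the three-expression form, this sketch prints the fields of a line in reverse order; the input is made up.

```shell
printf 'a b c\n' |
awk '{
    for (i = NF; i >= 1; i--)        # expr1: i = NF; expr2: i >= 1; expr3: i--
        printf "%s%s", $i, (i > 1 ? " " : "\n")
}'
```

The output is "c b a". The second printf item chooses between a space and a newline so that the line ends cleanly.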

5) switch ... case statement (a gawk extension)

Syntax:

switch (expression) { case value|regex: statement; case value2|regex2: statement; ... [default: statement] }


Usage scenario:

Comparing an expression against several strings or patterns;


6) break and continue statements:

break

continue


Note: these are used for loop control, typically when looping over the fields of a line;


7) next statement:

While awk is processing the data, next ends the processing of the current line early and moves straight on to the next line;
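The difference between next, continue, and break can be seen in one sketch; the sample input is made up.

```shell
printf '# comment\n1 -2 3 0 9\n' |
awk '/^#/ { next }                   # skip comment lines entirely
{
    for (i = 1; i <= NF; i++) {
        if ($i < 0) continue         # ignore negative fields, keep looping
        if ($i == 0) break           # leave the loop at the first zero
        print $i
    }
}'
```

The comment line never reaches the second rule, -2 is skipped, and the loop stops at 0, so only 1 and 3 are printed.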

7. Arrays

User-defined arrays are associative arrays: array_name[index_expression]

Notes:

1) index_expression can be any string; a literal string index must be enclosed in double quotation marks;

2) Elements are created on demand: if an array element does not yet exist when it is referenced, awk creates it automatically with the empty string as its initial value
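A common use of these properties is counting occurrences, sketched below with made-up input. The array name count is an arbitrary choice.

```shell
printf 'apple\npear\napple\n' |
awk '{ count[$1]++ }                 # the element is created on first reference
END {
    print "apple:", count["apple"]   # literal index in double quotation marks
    print "pear:", count["pear"]
    n = 0
    for (k in count) n++             # traversal order is not specified
    print "distinct:", n
}'
```

Because a new element starts out empty, count[$1]++ works without any initialization; the empty value counts as 0 in arithmetic.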

9. Functions:

Built-in functions:

Numeric functions:

rand(): returns a random number N with 0 <= N < 1;

sqrt(x): returns the square root of x;


String functions:

length([s]): returns the length of the string s, or of $0 if s is omitted;

gsub(r, s [, t]): searches the string t for matches of the pattern r and replaces every match with s; if t is omitted, $0 is used;

split(s, a [, r [, seps]]): splits the string s into the array a, using the pattern r as the delimiter (FS if r is omitted); in gawk, the optional array seps receives the separator strings that were found;


Custom functions:

function name(parameter list) { statements }
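The functions above can be exercised in a BEGIN-only program, as in this sketch. The function max and the variable names s and parts are hypothetical names chosen for the example; the gawk-only seps argument of split is not used.

```shell
awk 'function max(a, b) { return (a > b) ? a : b }   # user-defined function
BEGIN {
    s = "a-b-c"
    print length(s)                  # number of characters in s
    n = split(s, parts, "-")         # fills parts[1], parts[2], parts[3]
    print n, parts[3]
    gsub(/-/, "+", s)                # replaces every "-" in s
    print s
    print sqrt(16), max(3, 7)
}'
```

The program prints 5, then "3 c", then "a+b+c", then "4 7".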




