awk, the text formatting processor

Source: Internet
Author: User

Introduction:

grep, sed, and awk are the three mainstream text-processing tools, and each has its own strengths and weaknesses. This article covers only awk.

awk is an excellent text-processing tool and one of the most powerful data-processing engines available on Linux or in any other environment. It is a programming and data-manipulation language whose name comes from the first letters of the surnames of its creators, Alfred Aho, Peter Weinberger, and Brian Kernighan, and how much you get out of it depends on how well you know it. awk provides pattern matching, flow-control statements, mathematical operators, and built-in variables and functions, so it has almost all the features of a complete language. In fact, awk is a language in its own right: the AWK programming language, which its three creators formally defined as a "pattern scanning and processing language". It lets you write short programs that read input files, sort data, process data, perform calculations on the input, and generate reports, among many other uses.

I. Basic syntax

awk [options] 'script' file ...

awk [options] 'pattern {action}' file ...

Output format: print item1, item2, ...

(1) Items are separated by commas; in the output they are joined by the output field separator, which is a space by default. Items written next to each other without commas are concatenated with nothing between them.

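A minimal sketch of the two ways of listing print items; the sample input line is made up for illustration:

```shell
# Items separated by commas are joined with the output field separator (a space by default).
printf 'alice 30\n' | awk '{print $1, $2}'    # prints: alice 30

# Items written side by side without a comma are concatenated directly.
printf 'alice 30\n' | awk '{print $1 $2}'     # prints: alice30
```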

(2) Each output item can be a string, a field of the current record, a variable, or an awk expression; numeric values are implicitly converted to strings before being output.

(3) If the items after print are omitted, the statement is equivalent to print $0 and outputs the whole line.

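As a small illustration with made-up input, the two commands below should produce identical output, since a bare print outputs $0:

```shell
# A bare print outputs the whole current line.
printf 'one two\nthree four\n' | awk '{print}'

# Writing $0 explicitly gives the same result.
printf 'one two\nthree four\n' | awk '{print $0}'
```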

========================================================== ======================================

II. awk variables

Built-in variables and user-defined variables

(1) Built-in variables

FS          Input field separator
RS          Input record (line) separator
OFS         Output field separator
ORS         Output record (line) separator
NF          Number of fields in the current line
NR          Number of lines read so far; all files are counted together
FNR         Number of lines read so far in the current file; each file is counted separately
ARGV        Array holding the words of the awk command line itself
ARGC        Number of arguments on the awk command line
FILENAME    Name of the file that awk is currently processing

The following examples show the differences:

(A) Use a different field separator

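A minimal sketch of changing the separators. The input line imitates an /etc/passwd entry and is made up for illustration:

```shell
# -F sets the input field separator (FS) from the command line.
printf 'root:x:0:0:root:/root:/bin/bash\n' | awk -F: '{print $1, $7}'
# prints: root /bin/bash

# FS and OFS can also be assigned in a BEGIN block.
printf 'root:x:0:0:root:/root:/bin/bash\n' |
  awk 'BEGIN {FS = ":"; OFS = " -> "} {print $1, $7}'
# prints: root -> /bin/bash
```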

(B) Count the number of lines across all files

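A sketch of NR counting across files. The two temporary files and their contents are assumptions made only for this example:

```shell
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\nd\ne\n' > "$dir/two.txt"

# NR keeps increasing from the first file into the second: 1 a, 2 b, 3 c, 4 d, 5 e.
awk '{print NR, $0}' "$dir/one.txt" "$dir/two.txt"

# In END, NR holds the total number of lines read.
awk 'END {print NR}' "$dir/one.txt" "$dir/two.txt"    # prints: 5

rm -r "$dir"
```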

(C) Count the lines of each file separately

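A sketch of FNR restarting for each file. The temporary files are again made up for the example:

```shell
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\nd\ne\n' > "$dir/two.txt"

# FNR restarts at 1 when awk moves on to the next file: 1 a, 2 b, 1 c, 2 d, 3 e.
awk '{print FNR, $0}' "$dir/one.txt" "$dir/two.txt"

rm -r "$dir"
```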

(D) Use the ARGV array to show the words of the command line itself

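A sketch of reading ARGV. The file names first.txt and second.txt are hypothetical; they are never opened, because a program that contains only a BEGIN block does not read any input:

```shell
# ARGV[0] is the command name as awk reports it; the option list and the
# program text are not stored in ARGV.
awk 'BEGIN {print ARGV[0]}' first.txt second.txt

# The remaining arguments start at index 1.
awk 'BEGIN {for (i = 1; i < ARGC; i++) print i, ARGV[i]}' first.txt second.txt
# prints: 1 first.txt
#         2 second.txt
```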

(E) Show the number of command-line arguments

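A sketch of ARGC, using two hypothetical file names that are never opened because the program has only a BEGIN block:

```shell
# ARGC counts the command name plus the two file arguments.
awk 'BEGIN {print ARGC}' first.txt second.txt    # prints: 3
```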

(F) Display the name of the file that awk is currently processing

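A sketch of FILENAME, printing the name once per file. The temporary files are assumptions made for this example:

```shell
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\n' > "$dir/two.txt"

# FNR == 1 is true on the first line of each file, so each name is printed once.
(cd "$dir" && awk 'FNR == 1 {print FILENAME}' one.txt two.txt)
# prints: one.txt
#         two.txt

rm -r "$dir"
```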

========================================================== ======================================

(2) User-defined variables

-v var_name=value    // variable names are case sensitive

A. A variable can be defined inside the program.

B. A variable can be defined on the command line with the -v option.

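A minimal sketch of both ways of defining a variable; the name greeting is chosen only for illustration:

```shell
# Defined on the command line with -v.
awk -v greeting="hello awk" 'BEGIN {print greeting}'            # prints: hello awk

# Defined inside the program.
awk 'BEGIN {greeting = "hello awk"; print greeting}'            # prints: hello awk

# Names are case sensitive: Greeting is a different, still empty, variable.
awk -v greeting="hello awk" 'BEGIN {print "[" Greeting "]"}'    # prints: []
```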

III. Using printf in awk

Format: printf format, item1, item2, ...

The format must be specified.

printf does not add a newline automatically; use \n for a line break.

The format specifies the output format of each item.

(1) A format specifier starts with %, followed by one character:

%c        Display a character (a numeric item is treated as a character code)
%d, %i    Decimal integer
%e        Scientific notation
%f        Floating-point number
%g        Numeric value in scientific notation or floating-point format
%s        String
%u        Unsigned integer
%%        Display % itself

Convert an ASCII code to a character

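A minimal sketch of %c with a numeric and a string item:

```shell
# A number is treated as a character code: 65 is "A" in ASCII.
awk 'BEGIN {printf "%c\n", 65}'         # prints: A

# For a string item, %c prints its first character.
awk 'BEGIN {printf "%c\n", "hello"}'    # prints: h
```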

Print as a decimal integer

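A minimal sketch of %d; the input line is made up for illustration:

```shell
# %d prints the integer part of a number.
awk 'BEGIN {printf "%d\n", 3.99}'                         # prints: 3

# Fields can be formatted the same way.
printf 'widget 42\n' | awk '{printf "%d items\n", $2}'    # prints: 42 items
```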

Convert a number to scientific notation

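A minimal sketch of %e, which by default keeps six digits after the decimal point:

```shell
awk 'BEGIN {printf "%e\n", 12345.678}'    # prints: 1.234568e+04
```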

Print a string

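A minimal sketch of %s with field widths; the input line is made up for illustration:

```shell
# %-10s left-aligns in a 10-character column, %5s right-aligns in 5 characters.
printf 'root:0\n' | awk -F: '{printf "%-10s|%5s|\n", $1, $2}'
# prints: root      |    0|
```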

Print as a floating-point number

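A minimal sketch of %f, with and without an explicit precision:

```shell
# The default precision is six digits after the decimal point.
awk 'BEGIN {printf "%f\n", 3.14159}'      # prints: 3.141590

# A precision such as .2 limits the number of decimal places.
awk 'BEGIN {printf "%.2f\n", 3.14159}'    # prints: 3.14
```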

The rest will not be demonstrated one by one.

========================================================== ==================

(2) awk output redirection

>, >>, |

Special file names:

/dev/stdin: standard input

/dev/stdout: standard output

/dev/stderr: standard error

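A sketch of the three redirection forms. The temporary directory, the file name names.txt, and the sample data are assumptions made for this example:

```shell
dir=$(mktemp -d)

# > writes to a file; the file name is passed in through the variable out.
printf 'alice 30\nbob 25\n' | awk -v out="$dir/names.txt" '{print $1 > out}'
cat "$dir/names.txt"    # prints alice, then bob

# >> appends to the existing file instead of overwriting it.
printf 'carol 41\n' | awk -v out="$dir/names.txt" '{print $1 >> out}'

# | sends the output to another command, here sort.
printf 'b\na\n' | awk '{print | "sort"}'    # prints a, then b

# Writing to /dev/stderr sends a message to standard error.
awk 'BEGIN {print "warning: sample message" > "/dev/stderr"}'

rm -r "$dir"
```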

========================================================== ======================

(3) awk operators

Arithmetic operators: +, -, *, /, ^, **, %; unary plus and minus: +x, -x

String operator: concatenation, written by placing the values next to each other

Assignment operators: =, +=, -=, *=, /=, %=, ^=, **=, ++, --

If a pattern itself begins with an equals sign, write it as /[=]/ so that it is not read as the /= operator.

Comparison operators: <, <=, >, >=, ==, !=, ~, !~

Logical operators: &&, ||, !
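A short sketch that exercises a few of these operators; the input words are made up for illustration:

```shell
# Arithmetic: exponentiation and remainder.
awk 'BEGIN {print 2 ^ 10, 7 % 3}'           # prints: 1024 1

# Concatenation has no operator symbol; the strings are simply adjacent.
awk 'BEGIN {s = "foo" "bar"; print s}'      # prints: foobar

# ~ tests a regular expression match; && combines two conditions.
printf 'apple\nfig\nbanana\n' | awk '$0 ~ /a/ && length($0) > 5 {print}'
# prints: banana
```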

 

(4) Conditional expressions

selector ? if-true-expression : if-false-expression

condition ? content output if true : content output if false

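A sketch of a conditional expression. The input imitates passwd-style lines, and the threshold of 1000 for ordinary accounts is an assumption chosen for this example:

```shell
printf 'root:x:0\nalice:x:1000\n' |
  awk -F: '{type = ($3 >= 1000) ? "regular user" : "system user"; print $1, type}'
# prints: root system user
#         alice regular user
```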

 

IV. Function calls

function_name(arg1, arg2)

Patterns select the lines that an action is applied to. The pattern types are:

(1) Regexp: written as /pattern/; only the lines matched by /pattern/ are processed.

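A minimal sketch of a regexp pattern; the two input lines are made up for illustration:

```shell
# Only lines that end in "bash" reach the action.
printf 'root:/bin/bash\ndaemon:/sbin/nologin\n' | awk -F: '/bash$/ {print $1}'
# prints: root
```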

(2) Expression: a line meets the condition when the value of the expression is nonzero or a non-empty string.

Only the lines that meet the condition are processed.

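A minimal sketch of an expression pattern; the input lines and the value 1000 are made up for illustration:

```shell
# The action runs only for lines whose second field is at least 1000.
printf 'root:0\nalice:1000\nbob:1001\n' | awk -F: '$2 >= 1000 {print $1}'
# prints: alice
#         bob
```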

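(3) Ranges: written as pattern1, pattern2

Only the lines in the specified range are processed.

(4) BEGIN/END: special patterns. The BEGIN action is executed once before any input is read, and the END action is executed once after all input has been processed.

(5) Empty: the empty pattern, which matches every line of the input.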
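The last three pattern types can be sketched as follows; the input values are made up for illustration:

```shell
# Range pattern: from the line where NR == 2 through the line where NR == 4.
printf 'a\nb\nc\nd\ne\n' | awk 'NR == 2, NR == 4 {print}'
# prints b, c, and d, one per line

# BEGIN runs before the first line and END after the last one; the middle
# rule has an empty pattern, so it runs for every line.
printf '10\n20\n30\n' | awk 'BEGIN {print "start"} {sum += $1} END {print "total", sum}'
# prints: start
#         total 60
```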

 

V. Control statements

(1) Format: if (condition) {then-body} [else {else-body}]

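A minimal sketch of if/else; the input lines are made up for illustration:

```shell
printf 'root:0\nalice:1000\n' |
  awk -F: '{if ($2 == 0) {print $1, "is the superuser"} else {print $1, "is an ordinary user"}}'
# prints: root is the superuser
#         alice is an ordinary user
```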

(2) Format: while (condition) {while-body}

# awk '{i = 1; while (i <= NF) {if ($i >= 20000) print $i; i++}}' test.txt

 

(3) Format: do {do-while body} while (condition)

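A minimal sketch of do-while, including the case where the condition is false from the start:

```shell
# The body runs, then the condition is checked: prints 1, 2, 3.
awk 'BEGIN {i = 1; do {print i; i++} while (i <= 3)}'

# The body always runs at least once, even though 10 <= 3 is false.
awk 'BEGIN {i = 10; do {print i; i++} while (i <= 3)}'    # prints: 10
```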

(4) Format: for (variable assignment; condition; iteration) {for-body}

# awk -F: '{for (i = 1; i <= 3; i++) {if (length($i) >= 8) {print $i}}}' /etc/passwd

The for loop can also be used to traverse array elements. Syntax: for (i in array) {for-body}

# awk -F: '$NF !~ /^$/ {bash[$NF]++} END {for (a in bash) {printf "%-15s:%i\n", a, bash[a]}}' /etc/passwd

 

(5) switch (expression) {case value or /regexp/: statement1; ... default: statementN} (switch is a gawk extension)

(6) break and continue: loop control

(7) next: stop processing the current line early and move on to the next line.

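A minimal sketch of next, skipping even numbers in made-up input:

```shell
# For even values, next abandons the line, so the second rule never sees it.
printf '1\n2\n3\n4\n' | awk '$1 % 2 == 0 {next} {print}'
# prints: 1
#         3
```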

VI. Using arrays in awk

Associative arrays

array[index-expression]

index-expression: any string can be used. If a referenced array element does not yet exist, awk automatically creates it and initializes it to an empty string. Therefore, to test whether the array contains an element, you must use the form index in array.

a["first"] = "Hello awk"

print a["second"]

To traverse every element in the array, use the following special structure:

for (var in array) {for-body}

var iterates over the indexes of the array.

state["LISTEN"]++

state["LISTEN"]++

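A sketch of counting with an associative array. The input imitates connection-state output and is made up for illustration; the result is piped through sort because the traversal order of for (var in array) is not defined:

```shell
# Count how many times each state appears in the second field.
printf 'tcp LISTEN\ntcp ESTABLISHED\ntcp LISTEN\n' |
  awk '{state[$2]++} END {for (s in state) print s, state[s]}' | sort
# prints: ESTABLISHED 1
#         LISTEN 2

# "index in array" tests for an element without creating it.
awk 'BEGIN {a["first"] = "Hello awk"; if ("second" in a) {print "found"} else {print "missing"}}'
# prints: missing
```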

Delete array elements: to remove an element from an associative array, delete its index.

delete array[index]
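A minimal sketch of delete; the array name and its indexes are chosen only for illustration:

```shell
# After the delete, only the element with index "y" remains.
awk 'BEGIN {a["x"] = 1; a["y"] = 2; delete a["x"]; for (k in a) print k, a[k]}'
# prints: y 2
```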

 

VII. Built-in functions of awk

1. split(string, array [, fieldsep [, seps]])

Function: splits string using fieldsep as the delimiter and saves the pieces in array.

2. length([string])

Function: returns the number of characters in string.

3. substr(string, start [, length])

Function: returns the substring of string that begins at position start and is length characters long; start counts from 1.

4. tolower(s)

Function: converts all letters in s to lowercase letters.

5. toupper(s)

Function: converts all letters in s to uppercase letters.
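A short sketch that calls each of these functions on made-up strings:

```shell
# split returns the number of pieces; array indexes start at 1.
awk 'BEGIN {n = split("a:b:c", parts, ":"); print n, parts[1], parts[3]}'    # prints: 3 a c

# length counts the characters in the string.
awk 'BEGIN {print length("hello")}'                   # prints: 5

# substr takes 3 characters starting at position 7.
awk 'BEGIN {print substr("hello awk", 7, 3)}'         # prints: awk

# Case conversion.
awk 'BEGIN {print tolower("AWK"), toupper("awk")}'    # prints: awk AWK
```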
