Format output text using awk

Source: Internet
Author: User

Note: This article is not an introduction to awk but a collection of examples. awk borrows much of its syntax from C, so many parts of awk retain traces of C, such as the printf statement and the for and if control structures.

Introduction

In the simplest terms, awk is a programming language tool for processing text: it executes a series of instructions whenever a pattern matches the input data. The awk command format is:

awk 'pattern {action}' filenames
awk can read the files named on the command line, or it can read standard input from a preceding command. It scans the input one line at a time, looking for lines that match the pattern given on the command line. If a line matches, the associated action is executed. If the pattern does not match, or once the action has finished, awk moves on to the next line, until the end of the input.
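As a minimal sketch of the pattern and action pairing described above, the following reads standard input from a preceding command and runs the action only on matching lines. The sample text and the variable names `input` and `matched` are made up for illustration.

```shell
# Hypothetical sample input; any text would do.
input='alpha one
beta two
alpha three'

# The action {print $2} runs only on lines that match the pattern /alpha/.
matched=$(printf '%s\n' "$input" | awk '/alpha/ {print $2}')
echo "$matched"
```

Lines that do not match the pattern are read and silently skipped.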

Compared with sed, which usually works on a whole line at a time, awk tends to split each line into several fields and process those. awk treats the input data as a text database which, like a database, has the concepts of records and fields. By default the record separator is a newline and the field separator is whitespace (spaces or \t), so each line of the input is one record, and the contents of each line are split into multiple fields by whitespace. With fields and records, awk offers a very flexible way to work with files.
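The record and field model can be illustrated with a short sketch, assuming the default separators. The input text and the variable name `fields` are hypothetical.

```shell
# Two records (lines); fields are separated by runs of spaces or tabs.
fields=$(printf 'a  b\tc\nd e\n' | awk '{print NF ": " $1 " ... " $NF}')
echo "$fields"
```

For each record this prints the number of fields, the first field, and the last field, which shows that consecutive spaces and tabs count as a single separator.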

1 Syntax

A typical awk syntax is as follows:

awk 'BEGIN{stat1}
pattern1{action1} pattern2{action2} ... patternN{actionN} {default action, unconditional, always executed} END{stat1} END{stat2}'
BEGIN holds the operations performed before any text is read and is typically used to set FS, OFS, RS, ORS, and so on. When the BEGIN section completes, awk reads the first line of input, fills variables such as $0, $1, $2, NR, and NF from it, and enters the main processing stage. After all lines have been processed, awk runs the END section, which is generally used for summaries, printing reports, and so on.

The main processing stage is a built-in loop that reads one line of input per iteration. Each line can be handled by multiple pattern and action pairs: if the line satisfies pattern1, action1 is executed; if it satisfies pattern2, action2 is executed; and so on. There can also be a default action, that is, a {} block with no pattern, which is always executed without any pattern test. The BEGIN and END sections are optional and may each appear any number of times. The pattern part can be written as:
    • /reg/: reg is searched for anywhere in the line; on a match, the action that follows is executed
    • !/reg/: the action that follows is executed only if no part of the line matches reg
    • $1 ~ /reg/: reg is matched against the first field only
    • $1 !~ /reg/: the first field does not match reg
    • NR>=2: processing starts from the second line
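The pattern forms listed above can be tried side by side. This is a small sketch on made-up data; the variable names are hypothetical, and each command prints either the line numbers or the first fields of the lines its pattern selects.

```shell
# Hypothetical sample data with a header line.
data='name score
tom 90
amy 75
bob tom'

whole_line=$(printf '%s\n' "$data" | awk '/tom/ {print NR}')        # reg anywhere in the line
not_match=$(printf '%s\n' "$data" | awk '!/tom/ {print NR}')        # reg nowhere in the line
first_field=$(printf '%s\n' "$data" | awk '$1 ~ /tom/ {print NR}')  # reg in the first field only
skip_header=$(printf '%s\n' "$data" | awk 'NR>=2 {print $1}')       # from the second line on

printf '%s\n' "$whole_line" "$not_match" "$first_field" "$skip_header"
```

Line 4 ("bob tom") is selected by /tom/ but not by $1 ~ /tom/, because the match is in the second field.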

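The pattern part, as well as the if and for constructs described later, can use the comparison operators (==, !=, >, <, >=, <=), the match operators (~, !~), and the logical operators (&&, ||).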
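The BEGIN, per-line, and END stages described in this section can be sketched as follows. The colon-separated input, the header text, and the variable name `report` are assumptions made for the example.

```shell
report=$(printf 'tom:90\namy:75\n' | awk '
BEGIN { FS = ":"; print "name score" }   # runs once, before any input is read
      { total += $2; print $1, $2 }      # no pattern: runs for every line
END   { print "total", total }           # runs once, after the last line
')
echo "$report"
```

FS is set in BEGIN so that it is already in effect when the first line is split into fields.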

2 Built-in variables
    • $0: the current record (this variable holds the contents of the entire line)
    • $1~$n: the nth field of the current record; fields are separated by FS
    • FS: the input field separator; the default is a space or \t
    • NF: the number of fields in the current record, that is, how many columns it has
    • NR: the number of records read so far, that is, the line number, starting from 1; with multiple files this value keeps accumulating
    • FNR: the current record number; unlike NR, this is the line number within each individual file
    • RS: the input record separator; the default is a newline
    • OFS: the output field separator; the default is a space
    • ORS: the output record separator; the default is a newline
    • FILENAME: the name of the current input file
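A short sketch of how these variables behave, assuming two small throwaway files. The file names one.txt and two.txt and the variable names are hypothetical.

```shell
# Create two small input files in a temporary directory.
workdir=$(mktemp -d)
printf 'a b\nc d e\n' > "$workdir/one.txt"
printf 'f\n' > "$workdir/two.txt"

# NR keeps counting across files, FNR restarts for each file.
vars=$(cd "$workdir" && awk '{print FILENAME, NR, FNR, NF}' one.txt two.txt)
echo "$vars"

# OFS is placed between the comma-separated arguments of print.
ofs_demo=$(printf 'a b\n' | awk 'BEGIN { OFS = "-" } { print $1, $2 }')
echo "$ofs_demo"

rm -rf "$workdir"
```

The third output line of the first command shows NR at 3 while FNR is back at 1, which is the difference between the two counters.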
3 if and for statements

# At any point, a {} block may contain multiple side-by-side actions separated by ";". Below, {action1} and {action1; action2; ...} both stand for a {} body that may hold multiple actions. The two notations are equivalent; the second simply makes it visually explicit that there can be more than one action.

# for loop forms

 for (i=1; i<=NF; i++) {action1; action2; ...}   # multiple actions inside {} are separated by semicolons
 for (i=1; i<=NF; i++) if ...; else if ...; else ...   # a for followed by an if structure
 for (i=1; i<=NF; i++) printf "for add"   # simple loop that prints
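As a runnable sketch of the loop forms above, the following walks over every field of a line. The input and the variable name `looped` are made up for illustration.

```shell
# Print each field with its index; the last field ends the line.
looped=$(printf 'a b c\n' | awk '{
    for (i = 1; i <= NF; i++)
        printf "%d=%s%s", i, $i, (i < NF ? " " : "\n")
}')
echo "$looped"   # → 1=a 2=b 3=c
```

Because printf does not add a newline, the loop emits one itself after the last field.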

# if forms

 if (...) {action1} else if (...) {action2} else {action3}   # the else if part is optional
 if ($1 ~ /reg/ && $2 ~ /reg2/) {action}   # multiple conditions are combined with "&&" and "||"
 if (... || ...) {action}
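A small sketch of an if, else if, else chain and of a combined condition. The thresholds 90 and 60, the labels, and the sample inputs are assumptions chosen only for the example.

```shell
# Classify a number in the first field (hypothetical thresholds).
graded=$(printf '95\n72\n40\n' | awk '{
    if ($1 >= 90)      print $1, "high"
    else if ($1 >= 60) print $1, "mid"
    else               print $1, "low"
}')
echo "$graded"

# Both conditions must hold for the action to run.
both=$(printf 'ab cd\nab xx\n' | awk '{ if ($1 ~ /ab/ && $2 ~ /cd/) print NR }')
echo "$both"
```

Only the first input line of the second command satisfies both regular expressions, so only its line number is printed.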

# Mixing if and for

{ for (i=1; i<=NF; i++) if (...) printf "test"; else if (...) printf "test2"; else printf "test3"; print "not_for" }
# print "not_for" is a separate action that sits alongside the for loop structure; it runs only once, after the for loop
{ for (i=1; i<=NF; i++) {if (...) printf "test"; else if (...) printf "test2"; else printf "test3"; print "in_for"}; print "not_in_for" }
{ for (i=1; i<=NF; i++) {if (...) {s1; s2;} else if (...) {s3; s4;} else {s5; s6;}; print "test"} }   # no semicolon is added before else if
{ for (i=1; i<=NF; i++) printf "for_add"; if (...) ...; else if (...) ...; else ... }   # the if is not inside the for loop body
The scope of a for loop is one of:
    • The if; else if; else statement that immediately follows it
    • The multiple actions inside the {} that immediately follows it
    • The first ordinary action that immediately follows it

The scope of an if statement is one of:

    • The first action immediately after the if
    • The multiple actions inside the {} immediately after the if
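The scoping rules above can be checked with a short sketch. The marker strings "|once" and "|each" and the variable names are hypothetical.

```shell
# Without braces, only the first statement belongs to the loop.
no_braces=$(printf 'a b c\n' | awk '{ for (i = 1; i <= NF; i++) printf "%s", $i; print "|once" }')
echo "$no_braces"     # → abc|once

# With braces, both statements run on every iteration.
with_braces=$(printf 'a b c\n' | awk '{ for (i = 1; i <= NF; i++) { printf "%s", $i; print "|each" } }')
echo "$with_braces"
```

In the first command the print runs once after the loop; in the second it runs three times, once per field.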
4 awk Tips

1: The regular expressions used by awk are EREs (extended regular expressions)

2: If OFS is set in BEGIN, it affects output only: print places OFS between its comma-separated arguments, and $0 is rebuilt with OFS only after a field has been assigned

3: The difference between printf and print: printf does not automatically print a newline, and print does

4: The return value of gsub is not the string after replacement, but the number of substitutions made

5: String constants must be surrounded by "", otherwise they are treated as variables, for example $1 == "IPAddress"

6: The for loop in awk is C-style, that is, for (;;), which is different from for i in ... in the shell

7: awk can use multiple separators: enclose them in square brackets, and surround the brackets with ' ' to keep the shell from interpreting them, for example awk -F '[ :\t]', which uses space, colon, and tab as separators

8: The next statement: read the next input line and restart from the top of the awk program; typically used to skip certain special lines

9: Matching multiple conditions in awk: awk '/kobe/ && /james/' # matches lines that contain both kobe and james

10: The default value of FS is a single space, which awk treats as [ \t\n]+; the default value of OFS is a space; and the default value of RS and ORS is a newline

11: There are two ways to locate a line: 1) NR == line number; 2) with a regular expression, such as /love$/

12: The exit statement: terminates the awk program, but does not skip the END section

13: $1 ... $n denote the first through nth columns (fields), and the whole line is denoted by $0

14: The comparison operators available in awk: ==, !=, >, <, >=, <=

15: awk '$6 ~ /fin/ {print $6}': ~ introduces the pattern, and the regular expression to match is written between slashes, as /reg/

16: String matching: ~ means match, !~ means no match

17: &&: multiple conditions joined by and; ||: multiple conditions joined by or

18: if (...) {s1; s2; s3; ...} else if (...) ...; else ...: multiple statements inside {} are separated by semicolons
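A few of the tips above (4, 7, 8, and 12) can be verified with a short sketch. All sample inputs and variable names here are made up for illustration.

```shell
# Tip 4: gsub returns the number of substitutions; $0 is changed in place.
gsub_demo=$(printf 'banana\n' | awk '{ n = gsub(/a/, "A"); print n, $0 }')
echo "$gsub_demo"     # → 3 bAnAnA

# Tip 7: space, colon, and tab all act as field separators.
split_demo=$(printf 'a:b c\td\n' | awk -F '[ :\t]' '{ print NF, $4 }')
echo "$split_demo"    # → 4 d

# Tip 8: next skips the remaining rules for the current line.
next_demo=$(printf '#c\nx\ny\n' | awk '/^#/ { next } { print }')
echo "$next_demo"

# Tip 12: exit stops reading input, but the END section still runs.
exit_demo=$(printf '1\n2\n3\n' | awk 'NR == 2 { exit } { print } END { print "end" }')
echo "$exit_demo"
```

In the last command only the first line is printed before exit is reached, yet "end" still appears because END is not skipped.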

awk examples
awk '/AL/ {printf $1; print $2}' emp.txt
awk '/AL/ {print $1} {print $2}' emp.txt

# The first form processes only the lines that match AL, and prints the first and second fields of those lines

# The second form prints field one only on lines that match AL, but field two is unconditional and is always printed
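To see the difference between the two forms, here is a sketch that runs both against a tiny stand-in for emp.txt. The file contents are hypothetical, since the original data file is not shown.

```shell
# Build a two-line stand-in for emp.txt (made-up contents).
empdir=$(mktemp -d)
printf 'AL 10\nBob 20\n' > "$empdir/emp.txt"

# One rule: both statements run only on lines that match AL.
first=$(awk '/AL/ {printf $1; print $2}' "$empdir/emp.txt")
echo "$first"

# Two rules: the second {} has no pattern, so it runs on every line.
second=$(awk '/AL/ {print $1} {print $2}' "$empdir/emp.txt")
echo "$second"

rm -rf "$empdir"
```

Note that printf $1 uses the field itself as the format string, so a field containing % would be interpreted as a format specifier; printf "%s", $1 avoids that.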
