AWK usage Summary

Last Update:2014-11-21 Source: Internet

Author: User

1. The basic job of awk is to read its input one line at a time and test each line against the patterns given in the program. When a line matches, awk runs the corresponding action on it; the input file itself is never modified. Each line is called a record, and awk automatically splits every record into fields using the default or a specified delimiter. A field is referenced as $ followed by its number, and $0 holds the entire record. After all rules have been applied to a record, awk reads the next one.

The common format of the awk command is:

awk '/pattern/ {action}
/pattern1/ {action1}' datafile1 datafile2

An awk program can contain several pattern-action pairs (rules). Rules can be separated by newlines as long as they all sit inside the same pair of single quotation marks. awk can also process several input files in one run, reading them in the order given.

If {action} is omitted, awk prints every record that matches the pattern. To match a pattern and deliberately do nothing, leave the action empty but keep the braces {}.
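
As a minimal sketch of the points above (the sample input and both patterns are invented for illustration), the first command puts two rules in one program, and the second omits the action so that matching records are printed unchanged:

```shell
# Hypothetical sample data, fed on standard input instead of a datafile.
sample='apple 3
banana 5
cherry 7'

# Two rules in one quoted program, separated by a newline.
two_rules=$(printf '%s\n' "$sample" | awk '/an/ {print "rule 1:", $1}
/rr/ {print "rule 2:", $1}')
printf '%s\n' "$two_rules"

# A pattern with no action: each matching record is printed as-is.
default_print=$(printf '%s\n' "$sample" | awk '/e/')
printf '%s\n' "$default_print"
```

Here only banana matches /an/ and only cherry matches /rr/, while /e/ selects the apple and cherry lines.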

2. There are three ways to run an awk program.

awk 'program' datafile1 datafile2 ...

Here, program is the combination of patterns and actions described above.

awk -f program-file datafile1 datafile2 ...

This form suits longer programs. Put the /pattern/ {action} rules in program-file beforehand, without the surrounding single quotation marks. Giving the file the suffix .awk makes its purpose clear.

Write the awk program as an executable script file and run it directly.

Put #!/bin/awk -f on the first line of the script file (adjust the path if awk is installed elsewhere on your system). An optional BEGIN {} block then sets things up before the main program runs, the main rules follow in the middle, and an optional END {} block specifies the final operations. Make the file executable, run it as a command, and name the input files after it.

Note that an awk script is not the same as a shell script that calls awk. The former is written purely in awk syntax, while the latter simply stores either of the first two forms in a file and must otherwise follow shell syntax rules.
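
The program-file form might look like the sketch below. The file names total.awk and scores.txt and the sample scores are hypothetical, and the example uses awk -f rather than a #! line because the location of the awk binary differs between systems:

```shell
# Work in a temporary directory; all file names here are hypothetical.
workdir=$(mktemp -d)
printf '%s\n' 'alice 10' 'bob 20' 'carol 30' > "$workdir/scores.txt"

# The program file holds plain awk syntax, with no surrounding shell quotes.
cat > "$workdir/total.awk" <<'EOF'
BEGIN { print "name score" }    # runs once, before any input is read
      { sum += $2; print }      # main rule: runs for every record
END   { print "total", sum }    # runs once, after the last record
EOF

report=$(awk -f "$workdir/total.awk" "$workdir/scores.txt")
printf '%s\n' "$report"
rm -r "$workdir"
```

Adding the #! line as the first line of total.awk and making the file executable would turn the same program into the third form.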

3. When reading a file, awk reads one record per line by default, but you can specify a different record separator.

awk 'BEGIN {RS = "/"} {print $0}' datafile

The built-in variable RS sets the new record separator, here /. Because print starts a new line after each record, changing the separator this way effectively inserts a line break into the output wherever / appears in datafile. As a special case, if RS is set to the empty string, blank lines act as the record separator. awk uses the built-in variable FNR to track the number of the record currently being read from the current file; FNR restarts each time a new file is opened, so the first record of every file has FNR equal to 1. The variable NR holds the total number of records read so far by the current awk command. It is 0 before any input is read and is not reset when a new file is opened. If RS is set again in the main program, only input read after the assignment is affected.
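
The following sketch illustrates the three behaviors described above: a custom RS, the difference between NR and FNR across two files, and an empty RS. All input strings and file names are made up for the example:

```shell
# Split one line of input into records at each "/".
# This printf emits no trailing newline, so the last record is just "c".
records=$(printf 'a/b/c' | awk 'BEGIN {RS = "/"} {print NR ": " $0}')
printf '%s\n' "$records"

# NR keeps counting across files, while FNR restarts for each file.
workdir=$(mktemp -d)
printf 'x\ny\n' > "$workdir/first.txt"
printf 'z\n' > "$workdir/second.txt"
counters=$(awk '{print NR, FNR, $0}' "$workdir/first.txt" "$workdir/second.txt")
printf '%s\n' "$counters"
rm -r "$workdir"

# With RS set to the empty string, blank lines separate the records.
# Newlines inside a record then also separate fields, so NF is 2 and then 1.
paragraphs=$(printf 'a\nb\n\nc\n' | awk 'BEGIN {RS = ""} {print NR, NF}')
printf '%s\n' "$paragraphs"
```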

4. For each record read, awk splits it into fields, by default using runs of one or more spaces or tabs as the delimiter. Each field can be referenced as $number, where number can also be a variable or an expression that evaluates to a number. $0 is the entire record, and $NF is the last field. If the field number is greater than the total number of fields, an empty string is returned, but you can still assign a value to that field and print it. Note that NF without $ is a built-in variable whose value is the number of fields in the current record. For example:

awk '$1 ~ /pattern/ {print $1, $NF}' datafile

This example prints the first and last fields of every record in datafile whose first field matches pattern. ~ is the matching operator; it checks whether a given string (here $1) matches a regular expression.
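
A small sketch of the field rules just described, run on an invented three-word record:

```shell
fields=$(printf 'one two three\n' | awk '{
    print NF          # number of fields in this record
    print $NF         # the last field
    print $(NF - 1)   # an expression may follow the $ sign
    $5 = "five"       # assigning past NF extends the record
    print NF          # NF now reflects the new field
    print $0          # the skipped field 4 is left empty
}')
printf '%s\n' "$fields"
```

After the assignment, $0 is rebuilt with a single space between fields, so the empty fourth field shows up as two consecutive spaces before "five".
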
Apart from the default delimiter, awk lets you specify a character or a regular expression as the field separator. The separator itself is not part of any field. awk uses the built-in variable FS to hold the field separator.

awk 'BEGIN {FS = ":"} {print $0}' datafile

To set the separator from the command line instead, use the -F option.

awk -F, '/pattern/ {action}' datafile

This example specifies "," as the field separator.
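
The sketch below compares the two ways of setting the separator and adds a regular-expression separator. The input lines are illustrative only:

```shell
line='root:x:0:0:root:/root:/bin/sh'

# Set FS in a BEGIN block, before the first record is split.
with_fs=$(printf '%s\n' "$line" | awk 'BEGIN {FS = ":"} {print $1, $NF}')

# The -F option does the same from the command line.
with_flag=$(printf '%s\n' "$line" | awk -F: '{print $1, $NF}')

# A separator longer than one character is treated as a regular expression.
with_regex=$(printf 'a1b22c\n' | awk -F'[0-9]+' '{print $1, $2, $3}')

printf '%s\n' "$with_fs" "$with_flag" "$with_regex"
```

FS is assigned in BEGIN because a change to FS only affects records that are read after the assignment.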

5. When print outputs several fields separated by commas, a single space is used as the separator by default, but another separator can be specified explicitly with the built-in variable OFS (output field separator). Similarly, the built-in variable ORS lets you put a string other than the default line break between records.

awk 'BEGIN {OFS = ";"; ORS = "\n"; OFMT = "%d"}
{print $1, $2}' datafile

Apart from OFS, this example also sets ORS, the separator written after each output record.
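
To make the effect of the two separators visible, the sketch below uses "|" for ORS instead of a line break; the input is invented for the example:

```shell
# OFS joins comma-separated print arguments; ORS ends each record.
joined=$(printf 'a b\nc d\n' | awk 'BEGIN {OFS = ";"; ORS = "|"} {print $1, $2}')
printf '%s\n' "$joined"

# print $0 keeps the original spacing until a field is assigned,
# at which point awk rebuilds the record using OFS.
rebuilt=$(printf 'a b\n' | awk 'BEGIN {OFS = "-"} {print $0; $1 = $1; print $0}')
printf '%s\n' "$rebuilt"
```

The second command shows a common surprise: setting OFS alone does not change how $0 is printed.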

OFMT is the format print uses when it outputs numbers (its default is "%.6g"). In practice, however, printf is the usual way to format output. It can set the width of each field as well as the number format (number base, exponential notation, and so on). The usage is as follows:

awk '{printf format, item1, item2, ...}'

Unlike print, printf does not automatically add a line break after each record, and OFS and ORS have no effect on its output. The format string works the same way as in the C printf function.

awk '{printf "%-10s %s\n", $1, $2}' datafile

This example prints field 1 as a string with a minimum width of 10 characters. If the field is shorter, it is padded with spaces on the right, because the - flag left-justifies it (without -, the padding goes on the left). A space and field 2 follow, and the \n inserts a line break before the next record is processed.
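
The difference between the two justifications is easier to see with visible brackets around each field. This is a sketch on made-up input, not part of the original example:

```shell
# %-6s left-justifies, %6s right-justifies, %5.2f fixes width and precision.
table=$(printf 'id 7\nname 42\n' | awk '{printf "[%-6s][%6s][%5.2f]\n", $1, $1, $2 / 4}')
printf '%s\n' "$table"

# Without \n in the format, printf output from successive records runs together.
no_newline=$(printf 'a\nb\n' | awk '{printf "%s", $1}')
printf '%s\n' "$no_newline"
```

The first command prints [id    ][    id][ 1.75] and [name  ][  name][10.50]; the second prints ab on a single line.
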
In addition, you can use the >, >> and | operators to redirect the output of print and printf. For example:

awk '{print $1 > "output_file_name"
printf "%s\n", $2 | "sort -r > output_file_name_1"}' datafile
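
A self-contained variant of the redirection example is sketched below. The awk variable out, the temporary directory, and the input lines are assumptions made for illustration, and the sort output goes to standard output rather than to a second file:

```shell
workdir=$(mktemp -d)

# out is a hypothetical awk variable holding the output file name.
sorted=$(printf 'a 2\nb 3\nc 1\n' | awk -v out="$workdir/first.txt" '{
    print $1 > out            # the file is truncated once, then kept open
    print $2 | "sort -r"      # every $2 value goes to a single sort process
}')
first=$(cat "$workdir/first.txt")
printf '%s\n' "$first" "$sorted"
rm -r "$workdir"
```

Within one awk run, > empties the file only when it is first opened and later prints add to it, whereas >> also preserves whatever the file contained before awk started.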
