Awk learning and usage.

Source: Internet
Author: User


Awk is a text-processing language that is both powerful and flexible. It can handle tasks that cut cannot. What follows is a summary of how I use it, combining information found online with examples from my own practice.

I. Definitions of common parameters and options
$0: the entire current record (line)
$1: the first field of the current line
NF: the number of fields in the current line
NR: the number of records (lines) read so far; it keeps increasing across multiple input files
FNR: like NR, but it does not carry over across files; it restarts from 1 for each file
\t: tab
\n: line break
FS: the input field separator, usually defined in BEGIN
RS: the input record separator; the default is a line break, so text is read one line at a time
~: matches (a regular-expression match, not an exact comparison like ==)
!~: does not match
==: equal, exact comparison
!=: not equal, exact comparison
&&: logical AND
||: logical OR
+: one or more of the preceding item in a regular expression
/[0-9][0-9]+/: two or more digits
/[0-9][0-9]*/: one or more digits
FILENAME: the name of the current input file
OFS: the output field separator; a space by default, and it can be changed, for example to a tab
ORS: the output record separator; a line break by default, so each result is printed on its own line
-F'[:#/]': defines three delimiters (:, # and /)

Note: NF, NR, FS, and RS are built-in variables of awk and can be used to control the output.
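The difference between NR and FNR only shows up when awk reads more than one file. The following is a minimal sketch, assuming a POSIX awk; the two sample files and the helper name show_counters are made up for this illustration.

```shell
# Hypothetical sample files, created only for this illustration.
dir=$(mktemp -d)
printf 'a b c\nd e\n' > "$dir/one.txt"
printf 'f\ng h i j\n' > "$dir/two.txt"

# For every line, print the file name, the global record number (NR),
# the per-file record number (FNR) and the field count (NF).
show_counters() {
    (cd "$dir" && awk '{ print FILENAME, NR, FNR, NF }' one.txt two.txt)
}

show_counters
```

NR runs from 1 to 4 across both files, while FNR goes back to 1 at the first line of two.txt.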

II. Examples

My test text:
$ cat test.txt
abc   bb   cc
a1    b1   c1
a2    b2   c2
12    20   30

1. Execution method:

Awk can be run directly from the command line or from a script file.

Command line:
$ awk '{print $0}' test.txt
abc   bb   cc
a1    b1   c1
a2    b2   c2
12    20   30

-f specifies a file that contains the program:
$ awk -f awk-test.awk
hello world

Program content:

$ cat awk-test.awk
BEGIN{print "hello world"}

Script execution:

awk -f awk-script-file file-to-process
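As a slightly fuller illustration of running a program file with -f, the sketch below combines a BEGIN block with a per-line action. It assumes a POSIX awk; the file names first-field.awk and sample.txt and the helper run_script are hypothetical.

```shell
# Hypothetical input and program files, created only for this illustration.
dir=$(mktemp -d)
printf 'abc bb cc\na1 b1 c1\n' > "$dir/sample.txt"

cat > "$dir/first-field.awk" <<'EOF'
# Runs once, before any input is read.
BEGIN { print "first fields:" }
# Runs for every input line.
{ print $1 }
EOF

# -f names the program file; the remaining argument is the file to process.
run_script() {
    awk -f "$dir/first-field.awk" "$dir/sample.txt"
}

run_script
```

Keeping the program in a file avoids shell quoting problems once it grows beyond a single line.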

2. awk control over output fields/columns

The text has several columns, and often only some of them are needed.

Output everything:

$ awk '{print $0}' test.txt
abc   bb   cc
a1    b1   c1
a2    b2   c2
12    20   30

Output the first two columns by setting NF:

$ awk '{NF=2;print $0}' test.txt
abc bb
a1 b1
a2 b2
12 20

To output a single column, for example the second, change $0 to $2. The last column is $NF.
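Selecting individual columns can be sketched as follows, assuming a POSIX awk. The helpers sample and second_and_last are hypothetical names; sample simply reproduces the rows of test.txt through a pipe.

```shell
# Same four rows as test.txt, fed through a pipe (helper name is hypothetical).
sample() {
    printf 'abc   bb   cc\na1    b1   c1\na2    b2   c2\n12    20   30\n'
}

# $2 is the second column and $NF is the last column of each line.
second_and_last() {
    sample | awk '{ print $2, $NF }'
}

second_and_last
```

Because the fields are printed with a comma between them, they are joined by a single space in the output regardless of the spacing in the input.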

3. delimiter Control

-F sets the field delimiter used to split each line. The default delimiter is whitespace (spaces and tabs). Here the delimiter is a:

$ awk -F'a' '{print $1,$2}' test.txt
 bc   bb   cc
 1    b1   c1
 2    b2   c2
12    20   30

FS can also be assigned inside the program. Unlike -F, an assignment in the main block only takes effect from the next line, so the first line is still split on whitespace:

$ awk '{FS="a";print $1,$2}' test.txt
abc bb
 1    b1   c1
 2    b2   c2
12    20   30

Setting FS in a BEGIN block has the same effect as -F. Here the separator is 2:

$ echo "111|24252|333"|awk 'BEGIN{FS="2"}{print $1,$2}'
111| 4
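The timing of an FS assignment, and the bracket form of -F, can be sketched like this. It assumes an awk that follows POSIX, where a new FS value applies from the next record on; the sample input and the function names are made up for the example.

```shell
# FS assigned in the main block: the first record has already been split
# on whitespace, so only the second record is split on "1".
fs_in_main() {
    printf 'x1y z\np1q r\n' | awk '{ FS = "1"; print $1 }'
}

# FS assigned in BEGIN: it applies from the first record on.
fs_in_begin() {
    printf 'x1y z\np1q r\n' | awk 'BEGIN { FS = "1" } { print $1 }'
}

# A bracket expression makes any of several characters a delimiter.
multi_delim() {
    printf 'a:b#c/d\n' | awk -F'[:#/]' '{ print NF, $3 }'
}

fs_in_main
fs_in_begin
multi_delim
```

On this input fs_in_main prints x1y and then p, while fs_in_begin prints x and then p; multi_delim reports four fields and prints the third one, c.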

4. Supported operations

The original content is as follows:

$ cat test2.txt
12  13  15
0   -1  20

Summation and averaging:

$ awk '{sum=$1+$2+$3;avg=sum/3;print $1,avg,sum}' test2.txt
12 13.3333 40
0 6.33333 19
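The averages above are printed with awk's default number format. If a fixed number of decimals is wanted, printf can be used instead of print. This is a minimal sketch assuming a POSIX awk; rows and sum_avg are hypothetical helper names, and rows reproduces the content of test2.txt.

```shell
# Same two rows as test2.txt (helper name is hypothetical).
rows() {
    printf '12  13  15\n0   -1  20\n'
}

# Print the first column, the average with two decimals, and the sum.
sum_avg() {
    rows | awk '{ sum = $1 + $2 + $3; printf "%s %.2f %d\n", $1, sum / 3, sum }'
}

sum_avg
```

Unlike print, printf does not add a line break on its own, which is why the format string ends with \n.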

Conditional output: print a line only if its first column is greater than 0.

$ awk '$1>0{print $0}' test2.txt
12  13  15

Character count: print the length of each line.

$ awk '{print length}' test2.txt
10
10

Note: Delimiter characters are counted too.

5. Pattern Matching

Exact match: print the lines whose first column is exactly a1.

$ awk '$1=="a1"{print $0}' test.txt
a1    b1   c1

Fuzzy match: print the lines whose first column contains a.

$ awk '$1 ~"a"{print $0}' test.txt
abc   bb   cc
a1    b1   c1
a2    b2   c2

Multi-condition match: print the lines that contain both a1 and b1. The patterns /a1/ and /b1/ are tested against the whole line; to restrict them to the first and second columns, write $1 ~ /a1/ && $2 ~ /b1/.

$ awk '/a1/ && /b1/' test.txt
a1    b1   c1

Note: the operators &&, || and ! can be used to combine conditions; their use is not described in detail here.
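For reference, the three operators can be sketched on the same sample rows as follows, assuming a POSIX awk. The function names are hypothetical, and sample reproduces the rows of test.txt.

```shell
# Same four rows as test.txt (helper name is hypothetical).
sample() {
    printf 'abc   bb   cc\na1    b1   c1\na2    b2   c2\n12    20   30\n'
}

# && : both conditions must hold.
both() {
    sample | awk '$1 ~ /^a/ && $2 == "b1"'
}

# || : at least one condition must hold.
either() {
    sample | awk '$1 == "abc" || $1 == "12"'
}

# ! : negation; keep the lines that do not contain "a".
negated() {
    sample | awk '!/a/'
}

both
either
negated
```

A pattern without an action prints the matching line, so no explicit {print $0} is needed here.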

6. if statements and for and while loops are supported.

Awk is a programming language, and if and for are among its most basic constructs. Here are some simple examples.
if: test whether the second field equals 13, which is equivalent to the exact match described above.

$ awk '{if($2 =="13")print $0}' test2.txt
12  13  15

for: print every line three times.

$ awk '{for(i=1;i<=3;i++)print $0}' test2.txt
12  13  15
12  13  15
12  13  15
0   -1  20
0   -1  20
0   -1  20

while: equivalent to the for loop above.

$ awk '{i=1;while(i<=3){print $0;i++}}' test2.txt
12  13  15
12  13  15
12  13  15
0   -1  20
0   -1  20
0   -1  20
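Loops are often bounded by NF so that they visit every field of a line. The following is a minimal sketch assuming a POSIX awk; the input x y z and the function names are made up for the example.

```shell
# while loop: print each field of the line with its position.
list_fields() {
    printf 'x y z\n' | awk '{ i = 1; while (i <= NF) { print i ": " $i; i++ } }'
}

# for loop: print the fields in reverse order on one line.
reverse_fields() {
    printf 'x y z\n' |
        awk '{ for (i = NF; i >= 1; i--) printf "%s%s", $i, (i > 1 ? " " : "\n") }'
}

list_fields
reverse_fields
```

In both loops $i means "the field whose number is the current value of i", so the same code works for lines with any number of fields.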

7. Use of NR:

NR is one of awk's built-in variables. It increases by 1 for each line read.
Delete the even-numbered lines:

$ awk 'NR%2!=0{print $0}' test.txt
abc   bb   cc
a2    b2   c2

Print line numbers:

$ awk '{print NR ,$0}' test.txt
1 abc   bb   cc
2 a1    b1   c1
3 a2    b2   c2
4 12    20   30
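Since NR is an ordinary number, it can be used in any condition. The sketch below, assuming a POSIX awk, keeps only the even-numbered lines and then a range of lines; the function names are hypothetical and sample reproduces the rows of test.txt.

```shell
# Same four rows as test.txt (helper name is hypothetical).
sample() {
    printf 'abc   bb   cc\na1    b1   c1\na2    b2   c2\n12    20   30\n'
}

# Keep only the even-numbered lines.
even_lines() {
    sample | awk 'NR % 2 == 0'
}

# Keep lines 2 through 3.
line_range() {
    sample | awk 'NR >= 2 && NR <= 3'
}

even_lines
line_range
```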

8. Use of RS:

Set the record separator to "|":

$ echo "111 222|333 444|555 666"|awk 'BEGIN{RS="|"}{print $0}'
111 222
333 444
555 666

Note: By default, records are separated by \n. With RS set to "|", the input is split into records at each "|", and print ends each record with the default ORS, \n, so the output looks as if each "|" had been replaced with \n.
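That each "|"-delimited chunk really becomes a record of its own, with its own NR and fields, can be sketched as follows. This assumes a POSIX awk; split_records is a hypothetical name, and printf is used instead of echo so that the input has no trailing line break.

```shell
# Each "|"-separated chunk is one record; NR counts the records and
# $2 is the second whitespace-separated field inside each record.
split_records() {
    printf '111 222|333 444|555 666' |
        awk 'BEGIN { RS = "|" } { print NR ": " $2 }'
}

split_records
```

With echo, the line break at the end of the input becomes part of the last record, so print $0 typically shows an extra empty line at the end.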

9. Use of ORS:

The output record separator is "\n" by default. Replace it with "*":

$ awk 'BEGIN{ORS="*"}{print $0}' test.txt
abc   bb   cc*a1    b1   c1*a2    b2   c2*12    20   30*

ORS is easy to confuse with RS. RS specifies where the input is split into records, and each record is then printed followed by ORS, which defaults to "\n". When ORS is set to something else, for example "*" or "|", each output record ends with that string instead of a line break, so the records are joined on one line.
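The effect of ORS is easiest to see on a short input. This is a minimal sketch assuming a POSIX awk; the three-line input and the name join_lines are made up for the example.

```shell
# Every print ends with ORS instead of a line break, so the three input
# lines come out on one line, each followed by "*".
join_lines() {
    printf 'a\nb\nc\n' | awk 'BEGIN { ORS = "*" } { print }'
}

join_lines
```

The output is a*b*c* with no line break at the end, which is why the shell prompt appears right after the last "*" when this is run interactively.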

10. OFS usage:

OFS also specifies a delimiter, but unlike FS it applies to output: print places OFS between the fields it prints. Here the default space separator is replaced with "b":

$ awk 'BEGIN{OFS="b"}{print $1,$2,$3}' test2.txt
12b13b15
0b-1b20
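OFS is only used when awk builds the output from separate fields, not when an untouched $0 is printed. The sketch below, assuming a POSIX awk, shows the difference; the function names are hypothetical and the input is the first row of test2.txt.

```shell
# $0 is printed as it was read, so OFS has no visible effect.
ofs_untouched() {
    printf '12  13  15\n' | awk 'BEGIN { OFS = "," } { print $0 }'
}

# Assigning to a field makes awk rebuild $0 with OFS between the fields.
ofs_rebuilt() {
    printf '12  13  15\n' | awk 'BEGIN { OFS = "," } { $1 = $1; print $0 }'
}

ofs_untouched
ofs_rebuilt
```

The assignment $1 = $1 changes no value; it only forces the record to be rebuilt, which is a common way to convert one delimiter into another.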
