Deep application of awk

Source: Internet
Author: User

awk:

awk is effectively a programming language in its own right. It provides powerful text-processing functions and can further process and format the results. This article summarizes its main features.

awk: a report generator

Basic syntax

awk [options] 'script' file ...

awk [options] 'pattern {action}' file file ...

-F char: input field separator

1. awk output

 

print item1, item2, ...

Key points:

(1) Items are separated by commas in the program; in the output they are separated by the output field separator (a space by default);

# awk -F: '{print $1, $7}' /etc/passwd => splits each line of /etc/passwd on ":" and prints the 1st and 7th fields.

(2) Each output item can be a string or a number, a field of the current record, a variable, or an awk expression. Numbers are implicitly converted to strings before output;

(3) If the items after print are omitted, it is equivalent to print $0. To print a blank line, use print "";

# awk -F: '{print}' /etc/passwd
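The three key points can be tried without touching the real /etc/passwd. The following is a minimal sketch; the two sample records and the shell variable names are made up for illustration.

```shell
# Sample input that stands in for /etc/passwd (hypothetical records).
passwd_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin'

# Comma-separated items are joined with the output field separator (a space by default).
with_comma=$(printf '%s\n' "$passwd_sample" | awk -F: '{print $1, $7}')

# Items written side by side without a comma are concatenated with nothing in between.
no_comma=$(printf '%s\n' "$passwd_sample" | awk -F: '{print $1 $7}')

# A bare print is the same as print $0 (only the first record is shown here).
whole=$(printf '%s\n' "$passwd_sample" | awk -F: 'NR == 1 {print}')

printf '%s\n' "$with_comma" "$no_comma" "$whole"
```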

2. awk Variables

There are built-in variables and user-defined variables.

2.1 Built-in variables

FS: field separator, the field separator used when reading input (whitespace by default)

# awk 'BEGIN{FS=":"} {print $1, $7}' /etc/passwd => this works, but it is clumsier than awk -F: '{print $1, $7}' /etc/passwd

RS: record separator, the record (line) separator used when reading input (a newline by default)

OFS: output field separator, the field separator used for output;

# awk 'BEGIN{FS=":"; OFS="*"} {print $1, $7}' /etc/passwd => here ":" is the separator for input and "*" is the separator for output

ORS: output record separator, the record (line) separator used for output; => it defaults to \n and is rarely changed, so it is not demonstrated here.
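For completeness, here is a small sketch of how the four separators interact. The input strings are invented for the example, and the record-separator case deliberately uses input without a trailing newline so that the last record is exactly "c".

```shell
# FS splits the input on ":"; OFS is placed between comma-separated print items.
star=$(echo 'root:x:0' | awk 'BEGIN{FS=":"; OFS="*"} {print $1, $3}')

# Assigning to any field makes awk rebuild $0 with OFS between the fields.
rebuilt=$(echo 'root:x:0' | awk 'BEGIN{FS=":"; OFS="*"} {$1 = $1; print}')

# RS="," makes each comma-separated chunk a record; ORS="|" terminates each output record.
joined=$(printf 'a,b,c' | awk 'BEGIN{RS=","; ORS="|"} {print}')

printf '%s\n' "$star" "$rebuilt" "$joined"
```

Note that a plain print of an unmodified $0 keeps the original separators; OFS only appears once a field has been assigned or when items are listed with commas.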

NF: number of fields, the number of fields in the current record

# awk -F: '{print NF}' /etc/passwd => prints the number of fields in each line after splitting

NR: number of records, the number of lines read so far; lines from all files are counted together;

# awk -F: '/root/{print NR}' /etc/passwd => prints the line numbers of the lines containing the string root

FNR: the line number within the current file; each file is counted separately;

Copy /etc/passwd and /etc/group to /tmp, then:

# awk -F: '{print NR, FNR, $0}' passwd group => prints every line of both files, preceded by its overall line number and its line number within its own file.
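The difference between NR and FNR is easiest to see with two small files. This sketch creates two throwaway files (the names first.txt and second.txt are arbitrary) instead of copying the system files.

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/first.txt"
printf 'c\nd\ne\n' > "$tmpdir/second.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
counts=$(awk '{print NR, FNR, $0}' "$tmpdir/first.txt" "$tmpdir/second.txt")

# NR == FNR is true only while the first file is being read.
first_only=$(awk 'NR == FNR {print $0}' "$tmpdir/first.txt" "$tmpdir/second.txt")

printf '%s\n' "$counts" "$first_only"
rm -rf "$tmpdir"
```

The NR == FNR test is a common way to treat the first file as a lookup table and the remaining files as data.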

ARGV: an array holding the command-line arguments. For awk '{print $0}' 1.txt 2.txt, ARGV[0] is awk, ARGV[1] is 1.txt, and ARGV[2] is 2.txt (the program text itself is not included)

ARGC: the number of arguments on the awk command line, that is, the number of elements in ARGV;

FILENAME: the name of the file awk is currently processing;

# awk -F: '/root/{print ARGV[0], ARGV[1], ARGV[2], ARGC, FILENAME}' passwd group => shows the argument array for this command and the file being processed
awk passwd group 3 passwd
awk passwd group 3 passwd
awk passwd group 3 group
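The argument array can also be inspected without reading any file. In this sketch the file names one.txt and two.txt are placeholders that need not exist, because a program consisting only of a BEGIN block never opens its input. ARGV[0] is skipped because its exact value (awk, gawk, mawk, ...) depends on the implementation.

```shell
# List ARGV[1..ARGC-1]; the program text and the options are not part of ARGV.
args=$(awk 'BEGIN{for (i = 1; i < ARGC; i++) print i, ARGV[i]; print "ARGC=" ARGC}' one.txt two.txt)

printf '%s\n' "$args"
```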

2.2 User-defined variables

-v var_name=value

Variables can also be defined in the BEGIN block or in an action.

Variable names are case sensitive;

(1) Variables can be defined in the program;

(2) Variables can be defined on the command line with the -v option;

Defining a variable in BEGIN

# awk 'BEGIN{wh="how are you"; FS=":"} /root/{print wh, $1}' passwd
how are you root
how are you operator

Defining a variable with a command-line option

# awk -v wh="hi" 'BEGIN{FS=":"} /root/{print wh, $1}' passwd
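A short sketch of the points above, using one invented input record. The shell variable user_name is hypothetical and only shows how a shell value can be handed to awk.

```shell
# A variable set with -v is visible in every rule, including BEGIN.
in_begin=$(awk -v wh="hi" 'BEGIN{print wh}')
greeting=$(echo 'root:x:0' | awk -F: -v wh="hi" '{print wh, $1}')

# Names are case sensitive: Wh is a different, unset variable (an empty string).
case_test=$(echo 'root:x:0' | awk -F: -v wh="hi" '{print "[" Wh "]", $1}')

# -v is also the usual way to pass a shell variable into an awk program.
user_name="operator"
passed=$(awk -v user="$user_name" 'BEGIN{print "user is", user}')

printf '%s\n' "$in_begin" "$greeting" "$case_test" "$passed"
```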

3. awk's printf command => this is usually all you need for formatting

Command format: printf format, item1, item2, ...

Key points:

(1) The format must be specified;

(2) printf does not add a newline automatically; if a line break is needed, \n must be given explicitly;

(3) The format specifies the output format of each item;

Format specifiers start with % followed by one character:

%c: displays a character (a numeric argument is treated as a character code);

%d, %i: decimal integer;

%e, %E: number in scientific notation;

%f: floating-point number; => 6 digits after the decimal point by default.

%g, %G: number in scientific notation or floating-point format, whichever is shorter;

%s: string;

%u: unsigned integer;

%%: a literal %;

Modifiers (# stands for a number):

#: display width

-: left alignment

+: show the sign of a number

.#: precision

Display the result with each column 10 characters wide and left-aligned:

# awk -v wh="hi" 'BEGIN{FS=":"} /root/{printf "%-10s %-10s\n", wh, $1}' passwd
hi         root
hi         operator

# awk 'BEGIN{printf "%15.3f\n", 99999}' => note that the .# after the width modifier sets the number of digits printed after the decimal point.
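The effect of the modifiers is easier to see when the output is wrapped in brackets. This is a minimal sketch with arbitrary values; the brackets are only there to make the padding visible.

```shell
# Width 10, left-aligned versus the default right alignment.
left=$(awk 'BEGIN{printf "[%-10s]\n", "root"}')
right=$(awk 'BEGIN{printf "[%10s]\n", "root"}')

# Width 15 with 3 digits after the decimal point.
fixed=$(awk 'BEGIN{printf "[%15.3f]\n", 99999}')

# + forces a sign, %% prints a literal percent sign, %c turns a character code into a character.
signed=$(awk 'BEGIN{printf "%+d %d%%\n", 42, 7}')
char=$(awk 'BEGIN{printf "%c\n", 65}')

printf '%s\n' "$left" "$right" "$fixed" "$signed" "$char"
```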

4. awk output redirection

print items > output-file

print items >> output-file

print items | command

Special files:

/dev/stdin: standard input

/dev/stdout: standard output

/dev/stderr: standard error

==> The special files above are rarely needed.
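The three redirection forms can be sketched as follows. The file name users.txt and the sample values are assumptions made for the example. Inside one awk run, > truncates the file only when it is first opened, so later prints to the same file are added after the earlier ones.

```shell
tmpdir=$(mktemp -d)
out_file="$tmpdir/users.txt"

# ">" creates or truncates the file once; both lines end up in it.
awk -v out="$out_file" 'BEGIN{print "root" > out; print "daemon" > out}'

# ">>" appends to what a previous run left behind.
awk -v out="$out_file" 'BEGIN{print "nginx" >> out}'
saved=$(cat "$out_file")

# "|" sends the printed lines to the standard input of another command.
sorted=$(printf 'b\na\nc\n' | awk '{print | "sort"}')

printf '%s\n' "$saved" "$sorted"
rm -rf "$tmpdir"
```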

5. awk Operators

Arithmetic Operators:

x + y

x - y

x * y

x / y

x ** y, x ^ y

x % y

-x: negation

+x: converts to a number

String operator: concatenation (there is no operator symbol; the operands are simply written next to each other)

Assignment operators:

=

+=

-=

*=

/=

%=

^=

**=

++

--

Note: if a pattern is the = sign itself, /=/ may be parsed as the /= operator and cause a syntax error; write it as /[=]/ instead.

Comparison operators:

<

<=

>

>=

==

!=

~: pattern match; true if the string on the left is matched by the pattern on the right, otherwise false;

!~: the negation of ~; true if the string on the left is not matched by the pattern on the right;

Logical operators:

&&: and

||: or

Conditional expression:

selector ? if-true-expression : if-false-expression

# awk -F: '{utype = ($3 >= 500) ? "common user" : "admin or system user"; print $1, "is", utype}' /etc/passwd

Function call:

function_name(argu1, argu2)
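A few of these operators in one place, as a sketch with arbitrary values. Only operators available in POSIX awk are used, so the ** forms are left out here.

```shell
ops=$(awk 'BEGIN {
    print 7 % 3, 2 ^ 10              # remainder and exponentiation
    s = "awk" "-" 3                  # concatenation: operands written side by side
    print s, length(s)
    x = "10" + 5                     # a numeric string is converted for arithmetic
    print x
    m  = ("root" ~ /^ro/)            # 1 when the pattern matches
    nm = ("root" !~ /^ro/)           # 0 here, because the pattern does match
    print m, nm
    n = 600
    utype = (n >= 500) ? "common user" : "admin or system user"
    print utype
}')

printf '%s\n' "$ops"
```

Assigning the result of the conditional expression to a variable, instead of putting an assignment inside each branch, avoids precedence surprises between ?: and =.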

6. Patterns

(1) Regexp: format: /pattern/

Only lines matched by /pattern/ are processed;

# awk -F: '/root/{print $1, $7}' passwd => for example, match lines containing the string root
root /bin/bash
operator /sbin/nologin

(2) Expression: the condition is met if the result is nonzero or a non-empty string;

Only lines that satisfy the condition are processed;

# awk -F: '$3 > 500 {print $1, $7}' passwd
robert /bin/bash
gentoo /bin/bash
nginx /bin/bash

(3) Ranges: written as pat1, pat2

Only lines in the specified range are processed. A range of line numbers can also be selected with an expression on NR:

# awk -F: 'NR > 25 {printf "%-15s %15s %15s\n", $1, $7, NR}' passwd ==> displays the lines whose line number is greater than 25
L server 26
rob /bin/bash 27
robert /bin/bash 28
gentoo /bin/bash 29
nginx /bin/bash 30

(4) BEGIN/END: special patterns. BEGIN runs once before any input is read; END runs once after all input has been processed;

(5) Empty: the empty pattern, which matches every line;
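The pattern types can be combined in one program. The following sketch uses five invented input lines; the words start and stop are arbitrary markers chosen for the range pattern.

```shell
lines='one
start
two
stop
three'

# Range pattern: from the first line matching /start/ through the next line matching /stop/.
ranged=$(printf '%s\n' "$lines" | awk '/start/, /stop/ {print NR ": " $0}')

# BEGIN, an expression pattern with no action, and END in one program.
summary=$(printf '%s\n' "$lines" | awk '
BEGIN { print "begin" }          # runs once, before any input is read
NR > 3                           # no action given, so matching lines are printed
END   { print "total", NR }      # runs once, after the last line
')

printf '%s\n' "$ranged" "$summary"
```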

7. common actions

(1) Expressions

(2) control statements

(3) Compound statements

(4) Input statements

(5) Output statements

8. control statements

8.1 if-else

Format: if (condition) {then body} else {else body}

# awk -F: '{if ($3 >= 500) {print $1, "is a common user"} else {print $1, "is an admin or system user"}}' /etc/passwd

# awk '{if (NF >= 8) {print}}' /etc/inittab
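The first command can be tried on sample data and is easier to read when the action is spread over several lines. The two records below are made up; the threshold of 500 is the one used in the article.

```shell
classified=$(printf 'root:x:0\nrobert:x:501\n' | awk -F: '{
    if ($3 >= 500) {
        print $1, "is a common user"
    } else {
        print $1, "is an admin or system user"
    }
}')

printf '%s\n' "$classified"
```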

8.2 while

Format: while (condition) {while body}

# awk '{i = 1; while (i <= NF) {printf "%s ", $i; i += 2}; print ""}' /etc/inittab

# awk '{i = 1; while (i <= NF) {if (length($i) >= 6) {print $i}; i++}}' /etc/inittab

length() function: returns the length of a string.

8.3 do-while loop

Format: do {do-while body} while (condition)

8.4 for loop

Format: for (variable assignment; condition; iteration) {for body}

# awk '{for (i = 1; i <= NF; i += 2) {printf "%s ", $i}; print ""}' /etc/inittab => to keep the fields of one line on the same output line inside the loop, use printf "%s " instead of print

# awk '{for (i = 1; i <= NF; i++) {if (length($i) >= 6) print $i}}' /etc/inittab

Split each line of /etc/passwd on ":" and print the odd-numbered fields:

# awk 'BEGIN{FS=":"} {if (NF > 5) {lengths = split($0, line, ":"); for (ib = 1; ib <= lengths; ib++) {if (ib % 2) {printf "%s ", line[ib]}}; print NR "\n"}}' /etc/passwd

The for loop can also be used to traverse array elements:

Syntax: for (i in array) {for body}
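The four loop forms side by side, as a sketch on one invented input line. The helper variable sep is an addition that avoids a trailing space; it is not part of the original commands.

```shell
line='a b c d e'

# for: visit fields 1, 3, 5.
odd_for=$(echo "$line" | awk '{sep = ""; for (i = 1; i <= NF; i += 2) {printf "%s%s", sep, $i; sep = " "}; print ""}')

# while: the same traversal with an explicit counter.
odd_while=$(echo "$line" | awk '{i = 1; sep = ""; while (i <= NF) {printf "%s%s", sep, $i; sep = " "; i += 2}; print ""}')

# do-while: the body runs once even though the condition is false from the start.
once=$(awk 'BEGIN{i = 10; do {print i; i++} while (i < 5)}')

# for-in: iteration order is unspecified, so only the number of elements is counted.
count=$(awk 'BEGIN{a["x"]; a["y"]; a["z"]; n = 0; for (k in a) n++; print n}')

printf '%s\n' "$odd_for" "$odd_while" "$once" "$count"
```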

8.5 switch-case statement

Syntax: switch (expression) {case VALUE or /regexp/: statement1; ... default: statementN} (switch is a gawk extension)

8.6 Loop control

break

continue

8.7 next

Ends processing of the current line early and moves on to the next line. Unlike break and continue, which only control loops, next skips all remaining rules for the current line.

# awk -F: '{if ($3 % 2 == 0) next; print $1, $3}' /etc/passwd => displays the user name and ID only when the ID is odd.

# awk -F: '{if (NR % 2 == 0) next; print NR, $1}' /etc/passwd
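The first command can be checked against a few invented records. This sketch writes the test as a pattern in front of next, which behaves the same as the if statement above.

```shell
# Records with an even ID are skipped; the second rule only sees the remaining lines.
odd_ids=$(printf 'root:x:0\nbin:x:1\ndaemon:x:2\nadm:x:3\n' | awk -F: '$3 % 2 == 0 {next} {print $1, $3}')

printf '%s\n' "$odd_ids"
```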

9. Array

Associative arrays:

array[index-expression]

index-expression: any string can be used. If an array element does not already exist, awk automatically creates it and initializes it to an empty string when it is referenced. Therefore, to test whether an array contains an element, use the form "index in array";

a["first"] = "hello awk"

print a["second"]

To traverse every element of an array, use the following special construct:

for (var in array) {for body}

Here var takes each array index in turn (in no particular order);

state["LISTEN"]++

state["ESTABLISHED"]++

# netstat -tan | awk '/^tcp/{++state[$NF]} END{for (s in state) {print s, state[s]}}'

# awk '{ip[$1]++} END{for (i in ip) {print i, ip[i]}}' /var/log/httpd/access_log

Deleting array elements:

delete array[index]
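The automatic creation of elements is a common source of surprises, so here is a small sketch of it, followed by the counting idiom from the access-log example. The three log lines are invented, and the output is piped through sort because for-in does not guarantee any order.

```shell
membership=$(awk 'BEGIN {
    a["first"] = "hello awk"
    if ("second" in a) { print "second exists" } else { print "second missing" }
    x = a["second"]                  # merely referencing the element creates it
    if ("second" in a) { print "second exists now" }
    delete a["second"]
    n = 0
    for (k in a) n++
    print n
}')

# Count occurrences of the first field, as in the access_log example.
hits=$(printf '10.0.0.1 GET /\n10.0.0.2 GET /\n10.0.0.1 GET /a\n' | awk '{ip[$1]++} END{for (i in ip) print i, ip[i]}' | sort)

printf '%s\n' "$membership" "$hits"
```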

10. built-in functions of awk

split(string, array [, fieldsep [, seps]])

Function: splits string using fieldsep as the separator and stores the pieces in the array named array; array subscripts start from 1 (the optional seps argument is a gawk extension);

root:x:0:0:/root:/bin/bash

user[1] = "root", user[2] ...

The function returns the number of elements produced by the split.

# netstat -tn | awk '/^tcp/{lens = split($5, client, ":"); ip[client[lens-1]]++} END{for (i in ip) print i, ip[i]}'

# netstat -tna | awk '/^tcp/{stat[$NF]++} END{for (i in stat) {print i, stat[i]}}' => shows the number of connections in each state

# awk '{arr[$1]++} END{for (i in arr) {print i, arr[i]}}' /var/log/httpd/access_log => counts the number of requests from each IP address to the httpd service
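The return value of split() is what makes the client[lens-1] trick in the netstat command work: the port is the last piece and the address is the piece before it. A sketch with the sample line from above and a made-up address string of the kind netstat prints:

```shell
# split() returns the number of pieces; subscripts run from 1 to that number.
parts=$(awk 'BEGIN{n = split("root:x:0:0:/root:/bin/bash", user, ":"); print n, user[1], user[n]}')

# With an IPv4-mapped address the leading colons produce empty pieces,
# so the address is found by counting back from the end.
peer=$(echo '::ffff:192.168.1.10:51234' | awk '{n = split($0, c, ":"); print c[n-1], c[n]}')

printf '%s\n' "$parts" "$peer"
```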

length(string)

Function: returns the length of the specified string.

# awk 'BEGIN{str = "this is a very good test!"; ok = length(str); print ok}'

25

substr(string, start [, length])

Function: returns the substring of string that starts at position start (counting from 1) and is at most length characters long; if length is omitted, the substring runs to the end of the string;

Extract from the fifth character to the end of the string (the fifth character is the space after "this", so the output starts with a space):

# awk 'BEGIN{str = "this is a very good test!"; ok = substr(str, 5); print ok}'

 is a very good test!

Extract five characters starting from the fifth character:

# awk 'BEGIN{str = "this is a very good test!"; ok = substr(str, 5, 5); print ok}'

 is a
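Because a leading space is easy to miss in terminal output, this sketch repeats the calls with brackets around each result. The string is the one used above; the brackets and the extra substr(str, 1, 4) call are additions for illustration.

```shell
result=$(awk 'BEGIN {
    str = "this is a very good test!"
    print length(str)
    print "[" substr(str, 5) "]"       # from position 5 to the end
    print "[" substr(str, 5, 5) "]"    # five characters starting at position 5
    print "[" substr(str, 1, 4) "]"    # positions are counted from 1
}')

printf '%s\n' "$result"
```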

 


This article is from the "909 is a goal" blog, please be sure to keep this source http://robert1joy.blog.51cto.com/4489523/1544907
