The Three Musketeers of Linux text processing: awk

Source: Internet
Author: User
Tags arithmetic arithmetic operators logical operators line editor

awk Introduction

Linux has three classic text processing tools, often called the Three Musketeers: grep, sed, and awk. grep is a text filtering tool, sed is a line-oriented stream editor, and awk is a report generator that formats text. "Formatting" here does not mean formatting a file system; it means laying out the contents of a file in various ways for display. On Linux we use the GNU implementation of awk, called gawk. awk is typically a symbolic link to gawk, so the awk and gawk commands on such a system are the same program. gawk is a procedural programming language that supports the usual constructs of a programming language, such as conditionals, arrays, and loops, so it can also be called a scripting language interpreter.

1) Usage format, options

Basic format: awk [options] 'program' file ...

program: PATTERN{ACTION STATEMENTS}, usually enclosed in single quotes (double quotes also work);

program: written in the awk programming language;

PATTERN:

The pattern determines when the action statements are triggered; it also covers the special trigger events BEGIN and END

ACTION STATEMENTS:

The processing to perform; an action can consist of multiple statements separated by semicolons, such as print and printf

Options (optional):

-F: specifies the field separator used in the input;

-v var=value: defines a custom variable

separators, fields, and records

When awk runs, each field produced by splitting a line on the separator is labeled $1, $2, ..., $n; these are called field identifiers. $0 stands for the whole line, that is, all fields. Note: the $ character here means something different from the $ of a shell variable.

Each line of the file is called a record.

If the action is omitted, the default is print $0.
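The splitting described above can be tried on a few lines of made-up input. This is only a sketch: the three colon-separated records are hypothetical sample data, not taken from any real file.

```shell
# Hypothetical sample input: three colon-separated records.
sample='alice:x:1001:/bin/bash
bob:x:1002:/bin/sh
carol:x:1003:/bin/bash'

# -F: sets the input field separator; $1 is the first field, $NF the last.
fields=$(printf '%s\n' "$sample" | awk -F: '{print $1, $NF}')
echo "$fields"

# A pattern with no action prints the whole matching record ($0).
whole=$(printf '%s\n' "$sample" | awk -F: '/bob/')
echo "$whole"
```

The second command relies on the default action: with only a pattern given, awk behaves as if {print $0} had been written.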

2) How awk works

Principle:

awk reads one line of text at a time, splits it on the input separator (by default, whitespace) into n pieces, and saves each piece in a built-in variable named $1, $2, $3, and so on up to the last one. awk can then process these pieces individually, for example displaying one particular field or several specific fields, or doing further processing on some of them, such as counting and arithmetic.

Here's how it works:

• Step one: execute the statements in the BEGIN{action; ...} block;

• Step two: read a line from the file or from standard input (stdin) and execute the PATTERN{action; ...} block; awk scans the input line by line, repeating this from the first line to the last until the input has been read completely;

• Step three: on reaching the end of the input stream, execute the END{action; ...} block.

· The BEGIN block is executed before awk starts reading lines from the input stream. It is optional; statements such as variable initialization and printing the header of an output table usually go in the BEGIN block.

· The END block is executed after awk has read all lines from the input stream. Work such as printing the results of analyzing all lines, for example a summary, is done in the END block. It is also optional.

The general commands in the pattern block are the most important part, and they too are optional. If a pattern is given without an action block, the action defaults to {print}, which prints each matching line. awk executes the block once for every line it reads.

Usage examples:

~]# awk 'BEGIN{print "hello,awk!"}'
hello,awk!    # BEGIN is the first step, so no input file is needed
~]# awk 'END{print "bye,awk!"}' /etc/passwd
bye,awk!    # END runs after the text has been processed

~]# awk -F: 'BEGIN{print "hello,awk!"} /root/{print $1,$2} END{print "bye,awk!"}' /etc/passwd
hello,awk!
root x
operator x
bye,awk!
# BEGIN runs first; then awk finds the lines containing the string root and prints their first and second columns; finally END runs
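The same three-step flow can also be used to accumulate a value across lines. The following is a minimal sketch with hypothetical input (one number per line) rather than a system file; the variable name total is chosen only for illustration.

```shell
# Hypothetical input: one number per line.
report=$(printf '%s\n' 10 20 30 | awk '
BEGIN { print "start"; total = 0 }               # runs once, before any input is read
      { total += $1 }                            # runs once for every input line
END   { print "lines:", NR, "total:", total }    # runs once, after the last line
')
echo "$report"
```

The middle block has no pattern, so it applies to every line; NR still holds the number of records read when END runs.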


Usage notes:

1. print

Points:

(1) Items are separated by commas;

(2) Each output item can be a string or a number: a field of the current record, a variable, or an awk expression;

(3) If the items are omitted, the statement is equivalent to print $0;
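A short sketch of these three points, using a made-up two-field input line:

```shell
# A comma between items inserts the output field separator (a space by default).
with_comma=$(echo 'a b' | awk '{print $1, $2}')
# Items written side by side are concatenated with nothing in between.
no_comma=$(echo 'a b' | awk '{print $1 $2}')
# print with no items prints the whole record, like print $0.
bare=$(echo 'a b' | awk '{print}')
echo "$with_comma"
echo "$no_comma"
echo "$bare"
```

Leaving out the comma is a common source of output that unexpectedly runs together.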


2. Variables

2.1 Built-in variables

FS: input field separator; the default is whitespace;

OFS: output field separator; the default is a single space;

RS: input record separator; the line break used when reading input;

ORS: output record separator; the line break used on output;


NF: number of fields in the current record

{print NF} prints the number of fields; {print $NF} prints the last field

NR: number of records read so far, that is, the line number;

FNR: number of records read from the current file; each file is counted separately;


FILENAME: name of the current file;


ARGC: number of command-line arguments;

ARGV: array that holds the arguments given on the command line;
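The difference between NR and FNR only shows up with more than one input file. The sketch below assumes two small temporary files created just for the demonstration; the file names one.txt and two.txt are hypothetical.

```shell
# Two small temporary files (hypothetical data) so that NR and FNR differ.
tmpdir=$(mktemp -d)
printf 'a b c\nd e\n' > "$tmpdir/one.txt"
printf 'f\n' > "$tmpdir/two.txt"

# NR counts records across all files, FNR restarts for each file,
# and NF is the number of fields in the current record.
counts=$(cd "$tmpdir" && awk '{print FILENAME, NR, FNR, NF}' one.txt two.txt)
echo "$counts"

# OFS is inserted wherever print has a comma between items.
joined=$(echo 'a b c' | awk 'BEGIN{OFS="-"} {print $1, $2, $3}')
echo "$joined"

rm -r "$tmpdir"
```

For the single line of two.txt the output shows NR as 3 but FNR as 1, which is the distinction the list above describes.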


2.2 Custom variables

(1) -v var=value


Variable names are case-sensitive;


(2) directly defined in program
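Both ways of defining a variable, and the case sensitivity of names, can be checked with a sketch like this one; the variable name greeting is purely illustrative.

```shell
# (1) -v sets the variable before the program starts, so BEGIN can already see it.
from_option=$(awk -v greeting="hello" 'BEGIN{print greeting}')
# (2) A variable can also be assigned inside the program itself.
in_program=$(awk 'BEGIN{greeting="hi"; print greeting}')
# Names are case-sensitive: Greeting was never set, so it is an empty string.
other_case=$(awk -v greeting="hello" 'BEGIN{print "[" Greeting "]"}')
echo "$from_option $in_program $other_case"
```

An unset variable is not an error in awk; it simply evaluates to the empty string (or 0 in a numeric context).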


3. printf command


Formatted output: printf FORMAT, item1, item2, ...


(1) FORMAT must be given;

(2) printf does not add a line break automatically; the newline escape \n has to be given explicitly;

(3) FORMAT must contain one format specifier for each item that follows.


Format characters:

%c: displays the character for an ASCII code;

%d, %i: displays a decimal integer;

%e, %E: displays a number in scientific notation;

%f: displays a floating-point number;

%g, %G: displays a number in scientific notation or in floating-point form;

%s: displays a string;

%u: displays an unsigned integer;

%%: displays a literal %;


Modifiers:

#[.#]: the first number controls the display width; the second number sets the precision after the decimal point;

%3.1f

-: left-align

+: display the sign of a numeric value
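The format characters and modifiers can be combined to line up columns. This is a small sketch on two invented input lines; the column widths are arbitrary choices.

```shell
# %-8s left-aligns a string in 8 columns, %5d right-aligns an integer in 5,
# and %6.2f prints a float in 6 columns with 2 digits after the decimal point.
table=$(printf 'root 0\ndaemon 2\n' | awk '{printf "%-8s|%5d|%6.2f\n", $1, $2, $2 / 3}')
echo "$table"

# The + modifier forces a sign on numeric values.
signed=$(awk 'BEGIN{printf "%+d %+d\n", 5, -5}')
echo "$signed"

# %c with a number prints the character that has that code.
char=$(awk 'BEGIN{printf "%c\n", 65}')
echo "$char"
```

Every printf format here ends in \n because, unlike print, printf adds no line break of its own.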


4. Operators


Arithmetic operators:

x+y, x-y, x*y, x/y, x^y, x%y

-x

+x: converts the value to a number;


String operator: there is no operator symbol; strings written next to each other are concatenated


Assignment operators:

=, +=, -=, *=, /=, %=, ^=

++, --


Comparison operators:

>, >=, <, <=, !=, ==


Pattern-matching operators:

~: matches

!~: does not match


Logical operators:

&&

||

!


Function call:

function_name(argu1, argu2, ...)


Conditional expression:

selector ? if-true-expression : if-false-expression


~]# awk -F: '{$3>=1000?usertype="Common User":usertype="Sysadmin or Sysuser";printf "%15s:%-s\n",$1,usertype}' /etc/passwd
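The matching, logical, and conditional operators are often combined in a single pattern. The sketch below uses three invented passwd-style records; the labels "common" and "system" are illustrative only.

```shell
# Hypothetical passwd-style records: name, password, UID, shell.
sample='alice:x:1001:/bin/bash
bob:x:999:/sbin/nologin
carol:x:1002:/sbin/nologin'

# ~ matches a field against a regular expression; && combines two conditions.
bash_users=$(printf '%s\n' "$sample" | awk -F: '$NF ~ /bash$/ && $3 >= 1000 {print $1}')
# !~ selects the records whose field does not match.
non_bash=$(printf '%s\n' "$sample" | awk -F: '$NF !~ /bash$/ {print $1}')
# ?: picks one of two values; strings written side by side are concatenated.
labels=$(printf '%s\n' "$sample" | awk -F: '{type = ($3 >= 1000) ? "common" : "system"; print $1 "=" type}')
echo "$bash_users"
echo "$non_bash"
echo "$labels"
```

Assigning the result of ?: to a variable, as in the last command, tends to be easier to read than placing assignments inside both branches.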


5. PATTERN


(1) empty: the empty pattern matches every line;

(2) /regular expression/: only the lines matched by the regular expression are processed;

(3) relational expression: the result is either "true" or "false", and a line is processed only when the result is "true";

True: the result is a non-zero value or a non-empty string;

(4) line ranges:

startline,endline: /pat1/,/pat2/


Note: giving line numbers directly as the range is not supported; compare against NR instead

~]# awk -F: '(NR>=2&&NR<=10){print $1}' /etc/passwd

(5) BEGIN/END patterns

BEGIN{}: executed only once, before awk starts processing the text in the file;

END{}: executed only once, after the text processing is complete;
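The two ways of selecting a block of lines, by number and by range pattern, can be compared on a small made-up input. The marker words START and END in the data are arbitrary strings, not awk keywords.

```shell
# Hypothetical input: seven short lines, two of which act as markers.
lines=$(printf '%s\n' one two START three four END five)

# Relational expression on NR: select lines 2 through 3 by number.
by_number=$(printf '%s\n' "$lines" | awk 'NR>=2 && NR<=3')
# Range pattern: from a line matching /START/ through the next line matching /END/.
by_range=$(printf '%s\n' "$lines" | awk '/START/,/END/')
echo "$by_number"
echo "$by_range"
```

Both commands omit the action, so the selected lines are simply printed.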


6. Commonly used actions


(1) Expressions

(2) Control statements: if, while, and so on;

(3) Compound statements: several statements grouped together;

(4) Input statements

(5) Output statements


7. Control statements


if (condition) {statements}

if (condition) {statements} else {statements}

while (condition) {statements}

do {statements} while (condition)

for (expr1;expr2;expr3) {statements}

break

continue

delete array[index]

delete array

exit

{statements}


7.1 if-else


Syntax: if (condition) statement [else statement]


~]# awk -F: '{if($3>=1000) {printf "Common user: %s\n",$1} else {printf "root or Sysuser: %s\n",$1}}' /etc/passwd


~]# awk -F: '{if($NF=="/bin/bash") print $1}' /etc/passwd


~]# awk '{if(NF>5) print $0}' /etc/fstab


~]# df -h | awk -F% '/^\/dev/{print $1}' | awk '{if($NF>=20) print $1}'


Usage scenario: making a conditional judgment on the whole line or on a field obtained by awk;


7.2 while loop

Syntax: while (condition) statement

While the condition is "true" the loop body runs; when the condition becomes "false" the loop exits;


Usage scenario: processing several fields of one line in the same way, one after another; processing each element of an array in turn;


~]# awk '/^[[:space:]]*linux16/{i=1;while(i<=NF) {print $i,length($i); i++}}' /etc/grub2.cfg


~]# awk '/^[[:space:]]*linux16/{i=1;while(i<=NF) {if(length($i)>=7) {print $i,length($i)}; i++}}' /etc/grub2.cfg


7.3 do-while loop

Syntax: do statement while (condition)

Meaning: the loop body is executed at least once
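The "at least once" behavior is easiest to see with a condition that is false from the start. A minimal sketch, with an arbitrary starting value of 5:

```shell
# The condition i < 1 is false from the start, yet the do-while body still runs once.
once=$(awk 'BEGIN{i = 5; do {print "i =", i; i++} while (i < 1)}')
# With the same condition, a plain while loop runs its body zero times.
never=$(awk 'BEGIN{i = 5; while (i < 1) {print "i =", i; i++}}')
echo "do-while: [$once] while: [$never]"
```

The only difference between the two loops is whether the condition is tested before or after the body.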


7.4 for loop

Syntax: for (expr1;expr2;expr3) statement


for (variable assignment;condition;iteration process) {for-body}


~]# awk '/^[[:space:]]*linux16/{for(i=1;i<=NF;i++) {print $i,length($i)}}' /etc/grub2.cfg


Special usage:

It can iterate over the elements of an array;

Syntax: for (var in array) {for-body}


7.5 switch statement

Syntax: switch (expression) {case VALUE1 or /REGEXP/: statement; case VALUE2 or /REGEXP2/: statement; ...; default: statement}


7.6 break and continue

break

continue
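A small sketch of both statements inside one for loop over the fields of a line; the input numbers and the thresholds are made up for the example.

```shell
odd_small=$(echo '1 2 3 4 5 6' | awk '{
    for (i = 1; i <= NF; i++) {
        if ($i % 2 == 0) continue   # skip even fields and go to the next iteration
        if ($i > 4) break           # leave the loop at the first odd field above 4
        print $i
    }
}')
echo "$odd_small"
```

continue skips only the rest of the current iteration, while break abandons the loop entirely, so fields 5 and 6 are never printed.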


7.7 next


Ends the processing of the current line early and moves directly on to the next line;


~]# awk -F: '{if($3%2!=0) next; print $1,$3}' /etc/passwd
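next is also a convenient way to skip lines before any later rule sees them. The following sketch assumes a hypothetical input in which comment lines start with #.

```shell
# Lines starting with # hit next, so the second rule never runs for them.
kept=$(printf '# comment\nalpha\n# another\nbeta\n' | awk '/^#/ {next} {print NR": "$0}')
echo "$kept"
```

NR keeps counting the skipped lines, which is why the surviving lines are numbered 2 and 4.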


8. Arrays


Associative array: array[index-expression]


index-expression:

(1) Any string can be used; string literals must be enclosed in double quotation marks;

(2) If a referenced array element does not yet exist, awk automatically creates it and initializes its value to the empty string;


To test whether an element exists in an array, use the form "index in array";


weekdays["mon"]="Monday"


To iterate over every element of the array, use a for loop;

for (var in array) {for-body}


~]# awk 'BEGIN{weekdays["mon"]="Monday";weekdays["tue"]="Tuesday";for(i in weekdays) {print weekdays[i]}}'


Note: var iterates over each index of the array, not over the values;
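The difference between testing an index with in and referencing the element directly can be demonstrated as follows. This is a sketch that reuses the weekdays array from the text; the printed labels are illustrative.

```shell
checks=$(awk 'BEGIN{
    weekdays["mon"] = "Monday"
    # The in operator tests membership without creating the element.
    if ("mon" in weekdays) print "mon: present"
    if (!("tue" in weekdays)) print "tue: absent"
    # Merely referencing weekdays["tue"] creates it with an empty value.
    if (weekdays["tue"] == "") print "tue: referenced"
    if ("tue" in weekdays) print "tue: now present"
    # delete removes the single element again.
    delete weekdays["tue"]
    if (!("tue" in weekdays)) print "tue: deleted"
}')
echo "$checks"
```

Because a plain reference creates the element as a side effect, the in form is the safer existence test.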

state["LISTEN"]++

state["ESTABLISHED"]++


~]# netstat -tan | awk '/^tcp\>/{state[$NF]++}END{for(i in state) {print i,state[i]}}'


~]# awk '{ip[$1]++}END{for(i in ip) {print i,ip[i]}}' /var/log/httpd/access_log


Exercise 1: Count the number of occurrences of each file system type in the /etc/fstab file;

~]# awk '/^UUID/{fs[$3]++}END{for(i in fs) {print i,fs[i]}}' /etc/fstab


Exercise 2: Count the occurrences of each word in the specified file;

~]# awk '{for(i=1;i<=NF;i++){count[$i]++}}END{for(i in count) {print i,count[i]}}' /etc/fstab
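The counting idiom from the exercises can be tried without any system file. The sketch below uses two invented lines of words and pipes the result through sort, because awk does not guarantee the order in which for (var in array) visits the indexes.

```shell
# Hypothetical input: two lines of words standing in for a real file.
word_counts=$(printf 'ext4 xfs ext4\nswap xfs ext4\n' |
    awk '{for (i = 1; i <= NF; i++) count[$i]++} END{for (w in count) print w, count[w]}' |
    sort)
echo "$word_counts"
```

Sorting outside awk keeps the program portable across awk implementations.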


9. Functions


9.1 Built-in functions

Numerical processing:

rand(): returns a random number between 0 and 1;


String processing:

length([s]): returns the length of the specified string;

sub(r,s,[t]): searches the string t for text matching the pattern r and replaces the first match with s;

gsub(r,s,[t]): searches the string t for text matching the pattern r and replaces every match with s;


split(s,a[,r]): splits the string s on the separator r and saves the resulting pieces in the array a;


~]# netstat -tan | awk '/^tcp\>/{split($5,ip,":");count[ip[1]]++}END{for(i in count) {print i,count[i]}}'
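The built-in functions listed above can each be exercised on a short string. A minimal sketch follows; the address 192.168.1.10:22 is a made-up value in the same address:port shape that the netstat example splits.

```shell
# length() returns the number of characters in a string.
len=$(awk 'BEGIN{print length("hello")}')
# sub() replaces only the first match; gsub() replaces every match.
# With no third argument, both operate on $0.
first=$(echo 'a-b-c' | awk '{sub(/-/, "+"); print}')
all=$(echo 'a-b-c' | awk '{gsub(/-/, "+"); print}')
# split() fills the array and returns the number of pieces.
parts=$(echo '192.168.1.10:22' | awk '{n = split($0, part, ":"); print n, part[1], part[2]}')
# rand() returns a value that is at least 0 and less than 1.
in_range=$(awk 'BEGIN{r = rand(); ok = (r >= 0 && r < 1) ? "yes" : "no"; print ok}')
echo "$len $first $all"
echo "$parts"
echo "$in_range"
```

split() numbers the pieces from 1, which is why the netstat example reads the address from ip[1].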


This article is from the "Wang Liming" blog; reprinting is declined.
