awk Introduction and Learning Notes collection 1th/3 page _php Digest

Source: Internet
Author: User
copyright©2004 This article complies with the GPL agreement, welcome to reprint, revise, spread.

First release date: August 6, 2004


--------------------------------------------------------------------------------

Table of Contents

1. awk Introduction
2. awk command format and options
2.1. AWK has two forms of syntax
2.2. Command options
3. Mode and operation
3.1. Mode
3.2. Operation
4. AWK's Environment variables
5. awk operator
6. Records and Domains
6.1. Record
6.2. Domain
6.3. Domain Separator
7. Gawk Special Regular expression meta-characters
8. POSIX character Set
9. Matching operator (~)
10. Comparison expressions
11. Scope Template
12. An example of verifying the validity of a passwd file
13. Several examples
AWK programming
14.1. Variable
14.2. Begin Module
14.3. End Module
14.4. Redirect and Piping
14.5. Conditional statement
14.6. Cycle
14.7. Array
14.8. Awk's built-in functions
How-to
1. awk Introduction
Awk is a programming language that is used to process text and data under Linux/unix. Data can come from standard input, one or more files, or the output of other commands. It supports user-defined functions and dynamic regular expressions and other advanced functions, is a powerful programming tool under Linux/unix. It is used on the command line, but more is used as a script. awk handles text and data in such a way that it scans the file line by row, from the first line to the last line, looking for a matching row of specific patterns and doing what you want on those lines. If no processing action is specified, the matching row is displayed to the standard output (screen), and if no pattern is specified, all rows specified by the action are processed. Awk represents the first letter of its author's surname, respectively. Because its author is three people, respectively Alfred Aho, Brian Kernighan, Peter Weinberger. Gawk is the GNU version of AWK, which provides some extensions to the Bell Lab and GNU. The following awk is an example of gun gawk, where awk has been linked to gawk in the Linux system, so the following are all described in awk.

2. awk command format and options
2.1. AWK has two forms of syntax
awk [Options] ' script ' Var=value file (s)

awk [Options]-F scriptfile var=value file (s)

2.2. Command options
-F FS or--field-separator FS
Specifies the input file break separator, which is either a string or a regular expression, such as-f:.

-V Var=value or--asign var=value
Assigns a user-defined variable.

-F scripfile or--file ScriptFile
Reads the awk command from the script file.

-MF nnn AND-MR nnn
Set intrinsic limits on NNN values,-MF options limit the maximum number of blocks assigned to NNN;-MR option limits the maximum number of records. These two features are extensions to the Bell Lab version of awk and are not applicable in standard awk.

-W Compact or--compat, W traditional or--traditional
Run awk in compatibility mode. So Gawk's behavior is exactly the same as the standard awk, and all awk extensions are ignored.

-W copyleft or--copyleft, W-Copyright or--copyright
Print short copyright information.

-W Help or--help,-w usage or--usage
Print all awk options and a short description of each option.

-W Lint or--lint
Print warnings for structures that cannot be ported to a traditional UNIX platform.

-W lint-old or--lint-old
Print a warning about a structure that cannot be ported to a traditional UNIX platform.

-W POSIX
Open compatibility mode. However, the following restrictions are not recognized: \x, function keywords, func, code-changing sequences, and when FS is a space, the new row is used as a field separator; the operator * * and the **= cannot replace ^ and ^=;fflush invalid.

-W re-interval or--re-inerval
Allow interval regular expressions to be used, references (POSIX character classes in grep), such as bracket expressions [[: Alpha:]].

-W source Program-text or--source Program-text
Use Program-text as the source code and can be mixed with the-f command.

-W version or--version
Prints the version of the bug report information.

3. Mode and operation
awk scripts are made up of patterns and operations:
Pattern {action} such as $ Awk '/root/' test, or $ Awk ' $ < ' test.

Both are optional, and if there is no pattern, the action is applied to all records, and if there is no action, the output matches all records. By default, each input row is a record, but the user can specify a different delimiter to delimit through the RS variable.

3.1. Mode
The pattern can be any of the following:

/Regular expression/: An extension set that uses wildcard characters.

Relationship Expressions: You can use the relational operators in the following table of operators to manipulate them, which can be a string or a comparison of numbers, such as $2>%1 Select a row with a second field that is more than the first word.

Pattern-matching expressions: Using operators ~ (matching) and ~! (does not match).

Mode, Mode: Specifies the range of a row. This syntax cannot include the Begin and end modes.

BEGIN: Lets the user specify the action that occurs before the first input record is processed, where the global variable can usually be set.

End: The action that the user takes after the last input record is read.

3.2. Operation
An action consists of one or more commands, functions, and expressions, separated by a newline character or semicolon, and enclosed in curly braces. There are four main parts:

Variable or array assignment

Output command

Built-in functions

Control Flow Command

4. AWK's Environment variables
Table 1. Environment variables for awk

Variable description
$n the nth field in the current record, separated by FS.
$ complete Input record.
The number of ARGC command-line arguments.
The location of the current file in the Argind command line, starting at 0.
ARGV the array that contains the command-line arguments.
CONVFMT numeric conversion format (default value is%.6g)
ENVIRON an associative array of environment variables.
ERRNO a description of the last system error.
FieldWidths the field width list (separated by the spacebar).
FileName current filename.
FNR with NR, but relative to current file.
FS field Separator (default is any space).
IGNORECASE if True, a match is ignored for case-insensitive.
NF the number of fields in the current record.
NR current record number.
OFMT the output format of the number (the default value is%.6g).
OFS Output field separator (the default is a space).
ORS output record delimiter (the default is a newline character).
Rlength the length of the string that is matched by the match function.
RS Record Separator (default is a newline character).
Rstart the first position of the string that is matched by the match function.
Subsep array subscript delimiter (default is \034).
Current 1/3 page 123 Next read the full text

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.