Learning and using awk and regular expressions

Source: Internet
Author: User

AWK is an excellent text-processing tool and one of the most powerful data-processing engines available, on Linux or in any other environment. Its name comes from the initials of the surnames of its creators, Alfred Aho, Peter Weinberger, and Brian Kernighan. How much you can get out of this programming and data-manipulation language depends on how well you know it.
I. Usage of AWK:
1. Directly on the command line. Format: awk 'pattern {action}' filename
2. Write the awk program into a script whose first line names the command interpreter, #!/bin/awk -f. Make the script executable, then call it by typing the script name. Format: ./testscript.awk filename. This works the same way as running a shell script.
3. Put the awk program in a separate file and run it with the -f option. Format: awk -f awkscript filename
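The three ways of running awk listed above can be sketched as follows. This is a minimal illustration, not a canonical setup: the file names people.txt, names.awk and testscript.awk and the sample data are made up for the example, and the interpreter path is looked up with command -v because awk does not live at /bin/awk on every system.

```shell
# Hypothetical sample data, created in a throwaway directory.
tmpdir=$(mktemp -d)
printf 'alice 30\nbob 25\n' > "$tmpdir/people.txt"

# 1. Program given directly on the command line.
inline_out=$(awk '{ print $1 }' "$tmpdir/people.txt")

# 2. Executable script whose first line names the interpreter.
#    The path is discovered here instead of assuming /bin/awk.
awk_path=$(command -v awk)
printf '#!%s -f\n{ print $1 }\n' "$awk_path" > "$tmpdir/testscript.awk"
chmod +x "$tmpdir/testscript.awk"
script_out=$("$tmpdir/testscript.awk" "$tmpdir/people.txt")

# 3. Program stored in a separate file and loaded with -f.
printf '{ print $1 }\n' > "$tmpdir/names.awk"
file_out=$(awk -f "$tmpdir/names.awk" "$tmpdir/people.txt")

echo "$inline_out"
echo "$script_out"
echo "$file_out"
rm -rf "$tmpdir"
```

All three runs print the first field of each line, so the choice between them is mostly about how long the program is and how often it is reused.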
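The first form is the most common, for example awk 'BEGIN {print "this is the start"} {print $1, $2, $3} END {print "this is the end"}' filename. BEGIN and END are special patterns: the BEGIN action runs before any input is read, the pattern-action pair in the middle runs for each input record, and the END action runs after the last record. BEGIN and END can be omitted. The pattern is generally used to select the lines to process by matching.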
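The BEGIN, per-record, and END stages can be seen by feeding the same program two lines of input. This is a small sketch; the input text is invented for the example and is piped in instead of being read from a file.

```shell
out=$(printf 'a b c\nd e f\n' | awk '
BEGIN { print "this is the start" }   # runs once, before any input is read
      { print $1, $2, $3 }            # runs for every input record
END   { print "this is the end" }     # runs once, after the last record
')
echo "$out"
```

The start line is printed once, then one line per input record, then the end line.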
II. Troubleshooting common AWK errors
1. Make sure the entire awk program is enclosed in single quotes.
2. Make sure all brackets and quotation marks inside the single quotes appear in pairs.
3. Make sure action statements are enclosed in braces and conditions are enclosed in parentheses.
4. Check that an input file name was given. Without one, awk reads standard input and appears to hang, unless the program consists only of a BEGIN block.
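A short sketch of a program that follows these rules may help when comparing against one that fails. The numbers piped in are invented for the example.

```shell
# Whole program in single quotes; the string literal inside uses double quotes.
# The condition sits in parentheses and the action in braces.
big=$(printf '5\n12\n8\n20\n' | awk '{ if ($1 > 10) { print "big:", $1 } }')
echo "$big"

# A program made up only of BEGIN needs no input file at all.
no_input=$(awk 'BEGIN { print "no input needed" }')
echo "$no_input"
```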
III. AWK built-in string functions

gsub(r, s)          Replace every match of r in $0 with s.
gsub(r, s, t)       Replace every match of r in t with s.
index(s, t)         Return the position of the first occurrence of string t in s, or 0 if t does not occur.
length(s)           Return the length of s.
match(s, r)         Test whether s contains a substring matching r. Returns its position (0 if there is none) and sets RSTART and RLENGTH.
split(s, a, fs)     Split s on the separator fs and store the pieces in array a. Returns the number of pieces.
sprintf(fmt, exp)   Return exp formatted according to fmt.
sub(r, s)           Replace the leftmost longest match of r in $0 with s.
substr(s, p)        Return the suffix of s starting at position p.
substr(s, p, n)     Return the substring of s that starts at position p and is n characters long.
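The functions in the table can be tried out in a BEGIN block, which needs no input file. This is a sketch with made-up strings; the variable names s, t, parts and the rest are only for illustration.

```shell
funcs=$(awk 'BEGIN {
    s = "hello world"
    print length(s)                      # 11
    print index(s, "world")              # 7
    print substr(s, 7)                   # world
    print substr(s, 1, 5)                # hello

    pos = match(s, /o+/)                 # first run of "o" is at position 5
    print pos, RSTART, RLENGTH           # 5 5 1

    n = split("a:b:c", parts, ":")       # pieces land in parts[1..n]
    print n, parts[1], parts[3]          # 3 a c

    print sprintf("%.2f", 3.14159)       # 3.14

    t = s; sub(/o/, "0", t); print t     # only the first match: hell0 world
    t = s; c = gsub(/o/, "0", t)         # every match, returns the count
    print c, t                           # 2 hell0 w0rld
}')
echo "$funcs"
```

Note that positions in awk strings start at 1, not 0, which is why index and match can use 0 to mean "not found".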


IV. AWK built-in variables

$n            The nth field of the current record. Fields are separated by FS.
$0            The complete input record. For text input this is normally the content of the current line.
ARGC          The number of command-line arguments.
ARGIND        Index in ARGV of the file currently being processed (ARGV is indexed from 0). gawk extension.
ARGV          Array containing the command-line arguments.
CONVFMT       Number conversion format (default value: %.6g).
ENVIRON       Associative array of environment variables.
ERRNO         Description of the last system error. gawk extension.
FIELDWIDTHS   List of field widths, separated by spaces. gawk extension.
FILENAME      The name of the current input file.
FNR           Same as NR, but counted relative to the current file.
FS            Field separator (by default, any run of spaces or tabs). You can specify a different one with the -F option or by assigning to FS.
IGNORECASE    If true, matching is case-insensitive. gawk extension.
NF            The number of fields in the current record. $NF is the content of the last field.
NR            The number of records read so far. For text input this is the current line number.
OFMT          The output format for numbers (default value: %.6g).
OFS           The output field separator (default value: a space).
ORS           The output record separator (default value: a newline).
RLENGTH       The length of the string matched by the match function.
RS            The input record separator (default value: a newline).
RSTART        The starting position of the string matched by the match function.
SUBSEP        Array subscript separator (default value: \034).
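The most frequently used of these variables, FS, OFS, NR, NF and $NF, can be seen working together in a short sketch. The colon-separated input is invented for the example, and only variables available in every awk are used.

```shell
vars=$(printf 'root:x:0\ndaemon:x:1\n' | awk '
BEGIN { FS = ":"; OFS = " | " }       # split input on ":", join output with " | "
      { print NR, NF, $1, $NF }       # record number, field count, first and last field
END   { print "records: " NR }        # NR still holds the total at the end
')
echo "$vars"
```

The comma in print inserts OFS between the values, while placing two values side by side, as in the END action, concatenates them directly.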

V. AWK built-in operators

= += -= *= /= %= ^= **=    Assignment. For example, a += 10 means a = a + 10.
?:                         C conditional expression. a > b ? a : b returns a if a is greater than b, otherwise b.
||                         Logical OR. True if either operand is true.
&&                         Logical AND. False if either operand is false; both must be true.
~ !~                       Matches and does not match a regular expression. Often used to search text.
< <= > >= != ==            Relational operators.
space                      String concatenation.
+ -                        Addition, subtraction.
* / %                      Multiplication, division, and remainder.
+ - !                      Unary plus, unary minus, and logical NOT.
^ **                       Exponentiation.
++ --                      Increment and decrement, as prefix or postfix. Take care to distinguish ++a from a++.
$                          Field reference.
in                         Array membership.
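A few of the operators above are easy to confuse, so here is a small sketch that exercises them in a BEGIN block. The variable names are made up for the example, and ^ is used for exponentiation because ** is not accepted by every awk.

```shell
ops=$(awk 'BEGIN {
    a = 5; a += 10;  print a             # same as a = a + 10, so 15
    b = 7; m = (a > b ? a : b); print m  # conditional picks the larger, 15
    print 2 ^ 10                         # 1024
    print 17 % 5                         # 2
    s = "foo" "bar"; print s             # juxtaposition concatenates: foobar

    i = 1
    x = i++                              # postfix: x gets the old value 1, i becomes 2
    y = ++i                              # prefix: i becomes 3 first, y gets 3
    print x, y, i                        # 1 3 3

    hit = ("abc" ~ /^a/); miss = ("abc" !~ /^a/)
    print hit, miss                      # 1 0

    arr["k"] = 1
    has_k = ("k" in arr); has_z = ("z" in arr)
    print has_k, has_z                   # 1 0

    lor = (1 || 0); land = (1 && 0); lnot = !0
    print lor, land, lnot                # 1 0 1
}')
echo "$ops"
```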


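VI. AWK metacharacters: \ (escape character), ^ $ . [] | * + ? Of these, + and ? are extended regular expression operators: they work in awk but not in the basic regular expressions that sed and grep use by default. + matches one or more occurrences of the preceding character; ? matches zero or one occurrence.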
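The difference between ? and + is easiest to see on a handful of test words. This is a sketch with invented input; the two patterns are anchored with ^ and $ so that the whole line has to match.

```shell
matches=$(printf 'color\ncolour\ncolouur\ngogle\ngoogle\nggle\n' | awk '
/^colou?r$/ { print "u optional:", $0 }      # ? allows zero or one "u"
/^go+gle$/  { print "one or more o:", $0 }   # + requires at least one "o"
')
echo "$matches"
```

The lines colouur and ggle match neither pattern: the first has two u characters, and the second has no o at all.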

