Linux awk regular expressions and regular expression operators

Source: Internet
Author: User

Regular expressions are indispensable when awk is used as a text-processing tool, so it is worth mastering how awk uses them. In fact, awk's regular expressions do not have to be learned separately. A regular expression is like a small programming language with its own syntax rules, and those rules mean largely the same thing in most tools. Regular expressions are used in many Linux text-processing tools (awk, sed, grep, perl), but there are really only three types. For details, refer to: linux shell regular expression (BREs, EREs, PREs) difference comparison. Tools that belong to the same type share basically the same syntax rules. As that article explains, awk uses Extended Regular Expressions (EREs).

I. Introduction to the basic symbols of awk Extended Regular Expressions (EREs)

Character Function
+ Matches if one or more occurrences of the character or extended regular expression that precedes the + (plus sign) are in the string. Command line:

awk '/smith+ern/' testfile

Prints to standard output every record that contains the characters smit, followed by one or more h characters, and ending with the characters ern. The output in this example is:

smithern, harry
smithhern, anne

? Matches if zero or one occurrence of the character or extended regular expression that precedes the ? (question mark) is in the string. Command line:

awk '/smith?/' testfile

Prints to standard output all records that contain the characters smit, followed by zero or one instance of the h character. The output in this example is:

smith, alan
smithern, harry
smithhern, anne
smitters, alexis

| Matches if any of the strings separated by the | (vertical bar) is in the string. Command line:

awk '/allen|alan/' testfile

Prints to standard output all records that contain the string allen or alan. The output in this example is:

smiley, allen
smith, alan

() Groups strings together within a regular expression. Command line:

awk '/a(ll)?(nn)?e/' testfile

Prints to standard output all records that contain the string ae, alle, anne, or allnne. The output in this example is:

smiley, allen
smithhern, anne

{m} Matches if exactly m occurrences of the pattern are in the string. Command line:

awk '/l{2}/' testfile

Prints to standard output:

smiley, allen

{m,} Matches if at least m occurrences of the pattern are in the string. Command line:

awk '/t{2,}/' testfile

Prints to standard output:

smitters, alexis

{m,n} Matches if between m and n occurrences of the pattern, inclusive, are in the string (where m <= n). Command line:

awk '/er{1,2}/' testfile

Prints to standard output:

smithern, harry
smithhern, anne
smitters, alexis

[String] Matches any one of the characters specified by String inside the square brackets. Command line:

awk '/sm[a-h]/' testfile

Prints to standard output all records that contain sm followed by any character in the range a to h. The output in this example is:

smawley, andy

[^String] A ^ (caret) at the beginning of the string inside [] (square brackets) means that the regular expression matches any character that is not listed in the brackets. Thus, the command line:

awk '/sm[^a-h]/' testfile

Prints to standard output:

smiley, allen
smith, alan
smithern, harry
smithhern, anne
smitters, alexis

~, !~ A conditional expression that tests whether a variable matches (tilde) or does not match (exclamation point and tilde) the regular expression. Command line:

awk '$1 ~ /n/' testfile

Prints to standard output all records whose first field contains the character n. The output in this example is:

smithern, harry
smithhern, anne

^ Matches the beginning of a field or record. Command line:

awk '$2 ~ /^h/' testfile

Prints to standard output all records in which the character h is the first character of the second field. The output in this example is:

smithern, harry

$ Matches the end of a field or record. Command line:

awk '$2 ~ /y$/' testfile

Prints to standard output all records in which the character y is the last character of the second field. The output in this example is:

smawley, andy
smithern, harry

. (Period) Matches any single character except the terminating newline character at the end of the record. Command line:

awk '/a..e/' testfile

Prints to standard output all records that contain the characters a and e separated by exactly two characters. The output in this example is:

smawley, andy
smiley, allen
smithhern, anne

* (Asterisk) Matches zero or more occurrences of the preceding character or expression. Combined with a period (.*), it matches zero or more arbitrary characters. Command line:

awk '/a.*e/' testfile

Prints to standard output all records that contain the characters a and e separated by zero or more characters. The output in this example is:

smawley, andy
smiley, allen
smithhern, anne
smitters, alexis

\ (Backslash) The escape character. When it precedes any character that has a special meaning in an extended regular expression, it removes that special meaning. For example, the pattern:

/\/\//

matches the string //, because each backslash cancels the usual meaning of the slash as the regular expression delimiter. To specify a literal backslash, use a double backslash (\\).
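The examples in this table can be tried against a small data file. The following is a minimal sketch: the contents of testfile are an assumption, reconstructed from the records that appear in the outputs above, and the last command feeds made-up input to illustrate the escaped slash. Interval expressions such as {2} are left out here because some awk builds do not enable them by default.

```shell
# Recreate the sample data (an assumption, reconstructed from the outputs shown above).
cat > testfile <<'EOF'
smawley, andy
smiley, allen
smith, alan
smithern, harry
smithhern, anne
smitters, alexis
EOF

# One or more "h" between "smit" and "ern".
# Prints: smithern, harry / smithhern, anne
awk '/smith+ern/' testfile

# Alternation: either "allen" or "alan".
# Prints: smiley, allen / smith, alan
awk '/allen|alan/' testfile

# Grouping with optional parts: ae, alle, anne, or allnne.
# Prints: smiley, allen / smithhern, anne
awk '/a(ll)?(nn)?e/' testfile

# Escaped slashes match a literal "//"; count the input lines that contain it.
# Prints: 1
printf 'http://example.com\nplain text\n' | awk '/\/\// { n++ } END { print n }'
```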


Compared with PREs (Perl-style regular expressions), the main difference is that standard EREs lack shorthand character classes such as \d, \D, \s, and \S, while escape sequences such as \t, \v, \n, \f, and \r are accepted by awk; other features are basically the same. The regular expressions supported by common software such as JavaScript, .NET, and Java are basically of the PREs type.
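Where a Perl-style pattern would use \d or \s, an ERE can usually use a POSIX bracket class such as [[:digit:]] or [[:space:]] instead. The following is a minimal sketch: the file name classes.txt and its contents are made up for illustration, and some older awk builds may not support these classes.

```shell
# Hypothetical sample input, made up for this sketch.
printf 'order 66\nno digits here\nroom  101\n' > classes.txt

# [[:digit:]] plays the role of \d: print lines that contain at least one digit.
# Prints: order 66 / room  101
awk '/[[:digit:]]+/' classes.txt

# [[:space:]] plays the role of \s: print lines with two or more
# consecutive whitespace characters.
# Prints: room  101
awk '/[[:space:]][[:space:]]+/' classes.txt
```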

II. Common ways to use regular expressions in awk

In an awk statement:

The code is as follows:
awk '/REG/{action}'
/REG/ is a regular expression; awk passes each record ($0) that matches it to action for processing.
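As a concrete instance of the pattern-action form, the sketch below prints one field from each matching record. The contents of testfile are an assumption, reconstructed from the outputs shown earlier in this article.

```shell
# Sample data (an assumption, reconstructed from the outputs shown earlier).
cat > testfile <<'EOF'
smawley, andy
smiley, allen
smith, alan
smithern, harry
smithhern, anne
smitters, alexis
EOF

# Pattern with an action: for records that start with "smith",
# print only the second field.
# Prints: alan / harry / anne
awk '/^smith/ { print $2 }' testfile

# Pattern without an action: the default action prints the whole record ($0).
# Prints: smawley, andy
awk '/andy$/' testfile
```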

awk regular expression match operators (~ and !~)

The code is as follows:
[chengmo@centos5 ~]$ awk 'BEGIN{info="this is a test"; if (info ~ /test/) {print "ok"}}'
ok
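The match operators also work on individual fields, and the right-hand side can be a string instead of a /.../ literal. The following is a minimal sketch: the contents of testfile are an assumption reconstructed from the outputs shown earlier, and the variable name pat is chosen only for illustration.

```shell
# Sample data (an assumption, reconstructed from the outputs shown earlier).
cat > testfile <<'EOF'
smawley, andy
smiley, allen
smith, alan
smithern, harry
smithhern, anne
smitters, alexis
EOF

# !~ selects records whose second field does NOT start with "a".
# Prints: smithern, harry
awk '$2 !~ /^a/' testfile

# The regular expression can come from a variable (here the hypothetical "pat").
# Prints: allen / alan / alexis
awk -v pat='^al' '$2 ~ pat { print $2 }' testfile
```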

awk built-in functions that take regular expressions

The code is as follows:
gsub(Ere, Repl, [In])
sub(Ere, Repl, [In])
match(String, Ere)
split(String, A, [Ere])
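A short sketch of how these four functions behave may help. The strings and variable names below are made up for illustration; the program runs entirely in a BEGIN block, so it needs no input file.

```shell
awk 'BEGIN {
    s = "this is a test"

    # gsub replaces every match in s and returns the number of replacements.
    n = gsub(/t/, "T", s)
    print n, s                      # 3 This is a TesT

    # sub replaces only the first (leftmost) match.
    sub(/is/, "IS", s)
    print s                         # ThIS is a TesT

    # match returns the start position and sets RSTART and RLENGTH.
    if (match(s, /a +TesT/))
        print RSTART, RLENGTH       # 9 6

    # split accepts a regular expression as the field separator.
    count = split("a1b22c333d", parts, /[0-9]+/)
    print count, parts[1], parts[4] # 4 a d
}'
```

In gsub and sub the third argument is optional; when it is omitted, the functions operate on $0.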

For detailed function usage, refer to: linux awk built-in function for details (instance)

I hope the description above gives you a clearer understanding of awk regular expressions. If you have any questions, please contact me!
