Sticky Rice Pill & "Pit Call Beast": Regular Expressions and Metacharacter Usage

Source: Internet
Author: User

Before introducing regular expressions, let's first get acquainted with the three major text-processing tools on Linux: grep, sed, and awk. Their respective functions are as follows:

grep: grep (supports basic regular expressions), egrep (supports extended regular expressions), and fgrep (fast fixed-string search; no regular expression support). A text search tool: it searches the given text for lines that match a set of filter criteria;

sed: stream editor; a line-oriented text editing tool (essentially an editor);

awk: GNU awk; a text formatting tool and text report generator (it beautifies text; think of it as Meitu Xiuxiu (美图秀秀) for text)
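To make the division of labor concrete, here is a minimal sketch that runs each of the three tools over the same two-line input. The sample data and the variable names are invented purely for illustration.

```shell
# Hypothetical sample data: a name and an age on each line.
people=$(printf '%s\n' 'alice 30' 'bob 25')

# grep searches: keep only the lines that match the pattern.
searched=$(printf '%s\n' "$people" | grep 'bob')

# sed edits: rewrite the text as it streams through.
edited=$(printf '%s\n' "$people" | sed 's/bob/BOB/')

# awk formats: rearrange the fields into a small report.
report=$(printf '%s\n' "$people" | awk '{ print $2 ": " $1 }')

printf '%s\n' "$searched" "$edited" "$report"
```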

Regular expressions:

A regular expression is a pattern written with special characters and ordinary text characters, in which the special characters do not stand for their literal meaning but express control or wildcard functions. It can be understood as a set of filtering rules assembled, as needed, from symbols that each carry a specific meaning.
For example, "sex = either male or female".

Note: matching is greedy by default, so a pattern matches as much text as it can.
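Greedy matching is easiest to see with a pattern that could stop in more than one place. The sketch below uses a made-up input line and grep's -o option (print only the matched text) to show which stopping point is chosen.

```shell
# Hypothetical sample line: the pattern 'a.*X' could end at either X.
line='aXbXc'

# -o prints only the matched part, which makes the greedy choice visible.
greedy=$(printf '%s\n' "$line" | grep -o 'a.*X')

printf '%s\n' "$greedy"   # prints aXbX, not aX
```

The match runs to the last X on the line because .* consumes as many characters as it can while still letting the rest of the pattern succeed.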

The grep command is used as follows:

grep [OPTIONS] PATTERN [FILE...]
(OPTIONS: options; PATTERN: the filter condition; FILE: the file or files to operate on)

Common options:
--color=auto: highlight the matching text in color;
-i: ignore case;
-o: display only the text that matches the pattern;
-v, --invert-match: invert the match, that is, display the lines that do not match;
-E: support extended regular expressions;
-q, --quiet, --silent: quiet mode, output nothing; (useful when a command would return a massive amount of data and we only want to know whether it succeeded or failed, which the exit status tells us)
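The options above can be tried on a small input. This is a minimal sketch; the three sample lines and the variable names are hypothetical.

```shell
# Hypothetical three-line sample used only for this demonstration.
sample=$(printf '%s\n' 'Root login' 'root shell' 'guest shell')

# -i: ignore case, so both "Root" and "root" match.
insensitive=$(printf '%s\n' "$sample" | grep -i 'root')

# -o: print only the matched text, one match per output line.
only=$(printf '%s\n' "$sample" | grep -o 'shell')

# -v: print the lines that do NOT match.
inverted=$(printf '%s\n' "$sample" | grep -v 'shell')

# -q: print nothing; the exit status alone reports success or failure.
if printf '%s\n' "$sample" | grep -q 'guest'; then
    quiet='found'
else
    quiet='not found'
fi

printf '%s\n' "$insensitive" "$only" "$inverted" "$quiet"
```

Because -q produces no output, it is normally used inside if or with && and ||, as shown here.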

Basic Regular Expression metacharacters: (that is, characters that represent specific meanings)
Character matching:

.: matches any single character;


[]: matches any single character within the range;


[^]: matches any single character outside the specified range;

POSIX character classes (written inside a bracket expression, for example [[:digit:]]):
[:digit:]: digits
[:lower:]: lowercase letters
[:upper:]: uppercase letters
[:alpha:]: all letters
[:alnum:]: all digits and letters
[:space:]: whitespace characters
[:punct:]: punctuation characters
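The character-matching metacharacters can be checked against a few short lines. The sample below is invented for illustration; note the doubled brackets, since a class name such as [:digit:] is only valid inside a bracket expression.

```shell
# Hypothetical sample: one line per kind of content.
sample=$(printf '%s\n' 'abc' 'ABC' '123' 'a1!')

# . matches any single character: "a" followed by anything.
dot=$(printf '%s\n' "$sample" | grep -o 'a.')

# [[:digit:]] selects lines containing at least one digit.
digits=$(printf '%s\n' "$sample" | grep '[[:digit:]]')

# [^[:lower:]] is the negated class: lines with no lowercase letter at all.
no_lower=$(printf '%s\n' "$sample" | grep '^[^[:lower:]]*$')

# [[:punct:]] with -o prints just the punctuation characters found.
punct=$(printf '%s\n' "$sample" | grep -o '[[:punct:]]')

printf '%s\n' "$dot" "$digits" "$no_lower" "$punct"
```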

Number of matches:

*: matches the preceding character any number of times (0, 1, or more);

grep "x*y": xxxyabcyabcabcxyabcy

.*: matches any characters, of any length;
\+: matches the preceding character at least once;

grep "x\+y": xxxyabcyabcabcxyabcy

\?: matches the preceding character 0 or 1 times, that is, the preceding character is optional;

grep "x\?y": xxxyabcyabcabcxyabcy

\{m\}: matches the preceding character exactly m times, where m is a nonnegative integer;

grep "x\{2\}y": xxxyabcyabcabcxyabcy

\{m,n\}: matches the preceding character at least m and at most n times, where m and n are nonnegative integers; the closed interval [m,n]
\{0,n\}: at most n times;
\{m,\}: at least m times;
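The quantifiers can be compared side by side by running each pattern over the article's sample string with -o, which prints every matched piece on its own line. This sketch assumes GNU grep, where \+ and \? are available in basic regular expressions as extensions; the variable names are illustrative.

```shell
# The sample string used in the examples above.
sample='xxxyabcyabcabcxyabcy'

# x*y : zero or more x, then y. A bare y also matches.
star=$(printf '%s\n' "$sample" | grep -o 'x*y')

# x\+y : at least one x, then y. A bare y no longer matches.
plus=$(printf '%s\n' "$sample" | grep -o 'x\+y')

# x\?y : at most one x, then y.
optional=$(printf '%s\n' "$sample" | grep -o 'x\?y')

# x\{2\}y : exactly two x, then y.
exactly_two=$(printf '%s\n' "$sample" | grep -o 'x\{2\}y')

# x\{1,2\}y : one or two x, then y.
one_or_two=$(printf '%s\n' "$sample" | grep -o 'x\{1,2\}y')

printf '%s\n' "$star" '--' "$plus" '--' "$optional" '--' "$exactly_two" '--' "$one_or_two"
```

With x\?y the first match is xy rather than xxxy: the search starts from the left, and the earliest position where "at most one x, then y" succeeds is the third x.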

Position anchoring:

Restricts where in the target text the pattern is allowed to match;

^: line-start anchor; placed at the leftmost side of the pattern: ^PATTERN
$: line-end anchor; placed at the rightmost side of the pattern: PATTERN$
^PATTERN$: makes PATTERN match an entire line exactly;
^$: empty line;
^[[:space:]]*$: empty line, or a line containing only whitespace characters;
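The line anchors can be exercised with a sample that contains an empty line and a whitespace-only line. The input and variable names below are hypothetical; -c prints the number of matching lines instead of the lines themselves.

```shell
# Hypothetical sample: line 2 is empty, line 3 holds only spaces.
sample=$(printf '%s\n' 'foo' '' '   ' 'foo bar' 'bar foo')

# ^foo : lines that begin with foo.
starts=$(printf '%s\n' "$sample" | grep '^foo')

# foo$ : lines that end with foo.
ends=$(printf '%s\n' "$sample" | grep 'foo$')

# ^foo$ : lines that consist of foo and nothing else.
whole=$(printf '%s\n' "$sample" | grep '^foo$')

# ^$ counts truly empty lines only.
empty_count=$(printf '%s\n' "$sample" | grep -c '^$')

# ^[[:space:]]*$ also counts lines made up of whitespace.
blank_count=$(printf '%s\n' "$sample" | grep -c '^[[:space:]]*$')

printf '%s\n' "$starts" "$ends" "$whole" "$empty_count" "$blank_count"
```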

Word: a run of consecutive non-special characters (a string) is called a word;

\< or \b: word-start anchor, placed on the left side of the word pattern: \<PATTERN, \bPATTERN
\> or \b: word-end anchor, placed on the right side of the word pattern: PATTERN\>, PATTERN\b
\<PATTERN\>: anchors a whole word;
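A short sketch of the word anchors, assuming GNU grep, where a word is a run of letters, digits, and underscores. The sample lines are invented so that "root" appears as a whole word, as a prefix, and as a suffix.

```shell
# Hypothetical sample: root as a word, a prefix, a suffix, and inside a sentence.
sample=$(printf '%s\n' 'root' 'rooter' 'chroot' 'the root user')

# \<root : root at the start of a word.
word_start=$(printf '%s\n' "$sample" | grep '\<root')

# root\> : root at the end of a word.
word_end=$(printf '%s\n' "$sample" | grep 'root\>')

# \<root\> : root as a complete word.
whole_word=$(printf '%s\n' "$sample" | grep '\<root\>')

# \b works on either side, so \broot\b selects the same lines.
with_b=$(printf '%s\n' "$sample" | grep '\broot\b')

printf '%s\n' "$word_start" '--' "$word_end" '--' "$whole_word"
```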



Grouping and referencing:
\(PATTERN\): the characters matched by PATTERN are treated as one indivisible unit;

Note: the text matched by the pattern inside each pair of grouping parentheses is automatically recorded by the regular expression engine in internal variables named \1, \2, \3, ...

pat1\(pat2\)pat3\(pat4\(pat5\)pat6\)

\n: the string matched by the pattern between the nth opening parenthesis (counting from the left) and its matching closing parenthesis (the result of the match, not the pattern itself). In the example above, \1 is the text matched by pat2, \2 the text matched by pat4\(pat5\)pat6, and \3 the text matched by pat5.

\1: the string matched by the pattern in the first set of parentheses;
\2: the string matched by the pattern in the second set of parentheses;
Back reference: a reference to the string matched by the pattern in an earlier set of parentheses;
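The difference between repeating a pattern and referring back to what it matched shows up clearly in a small test. The four sentences below are sample data chosen for this sketch; the group \(l..e\) can match either "love" or "like".

```shell
# Hypothetical sample sentences.
sample=$(printf '%s\n' \
    'He loves his lover.' \
    'He likes his lover.' \
    'She likes her liker.' \
    'She loves her liker.')

# \1 must repeat the exact text captured by the group,
# so "love ... love" and "like ... like" match, but "like ... love" does not.
same_word=$(printf '%s\n' "$sample" | grep '\(l..e\).*\1')

# Writing the pattern twice only requires the same shape, so every line matches.
same_shape=$(printf '%s\n' "$sample" | grep -c 'l..e.*l..e')

printf '%s\n' "$same_word" "$same_shape"
```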

Other common options:
-E, --extended-regexp: use extended regular expressions;
-F, --fixed-strings: treat the pattern as a fixed string, with no regular expression support; equivalent to fgrep;
-G, --basic-regexp: use basic regular expressions;
-P, --perl-regexp: use PCRE (Perl-compatible) regular expressions;

-e PATTERN, --regexp=PATTERN: may be given several times to search for multiple patterns;
-f FILE, --file=FILE: FILE is a text file containing one pattern per line, that is, a grep script;
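A minimal sketch of -e, -F, and -f on a three-line input. The sample lines are invented, and the pattern file is a temporary file created with mktemp only for the demonstration.

```shell
# Hypothetical sample: the first line contains a literal dot.
sample=$(printf '%s\n' 'a.c' 'abc' 'xyz')

# -e may be repeated; a line matching ANY of the patterns is printed.
multi=$(printf '%s\n' "$sample" | grep -e 'abc' -e 'xyz')

# Without -F the dot is a metacharacter, so 'a.c' also matches abc.
as_regex=$(printf '%s\n' "$sample" | grep 'a.c')

# With -F the pattern is a plain string, so only the literal a.c matches.
as_fixed=$(printf '%s\n' "$sample" | grep -F 'a.c')

# -f reads the patterns from a file, one per line.
patterns=$(mktemp)
printf '%s\n' 'abc' 'xyz' > "$patterns"
from_file=$(printf '%s\n' "$sample" | grep -f "$patterns")
rm -f "$patterns"

printf '%s\n' "$multi" '--' "$as_regex" '--' "$as_fixed" '--' "$from_file"
```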

-A NUM: also display the NUM lines after each matching line;
-B NUM: also display the NUM lines before each matching line;
-C NUM: also display the NUM lines before and after each matching line;
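The three context options differ only in which neighbors of the matching line they add. A small sketch with a made-up five-line input:

```shell
# Hypothetical sample: five numbered words, one per line.
sample=$(printf '%s\n' 'one' 'two' 'three' 'four' 'five')

# -A 1: the match plus one line After it.
after=$(printf '%s\n' "$sample" | grep -A 1 'three')

# -B 1: the match plus one line Before it.
before=$(printf '%s\n' "$sample" | grep -B 1 'three')

# -C 1: the match plus one line of Context on each side.
around=$(printf '%s\n' "$sample" | grep -C 1 'three')

printf '%s\n' "$after" '--' "$before" '--' "$around"
```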

egrep:

The grep command with support for extended regular expressions; equivalent to grep -E;

egrep [OPTIONS] PATTERN [FILE...]

Metacharacters of extended regular expressions:
Character matching:

.: any single character
[]: any single character within the specified range
[^]: any single character outside the specified range

Number of matches:

*: any number of times;
?: 0 or 1 times;
+: 1 or more times;
{m}: exactly m times;
{m,n}: at least m times, at most n times;
{0,n}: at most n times;
{m,}: at least m times;
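In extended syntax the quantifiers are written without backslashes. The sketch below reuses the sample string from the basic-expression examples and checks that grep -E with the unescaped form finds the same matches as plain grep with the escaped form (assuming GNU grep for \+ and \?).

```shell
# The sample string from the basic regular expression examples.
sample='xxxyabcyabcabcxyabcy'

# Extended syntax: +, ?, and {m} need no backslash.
ere_plus=$(printf '%s\n' "$sample" | grep -Eo 'x+y')
ere_optional=$(printf '%s\n' "$sample" | grep -Eo 'x?y')
ere_two=$(printf '%s\n' "$sample" | grep -Eo 'x{2}y')

# Basic syntax equivalents, for comparison.
bre_plus=$(printf '%s\n' "$sample" | grep -o 'x\+y')
bre_optional=$(printf '%s\n' "$sample" | grep -o 'x\?y')
bre_two=$(printf '%s\n' "$sample" | grep -o 'x\{2\}y')

printf '%s\n' "$ere_plus" '--' "$ere_optional" '--' "$ere_two"
```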

Position anchoring:

^: beginning of the line
$: end of the line
\<, \b: beginning of the word
\>, \b: end of the word

Grouping and referencing:

(PATTERN): grouping; the characters matched by the pattern in parentheses are recorded in a variable inside the regular expression engine;

Back reference: \1, \2, ...

Or:

a|b: a or b
C|cat: C or cat
(C|c)at: Cat or cat
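The effect of parentheses on alternation can be verified directly. This is a minimal sketch with invented sample lines; the back-reference at the end relies on GNU grep accepting \1 in extended expressions.

```shell
# Hypothetical sample for alternation.
animals=$(printf '%s\n' 'Cat' 'cat' 'C' 'at' 'bat')

# C|cat : the alternatives are the whole strings "C" and "cat".
loose=$(printf '%s\n' "$animals" | grep -E 'C|cat')

# (C|c)at : the group limits the choice to the first letter only.
grouped=$(printf '%s\n' "$animals" | grep -E '(C|c)at')

# A group can also be referenced: (ab)\1 means "ab" followed by the same text again.
repeated=$(printf '%s\n' 'abab' 'abba' | grep -E '^(ab)\1$')

printf '%s\n' "$loose" '--' "$grouped" '--' "$repeated"
```

Without the parentheses the line C matches as well, because | has the lowest precedence and splits the whole pattern in two.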
