Python Regular Expressions

Last Update: 2017-10-03 Source: Internet

Author: User

Metacharacters describe what to match: a character, a position, a repetition count, or a pattern.

List of common metacharacters

. matches any character except a line break

\b matches the beginning or end of a word

\d matches a digit

\w matches a letter, digit, or underscore (for str patterns in Python 3 this also covers Unicode word characters such as Chinese characters)

\s matches any whitespace character, including spaces, tabs, line breaks, and full-width (ideographic) spaces

^ matches the start of the string

$ matches the end of the string
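
The metacharacters listed above can be tried out with a short sketch like the following. The sample strings and variable names are illustrative assumptions, not part of the original article.

```python
import re

text = "Order 66 shipped\non 2017-10-03"

# \d matches a single digit; findall returns every non-overlapping match.
digits = re.findall(r"\d", text)

# \w+ matches runs of word characters (letters, digits, underscore).
words = re.findall(r"\w+", text)

# \bon\b matches "on" only as a whole word, not inside "bonbon".
whole_word = re.findall(r"\bon\b", "on and on, bonbon")

# . does not match a line break by default, so .* stops at the end of line 1.
first_line = re.match(r".*", text).group()

# ^ and $ anchor the pattern to the start and end of the string.
is_date = re.search(r"^\d\d\d\d-\d\d-\d\d$", "2017-10-03") is not None
```

Raw strings (the r"..." prefix) keep Python from interpreting the backslashes before the re module sees them.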

Common quantifiers

* repeats 0 or more times

+ repeats 1 or more times

? repeats 0 or 1 time

{n} repeats exactly n times

{n,} repeats n or more times

{n,m} repeats n to m times
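
As a rough illustration of the quantifiers above, the sketch below checks whole strings against each one. The helper name fits and the test strings are hypothetical, chosen only for this example.

```python
import re

def fits(pattern, s):
    """Return True when the whole string s matches pattern."""
    return re.fullmatch(pattern, s) is not None

# * : zero or more b after the a
star = [fits(r"ab*", s) for s in ("a", "ab", "abbb")]
# + : one or more b after the a
plus = [fits(r"ab+", s) for s in ("a", "ab", "abbb")]
# ? : zero or one b after the a
optional = [fits(r"ab?", s) for s in ("a", "ab", "abb")]
# {n} : exactly three digits
exact = [fits(r"\d{3}", s) for s in ("12", "123", "1234")]
# {n,} : two or more digits
at_least = [fits(r"\d{2,}", s) for s in ("1", "12", "12345")]
# {n,m} : between two and four digits
between = [fits(r"\d{2,4}", s) for s in ("1", "123", "12345")]
```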

Common negated forms

\W matches any character that is not a letter, digit, or underscore

\S matches any character that is not whitespace

\D matches any character that is not a digit

\B matches a position that is not the beginning or end of a word

[^a] matches any character except a

[^ABCDE] matches any character other than the letters A, B, C, D, E

[^(123|abc)] matches any character except 1, 2, 3, a, b, c; inside a character class the parentheses and | are literal characters, so they are excluded as well
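
The negated forms can be sketched as follows. The input strings are made up for illustration; the last line shows that parentheses and | inside square brackets are treated as ordinary characters.

```python
import re

# \D: remove everything that is not a digit.
non_digits_removed = re.sub(r"\D", "", "tel: 555-0199 (home)")

# \W: characters that are not letters, digits, or underscore.
non_word = re.findall(r"\W", "a_b c-d")

# \S+: runs of non-whitespace characters.
non_space = re.findall(r"\S+", " a  b\tc ")

# \Bon\B: "on" with word characters on both sides (only inside "bonus").
inside_word = re.findall(r"\Bon\B", "on bonus son")

# [^aeiou]: delete every character that is not a lowercase vowel.
vowels_only = re.sub(r"[^aeiou]", "", "regular expression")

# ( | ) are literal inside a class, so they are excluded along with 1 2 3 a b c.
literal_class = re.findall(r"[^(123|abc)]", "a1(|)xz")
```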

Backreferences:

Enclosing a subexpression in parentheses makes it a group. By default each group is numbered automatically: groups are counted from left to right by the position of their opening parenthesis, and the first group is number 1. A backreference such as \1 matches the same text that the group with that number captured.
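
A small sketch of the numbering rule and of a numbered backreference. The date string and the doubled-word sentence are illustrative assumptions.

```python
import re

# Groups are numbered by the position of their opening parenthesis:
# group 1 = whole date, group 2 = year, group 3 = month, group 4 = day.
m = re.match(r"((\d{4})-(\d{2})-(\d{2}))", "2017-10-03")
groups = m.groups()

# \1 must match the same text that group 1 captured,
# so this pattern finds words that are repeated back to back.
doubled = re.findall(r"\b(\w+) \1\b", "this is is a test test case")
```

Note that the outer group gets number 1 even though it closes last, because only the opening parenthesis counts.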

Table: common grouping syntax

Category: syntax and meaning

Capture: (exp) matches exp and captures the text into an automatically numbered group

Capture: (?P<name>exp) matches exp and captures the text into a group named name

Capture: (?:exp) matches exp, does not capture the matched text, and does not assign a group number

Zero-width assertion: (?=exp) matches a position that is followed by exp

Zero-width assertion: (?<=exp) matches a position that is preceded by exp

Zero-width assertion: (?!exp) matches a position that is not followed by exp

Zero-width assertion: (?<!exp) matches a position that is not preceded by exp

Comment: (?#comment) has no effect on how the regular expression is processed and only provides a comment for human readers
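
The capture and comment forms in the table can be sketched like this. The e-mail-like sample text and the group names user and tld are hypothetical; the pattern is far too loose for real address validation.

```python
import re

# (?P<user>...) and (?P<tld>...) are named groups; (?:...) groups without capturing.
pattern = re.compile(r"(?P<user>\w+)@(?:\w+\.)+(?P<tld>\w+)")
m = pattern.search("mail bob@mail.example.com now")

user = m.group("user")
tld = m.group("tld")

# Only the two named groups count; the (?:...) group gets no number.
capturing_groups = pattern.groups

# (?#...) is ignored by the engine and only documents the pattern.
with_comment = re.search(r"\d+(?#a run of digits)px", "width: 12px").group()
```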

Zero-width assertion: like \b and ^, these constructs match a position rather than characters, and the position has to satisfy a condition. That condition is called an assertion, or zero-width assertion.

Zero-width positive lookahead assertion: (?=exp) asserts that the text immediately after this position matches exp. For example, \b[a-z]+(?=ing\b) matches the part of a word that precedes a final ing: searching "I love cooking and singing" matches cook and sing.

To evaluate a lookahead, the engine first matches the preceding expression and then checks whether exp follows at that position, without consuming it. If the check fails, the engine backtracks to a shorter match or moves on to the next starting position.
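
A minimal sketch of a positive lookahead on the sentence from the example. The exact pattern used here, with word boundaries added, is an assumption made so that only whole-word endings are considered.

```python
import re

sentence = "I love cooking and singing"

# Match the lowercase letters of a word that are followed by a final "ing".
stems = re.findall(r"\b[a-z]+(?=ing\b)", sentence)

# The lookahead is zero-width: "ing" is checked but not included in the match.
m = re.search(r"cook(?=ing)", sentence)
matched_text = m.group()
match_length = m.end() - m.start()
```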

Zero-width positive lookbehind assertion: (?<=exp) asserts that the text immediately before this position matches exp. For example, (?<=abc).* matches what follows abc in a string: in abcdefgabc it matches defgabc, and the leading abc is not part of the match. The lookbehind is the mirror image of the lookahead. The engine scans from the left end of the string, and at each position checks whether the text just before it matches exp. At the first position where it does, the rest of the pattern is matched; if that fails, the engine continues to the next position where the assertion holds, and so on.
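
The lookbehind example above can be reproduced with a short sketch. The second pattern, which picks out an amount after a dollar sign, is an added illustration and not from the original text.

```python
import re

# Everything after the first "abc"; the "abc" itself is not consumed.
tail = re.search(r"(?<=abc).*", "abcdefgabc").group()

# Digits that are directly preceded by a dollar sign.
amounts = re.findall(r"(?<=\$)\d+", "cost $30 or 40")
```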

Zero-width negative lookahead assertion: (?!exp) asserts that the text after this position does not match exp. For example, \b((?!abc)\w)+\b matches words that do not contain the consecutive string abc.
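
A sketch of the negative lookahead pattern above. It uses a non-capturing group and finditer so that the whole word is returned; with a capturing group, findall would return only the last character captured. The word list is an illustrative assumption.

```python
import re

text = "abcde xabcy hello cab"

# Each character of the word is accepted only if "abc" does not start there.
pattern = re.compile(r"\b(?:(?!abc)\w)+\b")
words_without_abc = [m.group() for m in pattern.finditer(text)]
```

Here abcde and xabcy are rejected because abc appears inside them, while cab is accepted because its letters never form abc in order.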

Zero-width negative lookbehind assertion: (?<!exp) asserts that the text before this position does not match exp. For example, (?<![a-z])\d{7} matches seven digits that are not preceded by a lowercase letter.
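
The negative lookbehind can be checked with a sketch like this one; the sample string is made up for the purpose.

```python
import re

text = "a1234567 B7654321 9999999"

# Seven digits, rejected when the character just before them is a lowercase letter.
numbers = re.findall(r"(?<![a-z])\d{7}", text)
```

The first number is skipped because it follows a; the uppercase B and the space do not trigger the assertion.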

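To match the content inside simple HTML tags that have no attributes, some regex engines accept (?<=<(\w+)>).*(?=<\/\1>). Python's re module, however, requires a lookbehind to have a fixed width, so it rejects this pattern because of the (\w+) inside (?<=...). In Python the same result can be obtained by capturing the content in a group, for example <(\w+)>(.*?)<\/\1>, and reading group 2.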
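
The sketch below tries the tag pattern in Python. It assumes the standard re module, where a lookbehind containing (\w+) is not fixed-width and fails to compile; the capturing alternative and the sample HTML are illustrative assumptions, and neither pattern is a substitute for a real HTML parser.

```python
import re

# Python's re module requires lookbehinds to have a fixed width,
# so compiling the pattern with (\w+) inside (?<=...) raises re.error.
try:
    re.compile(r"(?<=<(\w+)>).*(?=</\1>)")
    lookbehind_compiles = True
except re.error:
    lookbehind_compiles = False

# Alternative: capture the tag name in group 1 and the content in group 2.
tag_pattern = re.compile(r"<(\w+)>(.*?)</\1>")
html = "<b>bold</b> and <i>italic</i>"
contents = [m.group(2) for m in tag_pattern.finditer(html)]
tags = [m.group(1) for m in tag_pattern.finditer(html)]
```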

Greed and laziness

When a regular expression contains a quantifier that accepts repetition, the default behavior is to match as many characters as possible while still letting the whole expression match.

This is greedy mode. Take the expression a\w+b as an example: when searching a12b34b, it matches as much as possible, so it matches the entire a12b34b instead of a12b.

To match only a12b, switch to lazy mode by changing a\w+b to a\w+?b.
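
The greedy and lazy versions of the example can be compared directly:

```python
import re

text = "a12b34b"

# Greedy: \w+ takes as many characters as it can, so the match ends at the last b.
greedy = re.search(r"a\w+b", text).group()

# Lazy: \w+? takes as few characters as it can, so the match ends at the first b.
lazy = re.search(r"a\w+?b", text).group()
```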

Lazy quantifiers

*? repeats any number of times, but as few times as possible

+? repeats 1 or more times, but as few times as possible

?? repeats 0 or 1 time, but as few times as possible

{n,m}? repeats n to m times, but as few times as possible

{n,}? repeats n or more times, but as few times as possible
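
Each lazy quantifier can be set against its greedy counterpart in a sketch like the following. The inputs are arbitrary examples.

```python
import re

# *? stops at the first closing bracket, * runs to the last one.
star_greedy = re.search(r"<.*>", "<a><b>").group()
star_lazy = re.search(r"<.*?>", "<a><b>").group()

# +? takes a single digit, the minimum allowed.
plus_lazy = re.search(r"\d+?", "12345").group()

# ?? prefers zero repetitions, so the optional b is left out.
optional_greedy = re.match(r"ab?", "ab").group()
optional_lazy = re.match(r"ab??", "ab").group()

# {n,m}? and {n,}? both stop at the lower bound when nothing else follows.
range_greedy = re.search(r"\d{2,4}", "12345").group()
range_lazy = re.search(r"\d{2,4}?", "12345").group()
open_lazy = re.search(r"\d{2,}?", "12345").group()
```

A lazy quantifier still extends as far as needed when the rest of the pattern cannot match otherwise.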

Table: Python-specific matching rules

Syntax: meaning; example expression; string it fully matches

\A: matches only at the start of the string; \Aabc; abc

\Z: matches only at the end of the string; abc\Z; abc

(?P<name>...): a group that gets the alias name in addition to its number; (?P<word>abc){2}; abcabc

(?P=name): matches the text captured by the group whose alias is name; (?P<id>\d)abc(?P=id); 1abc1, 5abc5
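
The table entries can be verified with a brief sketch. The comparison with ^ and $ under re.MULTILINE is an added illustration of why \A and \Z are useful, not something stated in the original.

```python
import re

multi = "abc\nabc"

# With re.MULTILINE, ^ and $ match at every line, but \A and \Z
# still match only at the very start and very end of the string.
caret_hits = len(re.findall(r"^abc", multi, re.MULTILINE))
start_hits = len(re.findall(r"\Aabc", multi, re.MULTILINE))
dollar_hits = len(re.findall(r"abc$", multi, re.MULTILINE))
end_hits = len(re.findall(r"abc\Z", multi, re.MULTILINE))

# A named group can be quantified like any other group.
repeated = re.fullmatch(r"(?P<word>abc){2}", "abcabc") is not None

# (?P=id) must match the same digit that the group named id captured.
same_digit = [
    re.fullmatch(r"(?P<id>\d)abc(?P=id)", s) is not None
    for s in ("1abc1", "5abc5", "1abc2")
]
```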

In Python, matching is done mainly through the functions of the re module.
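
The article does not say which functions it has in mind, so the sketch below assumes the most commonly used ones: re.compile, match, search, findall, sub, and re.split. The date strings are illustrative.

```python
import re

text = "2017-10-03 and 2018-01-15"

# compile builds a reusable pattern object.
date = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# match only succeeds at the start of the string.
first = date.match(text).group()
no_match = date.match("on 2018-01-15")

# search finds the first match anywhere in the string.
found = date.search("on 2018-01-15").group()

# findall returns one tuple of groups per match.
all_dates = date.findall(text)

# sub rewrites each match; \3, \2, \1 refer to the captured groups.
swapped = date.sub(r"\3/\2/\1", text)

# split cuts the string wherever the pattern matches.
parts = re.split(r"\s+and\s+", text)
```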

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
