Python uses regular expressions

Last Update:2014-05-21 Source: Internet

Author: User

Using regular expressions in Python

1. Matching characters

The metacharacters in a regular expression are: . ^ $ * + ? {} [] \ | ()

The following escape sequences match classes of characters:

\d matches any digit

\D matches any non-digit

\s matches any whitespace character

\S matches any non-whitespace character

\w matches any letter, digit, or underscore

\W matches any character that is not a letter, digit, or underscore
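As a minimal sketch of how these classes behave, the snippet below runs each one over a sample string. The sample string and the variable names are made up for illustration.

```python
import re

# Illustrative sample string (an assumption, not taken from the article).
text = "Room 42, floor_3!"

digits = re.findall(r"\d", text)         # ['4', '2', '3']
non_digits = re.findall(r"\D+", text)    # ['Room ', ', floor_', '!']
spaces = re.findall(r"\s", text)         # [' ', ' ']
non_spaces = re.findall(r"\S+", text)    # ['Room', '42,', 'floor_3!']
words = re.findall(r"\w+", text)         # ['Room', '42', 'floor_3']
non_words = re.findall(r"\W+", text)     # [' ', ', ', '!']

for name, value in [("\\d", digits), ("\\D+", non_digits), ("\\s", spaces),
                    ("\\S+", non_spaces), ("\\w+", words), ("\\W+", non_words)]:
    print(name, value)
```

Note that the underscore counts as a word character, so floor_3 stays in one piece under \w+.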

2. Regular Expression

In Python, re.compile() compiles a regular expression into a pattern object, for example:

import re
p = re.compile('[a-c]')
p.match(s)

Here s is the string to be matched and match() is the matching method. Similar methods include:

match() checks for a match only at the beginning of the string

search() looks for a match anywhere in the string

findall() finds all matching substrings and returns them as a list

finditer() finds all matching substrings and returns them as an iterator of match objects
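The difference between these four methods is easiest to see when they are applied to the same input. The following is a small sketch; the pattern and the sample text are assumptions chosen for illustration.

```python
import re

pattern = re.compile(r"\d+")
# Illustrative input (an assumption): it does not start with a digit.
text = "order 66 shipped 7 items"

at_start = pattern.match(text)        # None: the string starts with 'o'
anywhere = pattern.search(text)       # first match anywhere: '66'
all_numbers = pattern.findall(text)   # list of strings: ['66', '7']
# finditer() yields match objects, so positions are available too.
spans = [m.span() for m in pattern.finditer(text)]   # [(6, 8), (17, 18)]

print(at_start, anywhere.group(), all_numbers, spans)
```

findall() is convenient when only the matched text matters, while finditer() is the better fit when the position of each match is needed as well.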

The match object returned by these calls also has several methods, such as:

group() returns the substring that matched the regular expression

start() returns the start position of the match

end() returns the end position of the match

span() returns a (start, end) tuple for the match

Example 1:

>>> import re
>>> p = re.compile('^[a-c]')
>>> q = p.match("abcd")
>>> print(q.group())
a
>>> q.span()
(0, 1)

Example 2:

>>> import re
>>> p = re.compile(r'\d+')
>>> q = p.findall('1 and 10 and 20')
>>> print(q)
['1', '10', '20']

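The same matching can also be written with the module-level functions, without compiling the pattern first:

re.match(r'\d+', 'd23r')

Example 3:

>>> p = re.match(r'\d+', 'd23r')
>>> print(p)
None

The result is None because match() only matches at the beginning of the string, and 'd23r' does not start with a digit.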

Optional flags:

re.compile('[a-c]', re.I): re.I makes matching case-insensitive.

re.compile('^AB$', re.M): re.M makes ^ and $ match at the beginning and end of each line as well as at the beginning and end of the string. Without this flag, they match only at the beginning and end of the string.
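A short sketch of both flags in action follows. The input strings are assumptions picked so that each flag changes the result.

```python
import re

# re.I: the class [a-c] also matches the upper-case letters A-C.
case_sensitive = re.findall(r"[a-c]", "aBcD")          # ['a', 'c']
case_insensitive = re.findall(r"[a-c]", "aBcD", re.I)  # ['a', 'B', 'c']

# re.M: ^ and $ also match at line boundaries inside the string.
text = "xx\nAB\nyy"   # illustrative three-line input (an assumption)
single_line = re.search(r"^AB$", text)        # None: AB is not the whole string
multi_line = re.search(r"^AB$", text, re.M)   # matches the middle line

print(case_sensitive, case_insensitive)
print(single_line, multi_line.span())
```

Flags can be combined with the | operator, for example re.I | re.M.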

Example 4:

re.compile("""
[1-3]  # 1-3
[a-c]  # a-c
""", re.VERBOSE)

re.VERBOSE lets the regular expression span multiple lines and lets you add a comment to each line.

The pattern above is equivalent to re.compile('[1-3][a-c]')
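The equivalence can be checked by running both forms against the same inputs. This is only a sketch; the list of sample strings is an assumption.

```python
import re

# Whitespace and '#' comments inside the pattern are ignored under re.VERBOSE.
verbose = re.compile(r"""
    [1-3]   # one digit from 1 to 3
    [a-c]   # one letter from a to c
""", re.VERBOSE)
compact = re.compile(r"[1-3][a-c]")

samples = ["2b", "4b", "1d", "3a"]   # illustrative inputs (an assumption)
verbose_hits = [s for s in samples if verbose.match(s)]
compact_hits = [s for s in samples if compact.match(s)]

print(verbose_hits, compact_hits)   # both: ['2b', '3a']
```

Because whitespace is ignored in verbose mode, a literal space in the pattern has to be written as \s, as an escaped space, or inside a character class.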

3. Group

Use parentheses () to group part of a pattern.

Example 5:

>>> p = re.compile('(12)+')
>>> m = p.match('121212')
>>> print(m.group())
121212

The pattern above matches 12 repeated one or more times.

You can also print the contents of a group. For a repeated group, this is the text of its last repetition:

>>> print(m.group(1))
12

Python captures the contents of each group automatically. If you do not want a group to be captured, use (?:...).

Example 6:

>>> import re
>>> s = "hello ab1cd"
>>> p = re.search('(?:h.*)(a.*)(c.*)', s)
>>> print("a* {0}".format(p.group(1)))
a* ab1
>>> print("c* {0}".format(p.group(2)))
c* cd

p.group(0) holds the match for the entire expression, p.group(1) holds the match for (a.*), and p.group(2) holds the match for (c.*). The h.* part is not captured because its group starts with ?:.
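A common use of a non-capturing group is to wrap alternatives without shifting the numbering of the groups you care about. The sketch below assumes a made-up sentence and title list purely for illustration.

```python
import re

# (?:Mr|Ms|Dr) groups the alternatives without capturing them,
# so the name is group 1 rather than group 2.
pattern = re.compile(r"(?:Mr|Ms|Dr)\. (\w+)")
m = pattern.search("Please contact Dr. Chen today")   # illustrative input

whole = m.group(0)      # 'Dr. Chen'
name = m.group(1)       # 'Chen'
captured = m.groups()   # ('Chen',)

print(whole, name, captured)
```

groups() returns only the captured groups, which makes it a quick way to confirm that a (?:...) group left nothing behind.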

With many groups, referring to them by number becomes awkward. In that case you can name the groups with (?P<name>...) and refer to them by name.

Example 7:

>>> import re
>>> s = "hello ab1cd"
>>> p = re.search('(?P<a>a.*)(?P<c>c.*)', s)
>>> print("a* {0}".format(p.group('a')))
a* ab1
>>> print("c* {0}".format(p.group('c')))
c* cd
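Named groups can also be collected in one step with groupdict(). The following sketch parses a date string; the group names year, month, and day are assumptions chosen for the example.

```python
import re

# Group names here are illustrative, not taken from the article.
date = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
m = date.search("Last update: 2014-05-21")

year = m.group("year")    # '2014'
fields = m.groupdict()    # {'year': '2014', 'month': '05', 'day': '21'}
# A named group is still reachable by its number.
same = m.group(1) == m.group("year")   # True

print(year, fields, same)
```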

4. Greedy and non-greedy modes

In greedy mode, * and + match as many characters as possible, for example:

Example 8:

>>> import re
>>> p = re.compile('
>>> m = p.findall('
>>> print(m)
['<H1>

Sometimes you want the pattern to produce two separate matches instead. Appending ? to the quantifier (*? or +?) switches it to non-greedy mode, which matches as few characters as possible:

Example 9:

>>> import re
>>> p = re.compile('
>>> m = p.findall('
>>> print(m)
['<H1>
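As a minimal sketch of the greedy versus non-greedy difference, assume an input containing two h1 elements. The pattern and the input string below are illustrative and are not the ones used in Examples 8 and 9.

```python
import re

# Illustrative input (an assumption): two adjacent <h1> elements.
html = "<h1>first</h1><h1>second</h1>"

# Greedy: .* runs to the last </h1>, giving one long match.
greedy = re.findall(r"<h1>.*</h1>", html)
# Non-greedy: .*? stops at the first </h1>, giving one match per element.
lazy = re.findall(r"<h1>.*?</h1>", html)

print(greedy)   # ['<h1>first</h1><h1>second</h1>']
print(lazy)     # ['<h1>first</h1>', '<h1>second</h1>']
```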

5. Lookahead and lookbehind assertions

To match pattern A only when it is followed by pattern B, use A(?=B). To match pattern A only when it is not followed by B, use A(?!B). In both cases B is checked but is not part of the match.

Example 10:

>>> import re
>>> s = "ab2cd"
>>> m = re.search("ab2(?=cd)", s)
>>> print(m.group())
ab2

Example 11:

>>> import re
>>> s = 'ab2cd'
>>> m = re.search('ab2(?!cd)', s)
>>> print(m)
None
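Lookahead is handy for picking out values by what follows them. The sketch below assumes a made-up string of CSS-like lengths. It also shows a pitfall: a negative lookahead after \d+ can succeed on a shortened match after backtracking, so the assertion here excludes a following digit as well.

```python
import re

# Illustrative input (an assumption).
text = "width 100px, height 50em, margin 8px"

# Numbers that are followed by 'px'; the unit itself is not consumed.
px_values = re.findall(r"\d+(?=px)", text)          # ['100', '8']

# Naive negative lookahead: '100' fails, but the engine backtracks to '10',
# which is followed by '0' rather than 'px', so '10' is reported.
naive_non_px = re.findall(r"\d+(?!px)", text)       # ['10', '50']

# Also forbidding a following digit prevents matching a partial number.
non_px = re.findall(r"\d+(?!\d|px)", text)          # ['50']

print(px_values, naive_non_px, non_px)
```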

Similarly, to match pattern B only when pattern A appears immediately before it, use (?<=A)B.

To match pattern B only when pattern A does not appear immediately before it, use (?<!A)B.

Example 12:

>>> import re
>>> s = "ab2cd"
>>> m = re.search('(?<=ab2)cd', s)
>>> print(m.group())
cd

Example 13:

>>> import re
>>> string = "ab2cd"
>>> pattern = re.search(r'(?<!ab2)cd', string)
>>> print(pattern)
None
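Lookbehind works the same way for what comes before a match. The following sketch separates dollar amounts from other numbers in a made-up string. Note that Python's re module requires the lookbehind pattern to have a fixed width, which the single-character patterns here satisfy.

```python
import re

# Illustrative input (an assumption).
text = "price $30, qty 12, refund $5"

# Digits immediately preceded by '$'; the '$' is not part of the match.
dollar_amounts = re.findall(r"(?<=\$)\d+", text)        # ['30', '5']

# Digits preceded by neither '$' nor another digit. Excluding a preceding
# digit stops the search from restarting in the middle of '30'.
other_numbers = re.findall(r"(?<![\$\d])\d+", text)     # ['12']

print(dollar_amounts, other_numbers)
```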

This article is an English version of an article originally published in Chinese on aliyun.com.
