Python Regular Expressions

Last Update:2017-11-13 Source: Internet

Author: User


Regular expressions are not part of the Python language itself; Python supports them through the re module. They are a powerful tool for working with strings, with their own syntax and an independent processing engine. They may be less efficient than the built-in str methods, but they are far more expressive.
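As a rough illustration of that trade-off, the sketch below performs the same fixed-text check with a plain str operation and with the re module, then a search that only a pattern can express. The sample text and variable names are made up for this example.

```python
import re

text = "order 66 shipped, order 1024 pending"

# A fixed substring: a plain str operation is enough.
has_order_str = "order" in text

# The same check through the re module.
has_order_re = re.search(r"order", text) is not None

# "order followed by a number" is not one fixed substring,
# but it is a short pattern.
order_numbers = re.findall(r"order \d+", text)

print(has_order_str, has_order_re, order_numbers)
```

For fixed text the str version is simpler and usually faster; the pattern version pays off once the thing being searched for varies.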

Because of this, regular expression syntax is largely the same in every language that provides regular expressions; what differs is how much of the syntax each language supports.

Roughly, matching works by comparing the expression against the text one character at a time: if every element of the expression matches, the match succeeds; as soon as one element cannot be matched, the match fails.

The following is a list of the regular expression metacharacters and syntax supported by Python.

**Character**
Character	Description	Example pattern	Matches
Ordinary characters	Match themselves	abc	abc
.	Any character except a line break	a.c	abc / acc / a2c
\	Escape character: changes the meaning of the character that follows. For example, write \* to match a literal *.	a\*c	a*c
[]	Character set: matches any one of the characters in the set. Ranges can be abbreviated, for example [0-9] or [a-z]. A leading ^, as in [^...], matches any character not in the set. Most special characters lose their special meaning inside a set.	a[1-9e-g]c	a3c / agc
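The rows above can be checked with a short sketch. It uses re.fullmatch so that a pattern has to cover the whole candidate string; the sample strings are chosen only for illustration.

```python
import re

samples = ["abc", "a2c", "a*c", "a3c", "agc", "a\nc"]

# "." matches any single character except a line break.
dot = [s for s in samples if re.fullmatch(r"a.c", s)]

# A backslash removes the special meaning of "*".
star = [s for s in samples if re.fullmatch(r"a\*c", s)]

# A character set matches exactly one of the listed characters or ranges.
char_set = [s for s in samples if re.fullmatch(r"a[1-9e-g]c", s)]

# "^" at the start of a set negates it (a line break is "not a digit" too).
negated = [s for s in samples if re.fullmatch(r"a[^0-9]c", s)]

print(dot)       # ['abc', 'a2c', 'a*c', 'a3c', 'agc']
print(star)      # ['a*c']
print(char_set)  # ['a2c', 'a3c', 'agc']
print(negated)
```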

**Pre-defined character set**
Character	Description	Example pattern	Matches
\s	Whitespace characters	a\sc	a c
\S	Non-whitespace characters	a\Sc	a1c / abc
\d	Digits	a\dc	a2c
\D	Non-digit characters	a\Dc	adc
\w	Letters, digits, and underscore	a\wc	a_c
\W	Any character other than a letter, digit, or underscore	a\Wc	a c
\n	Line break
\t	Tab
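A small sketch of the pre-defined sets, run against one made-up string that contains letters, digits, an underscore, spaces, a tab, and a line break:

```python
import re

text = "id_7 = 42\tok\n"

whitespace = re.findall(r"\s", text)   # spaces, the tab, the line break
digits = re.findall(r"\d", text)       # each digit separately
word_chars = re.findall(r"\w", text)   # letters, digits, underscore
non_word = re.findall(r"\W", text)     # everything \w does not match
non_space_count = len(re.findall(r"\S", text))

print(whitespace)  # [' ', ' ', '\t', '\n']
print(digits)      # ['7', '4', '2']
print(word_chars)
print(non_word)
print(non_space_count)
```

Note that the uppercase form of each set is the complement of the lowercase one, so every character lands in exactly one of \w or \W.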

**Quantifiers**
Character	Description	Example pattern	Matches
*	Match the preceding item 0 or more times (so .* matches any run of characters)	li.*	li / lin / linda
?	Match the preceding item 0 or 1 time	li.?	li / lin
+	Match the preceding item 1 or more times	li.+	lin / linda
{m}	Match the preceding item exactly m times (parentheses group li into a single item)	(li){3}	lilili
{m,} / {,n}	Match the preceding item at least m times / at most n times	(li){2,}	lili / lilili / lililili
{m,n}	Match the preceding item between m and n times	(li){3,5}	lilili / lililili / lilililili
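To see how each quantifier limits the number of repetitions, the sketch below tests one repeated character against strings of different lengths. The helper name matches and the candidate list are assumptions made for this example.

```python
import re

def matches(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

candidates = ["", "a", "aa", "aaa", "aaaa", "aaaaaa"]

zero_or_more = matches(r"a*", candidates)     # every candidate
zero_or_one = matches(r"a?", candidates)      # ['', 'a']
one_or_more = matches(r"a+", candidates)      # everything except ''
exactly_three = matches(r"a{3}", candidates)  # ['aaa']
at_least_two = matches(r"a{2,}", candidates)
at_most_two = matches(r"a{,2}", candidates)
three_to_five = matches(r"a{3,5}", candidates)

print(at_least_two)   # ['aa', 'aaa', 'aaaa', 'aaaaaa']
print(at_most_two)    # ['', 'a', 'aa']
print(three_to_five)  # ['aaa', 'aaaa']
```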

**Boundary matching**
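Character	Description
^	Matches at the start of the string (and at the start of each line in multiline mode)
$	Matches at the end of the string (and at the end of each line in multiline mode)
\A	Matches only at the start of the string
\Z	Matches only at the end of the string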
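The difference between the line anchors and the string anchors only shows up on multi-line text. The sketch below uses a made-up three-line string and the re.MULTILINE flag; Python spells the string-only anchors \A and \Z.

```python
import re

text = "one\ntwo\nthree"

# Without flags, ^ only matches at the very start of the string.
caret_default = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ and $ also match at every line boundary.
caret_multiline = re.findall(r"^\w+", text, re.MULTILINE)
dollar_multiline = re.findall(r"\w+$", text, re.MULTILINE)

# \A and \Z are not affected by re.MULTILINE.
start_only = re.findall(r"\A\w+", text, re.MULTILINE)
end_only = re.findall(r"\w+\Z", text, re.MULTILINE)

print(caret_default)     # ['one']
print(caret_multiline)   # ['one', 'two', 'three']
print(dollar_multiline)  # ['one', 'two', 'three']
print(start_only)        # ['one']
print(end_only)          # ['three']
```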

Commonly used functions in the re module

(1) findall: find all matches and return them as a list

    >>> import re
    >>> re.findall('o', 'Hello,world')
    ['o', 'o']
    >>> re.findall('l', 'Hello,world')
    ['l', 'l', 'l']
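One detail worth checking with findall is that matching is case-sensitive by default. A minimal sketch, using the same string as above and the standard re.IGNORECASE flag:

```python
import re

text = "Hello,world"

lower_l = re.findall(r"l", text)                     # three lowercase "l"
upper_l = re.findall(r"L", text)                     # no uppercase "L" in the text
any_case_l = re.findall(r"L", text, re.IGNORECASE)   # flag makes the pattern case-insensitive

print(lower_l)     # ['l', 'l', 'l']
print(upper_l)     # []
print(any_case_l)  # ['l', 'l', 'l']
```

When nothing matches, findall returns an empty list rather than None, so the result can always be looped over safely.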

(2) search: find the first match; use group() to get the matched text

    >>> re.search('l', 'hello,world')
    <_sre.SRE_Match object; span=(2, 3), match='l'>
    >>> ret = re.search('l', 'hello,world')
    >>> ret.group()  # use group() to print the matched text
    'l'

(3) match: match only at the start of the string; use group() to get the matched text

    >>> ret1 = re.match('l', 'Hello,world')  # must match from the first character
    >>> ret1.group()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'NoneType' object has no attribute 'group'
    >>> ret1 = re.match('H', 'Hello,world')
    >>> ret1.group()
    'H'
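Both search and match return None when nothing matches, which is what causes the AttributeError shown above. One way to guard against that is sketched below; the helper name first_match is an assumption made for this example, not part of the re module.

```python
import re

def first_match(func, pattern, text):
    """Return the matched text, or None when the pattern does not match."""
    m = func(pattern, text)
    return m.group() if m is not None else None

text = "Hello,world"

search_l = first_match(re.search, r"l", text)  # found anywhere in the text
match_l = first_match(re.match, r"l", text)    # the text does not start with "l"
match_h = first_match(re.match, r"H", text)    # the text starts with "H"
span_l = re.search(r"l", text).span()          # position of the first "l"

print(search_l, match_l, match_h, span_l)  # l None H (2, 3)
```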

(4) split: split a string wherever the regular expression matches

    >>> ret2 = re.split('[ac]', 'abcacd')
    >>> ret2
    ['', 'b', '', '', 'd']

The string is split at each a or c in turn: splitting at the first a gives '' and 'bcacd'; splitting 'bcacd' at c gives 'b' and 'acd'; splitting 'acd' at a gives '' and 'cd'; splitting 'cd' at c gives '' and 'd', which cannot be split any further.

Splitting a string with a regular expression is more flexible than splitting on a fixed character.

    >>> re.split(r'\s', 'a b  c')
    ['a', 'b', '', 'c']
    >>> re.split(r'\s+', 'a b  c')  # a "+" must be added to treat a run of spaces as one separator
    ['a', 'b', 'c']
    >>> re.split(r'[\s,]', 'a, b,  c')
    ['a', '', 'b', '', '', 'c']
    >>> re.split(r'[\s,]+', 'a, b,  c')  # the plus sign matches runs of commas and spaces
    ['a', 'b', 'c']
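For comparison, here is a minimal sketch that puts str.split next to re.split on the same made-up input, and also shows the maxsplit argument of re.split, which limits how many splits are made.

```python
import re

text = "a, b,  c"

fixed = text.split(",")                           # one fixed separator; spaces are left in
flexible = re.split(r"[\s,]+", text)              # any run of commas and whitespace
limited = re.split(r"[\s,]+", text, maxsplit=1)   # stop after the first split

print(fixed)     # ['a', ' b', '  c']
print(flexible)  # ['a', 'b', 'c']
print(limited)   # ['a', 'b,  c']
```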

(5) sub/subn: replace the parts of a string that match a regular expression; subn also returns the number of replacements

    >>> re.sub(r'\d', 'NUM', 'my old is 22')
    'my old is NUMNUM'
    >>> re.subn(r'\d', 'NUM', 'my old is 22')
    ('my old is NUMNUM', 2)
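Two further options of sub are sketched below on a made-up string: the count argument limits the number of replacements, and the replacement may be a function that receives each match object.

```python
import re

text = "room 12, floor 3"

all_replaced = re.sub(r"\d", "#", text)            # every digit
first_only = re.sub(r"\d", "#", text, count=1)     # only the first digit

# The replacement can be a function called once per match.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), text)

# subn returns the new string together with the number of replacements.
replaced, n = re.subn(r"\d+", "N", text)

print(all_replaced)  # room ##, floor #
print(first_only)    # room #2, floor 3
print(doubled)       # room 24, floor 6
print(replaced, n)   # room N, floor N 2
```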

(6) compile: compile a regular expression in advance into a pattern object that can be reused

    >>> obj = re.compile('123')
    >>> ret4 = re.search(obj, 'abc123456cda')
    >>> ret4.group()
    '123'
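A compiled pattern also has its own search, match, findall, and similar methods, which is convenient when the same expression is applied to many strings. A minimal sketch, with made-up input lines:

```python
import re

number = re.compile(r"\d+")  # compiled once, reused below

lines = ["abc123456cda", "no digits here", "x9y"]

found = []
for line in lines:
    m = number.search(line)           # same as re.search(number, line)
    found.append(m.group() if m else None)

print(found)  # ['123456', None, '9']
```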

(7) finditer: return an iterator over the matches

    >>> ret5 = re.finditer('l', 'hello,world')
    >>> '__next__' in dir(ret5)
    True
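Each item produced by the iterator is a match object, so the matched text and its position are both available. A short sketch using the same string:

```python
import re

positions = [
    (m.group(), m.start(), m.end())
    for m in re.finditer(r"l", "hello,world")
]

print(positions)  # [('l', 2, 3), ('l', 3, 4), ('l', 9, 10)]
```

Compared with findall, which returns only the matched strings, finditer is the better fit when the positions are needed or when the text is large and the matches should be processed one at a time.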

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
