Python Regular Expressions

Source: Internet
Author: User

Regular expressions are not specific to Python. They are a powerful tool for working with strings, with their own syntax and their own processing engine. They may be less efficient than the built-in str methods, but they are far more expressive.

Because of this, regular expression syntax is largely the same in every language that provides it. What differs between programming languages is how much of that syntax each one supports.

Roughly speaking, matching works by comparing the expression with the text one character at a time. If every character matches, the match succeeds; as soon as one character fails to match, the match fails.

The following is a list of the regular expression metacharacters and syntax supported by Python.

Characters

Character            Description                               Example pattern   Matches
Ordinary characters  Match themselves                          abc               abc
.                    Any character except a line break         a.c               abc / acc / a2c
\                    Escape character: changes the meaning     a\*c              a*c
                     of the character that follows it. For
                     example, \* matches a literal *
[...]                Character set: matches any one of the     a[1-9e-g]c        a3c / agc
                     characters in the set

Ranges inside a character set can be abbreviated, for example [0-9] or [a-z]. A set that starts with ^, such as [^...], matches any character that is not listed. Special characters lose their special meaning inside a set, so [*] matches a literal *.
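
The character rules above can be checked with a short sketch. It uses re.fullmatch, which succeeds only when the whole string matches the pattern. The helper name matches is a hypothetical convenience introduced here, not part of the re module.

```python
import re

def matches(pattern, text):
    """Return True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# '.' matches any single character except a line break
dot_results = [matches(r'a.c', s) for s in ('abc', 'a2c', 'a\nc')]

# A backslash turns '*' into a literal character
escaped_result = matches(r'a\*c', 'a*c')

# A character set matches exactly one character from the set
set_results = [matches(r'a[1-9e-g]c', s) for s in ('a3c', 'agc', 'a0c')]

# Inside a set '*' is already literal, and a leading '^' negates the set
star_in_set = matches(r'a[*]c', 'a*c')
negated_set = matches(r'a[^0-9]c', 'a5c')

print(dot_results, escaped_result, set_results, star_in_set, negated_set)
```

Raw strings (the r prefix) are used so that Python does not interpret the backslashes before the regular expression engine sees them.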

Predefined character sets

Character   Description                                Example pattern   Matches
\s          Whitespace character                       a\sc              a c
\S          Non-whitespace character                   a\Sc              a1c / abc
\d          Digit                                      a\dc              a2c
\D          Non-digit character                        a\Dc              adc
\w          Letter, digit, or underscore               a\wc              a_c
\W          Any character other than a letter,         a\Wc              a c
            digit, or underscore
\n          Line break
\t          Tab
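
To see how the predefined sets divide up a string, here is a minimal sketch that applies each one to the same input. The sample string is made up for illustration.

```python
import re

# Illustrative input: letters, digits, an underscore, a space, a tab, and '!'
sample = 'a1 b_2\tc!'

digits = re.findall(r'\d', sample)        # digit characters
non_digits = re.findall(r'\D', sample)    # everything that is not a digit
word_chars = re.findall(r'\w', sample)    # letters, digits, underscore
non_word = re.findall(r'\W', sample)      # everything else
whitespace = re.findall(r'\s', sample)    # space, tab, line break, ...

print(digits, word_chars, non_word, whitespace)
```

Each uppercase set is the complement of its lowercase counterpart, so the \d and \D results together account for every character in the input.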

Quantifiers

A quantifier applies to the element immediately before it. In the first three examples that element is ., so the quantifier repeats "any character"; in the last three it is the single letter i.

Character     Description                                 Example pattern   Matches
*             Preceding element 0 or more times           Li.*              Li / Li Yang / Li Jie ...
?             Preceding element 0 or 1 times              Li.?              Li / Lin
+             Preceding element 1 or more times           Li.+              Li Yang / Li Jie ...
{m}           Preceding element exactly m times           Li{3}             Liii
{m,} / {,n}   Preceding element at least m times /        Li{2,}            Lii / Liii / Liiii ...
              0 to n times
{m,n}         Preceding element m to n times              Li{3,5}           Liii / Liiii / Liiiii
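
The sketch below tries each quantifier on a simple pattern where the repeated element is the letter b. It is an illustration under the assumption that whole-string matching (re.fullmatch) is what we want to test; the helper name whole is hypothetical.

```python
import re

def whole(pattern, text):
    """Return True if pattern matches the entire text (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# Each quantifier repeats only the 'b' that precedes it
star = [whole(r'ab*', s) for s in ('a', 'ab', 'abbb')]
optional = [whole(r'ab?', s) for s in ('a', 'ab', 'abb')]
plus = [whole(r'ab+', s) for s in ('a', 'ab', 'abbb')]
exact = [whole(r'ab{3}', s) for s in ('abb', 'abbb', 'abbbb')]
at_least = [whole(r'ab{2,}', s) for s in ('ab', 'abb', 'abbbbb')]
at_most = [whole(r'ab{,2}', s) for s in ('a', 'abb', 'abbb')]
bounded = [whole(r'ab{2,3}', s) for s in ('ab', 'abb', 'abbb', 'abbbb')]

print(star, optional, plus, exact, at_least, at_most, bounded)
```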

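Boundary matching

Character   Description
^           Matches at the start of the string
$           Matches at the end of the string
\A          Matches only at the start of the string
\Z          Matches only at the end of the string

In multiline mode, ^ and $ also match at the start and end of each line, while \A and \Z still match only at the start and end of the whole string.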
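
The difference between ^ / $ and \A / \Z only shows up with multi-line input. The following sketch, using a made-up two-line string, compares them with and without the re.MULTILINE flag.

```python
import re

# Illustrative two-line input
text = 'first line\nsecond line'

# Without flags, ^ and $ anchor to the start and end of the whole string
caret_default = re.findall(r'^\w+', text)
dollar_default = re.findall(r'\w+$', text)

# With re.MULTILINE, ^ and $ also anchor at every line boundary
caret_multiline = re.findall(r'^\w+', text, re.MULTILINE)
dollar_multiline = re.findall(r'\w+$', text, re.MULTILINE)

# \A and \Z are unaffected by re.MULTILINE
a_multiline = re.findall(r'\A\w+', text, re.MULTILINE)
z_multiline = re.findall(r'\w+\Z', text, re.MULTILINE)

print(caret_default, caret_multiline, a_multiline)
print(dollar_default, dollar_multiline, z_multiline)
```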

Commonly used functions in the re module

(1) findall: find all matches and return them as a list

>>> import re
>>> re.findall('o', 'hello,world')
['o', 'o']
>>> re.findall('l', 'hello,world')
['l', 'l', 'l']
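
findall is not limited to single characters. As a small sketch, the patterns below combine it with a character set, a quantifier, and the re.IGNORECASE flag; matching is case-sensitive unless that flag is given.

```python
import re

text = 'hello,world'

vowels = re.findall(r'[aeiou]', text)   # one list item per matching character
words = re.findall(r'\w+', text)        # one list item per run of word characters

# Matching is case-sensitive by default, so 'L' finds nothing here
upper_l = re.findall('L', text)
any_case_l = re.findall('L', text, re.IGNORECASE)

print(vowels, words, upper_l, any_case_l)
```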

(2) search: find the first match; call group() on the returned match object to get the matched text

>>> re.search('l', 'hello,world')
<re.Match object; span=(2, 3), match='l'>
>>> ret = re.search('l', 'hello,world')
>>> ret.group()  # use group() to get the matched text
'l'
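
search returns None when nothing matches, so calling group() directly on the result can fail. A minimal sketch of a safer wrapper follows; the function name first_match is hypothetical.

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None if there is no match."""
    m = re.search(pattern, text)
    return m.group() if m else None

found = first_match('l', 'hello,world')
missing = first_match('z', 'hello,world')

# A match object also records where the match was found
span = re.search('wor', 'hello,world').span()

print(found, missing, span)
```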

(3) match: match only at the start of the string; call group() on the returned match object to get the matched text

>>> ret1 = re.match('l', 'hello,world')  # the pattern must match at the first character
>>> ret1.group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
>>> ret1 = re.match('h', 'hello,world')
>>> ret1.group()
'h'
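
To make the difference between match and search concrete, this sketch runs the same patterns through both, and through re.fullmatch, which additionally requires the match to cover the whole string.

```python
import re

text = 'hello,world'

at_start = re.match('hello', text)        # succeeds: 'hello' is at position 0
not_at_start = re.match('world', text)    # None: 'world' is not at the start
anywhere = re.search('world', text)       # succeeds: search scans the string
whole_string = re.fullmatch('hello', text)  # None: text continues after 'hello'

print(at_start is not None, not_at_start, anywhere.group(), whole_string)
```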

(4) split: split a string wherever a regular expression matches

>>> ret2 = re.split('[ac]', 'abcacd')
>>> ret2
['', 'b', '', '', 'd']
# splitting at the first 'a' gives '' and 'bcacd'
# splitting 'bcacd' at 'c' gives 'b' and 'acd'
# splitting 'acd' at 'a' gives '' and 'cd'
# splitting 'cd' at 'c' gives '' and 'd'; the remaining 'd' cannot be split

Splitting a string with a regular expression is more flexible than splitting on a fixed character.

>>> re.split(r'\s', 'a b  c')
['a', 'b', '', 'c']
>>> re.split(r'\s+', 'a b  c')  # add + so that consecutive whitespace counts as one separator
['a', 'b', 'c']
>>> re.split(r'[\s,]', 'a, b,  c')
['a', '', 'b', '', '', 'c']
>>> re.split(r'[\s,]+', 'a, b,  c')  # + matches a run of commas and spaces as one separator
['a', 'b', 'c']
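
Two further details of re.split are sketched here: the maxsplit argument limits the number of splits, and empty strings in the result can be filtered out afterwards when the pattern has no +. The input string is illustrative.

```python
import re

text = 'a, b,  c'

# A run of commas and whitespace is treated as a single separator
parts = re.split(r'[\s,]+', text)

# maxsplit=1 stops after the first separator and keeps the rest intact
first_split = re.split(r'[\s,]+', text, maxsplit=1)

# Without '+', adjacent separators yield empty strings, which can be dropped
raw = re.split(r'[\s,]', text)
non_empty = [p for p in raw if p]

print(parts, first_split, raw, non_empty)
```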

(5) sub/subn: replace every match of a regular expression; subn also returns the number of replacements

>>> re.sub(r'\d', 'NUM', 'my old is 22')
'my old is NUMNUM'
>>> re.subn(r'\d', 'NUM', 'my old is 22')
('my old is NUMNUM', 2)
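
Because \d matches one digit at a time, each digit is replaced separately. The sketch below, on a made-up input string, contrasts that with \d+ and shows the optional count argument, which limits how many replacements are made.

```python
import re

# Illustrative input
text = 'room 12, floor 3'

each_digit = re.sub(r'\d', 'N', text)                # every digit replaced on its own
each_number = re.sub(r'\d+', 'NUM', text)            # every run of digits replaced once
only_first = re.sub(r'\d+', 'NUM', text, count=1)    # at most one replacement

# subn returns the new string together with the number of replacements
result, replacements = re.subn(r'\d+', 'NUM', text)

print(each_digit, each_number, only_first, result, replacements, sep='\n')
```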

(6) compile: compile a regular expression in advance into a pattern object that can be reused

>>> obj = re.compile('123')
>>> ret4 = re.search(obj, 'abc123456cda')  # equivalent to obj.search('abc123456cda')
>>> ret4.group()
'123'
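
A compiled pattern has its own search, match, findall, split, and sub methods, which is convenient when the same expression is applied many times. A minimal sketch, with illustrative input lines:

```python
import re

# Compile once, reuse for every line
number = re.compile(r'\d+')

lines = ['abc123456cda', 'no digits here', 'x7y88']

first_numbers = []
for line in lines:
    m = number.search(line)          # same as re.search(number, line)
    first_numbers.append(m.group() if m else None)

all_numbers = number.findall('x7y88')

print(first_numbers, all_numbers)
```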

(7) finditer: return an iterator that yields a match object for each match

>>> ret5 = re.finditer('l', 'hello,world')
>>> '__next__' in dir(ret5)
True
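
Unlike findall, which returns only the matched text, finditer gives access to each match object, including its position. A short sketch:

```python
import re

# Collect the matched text and its start index for every match
positions = [(m.group(), m.start()) for m in re.finditer('l', 'hello,world')]

# The iterator can also be advanced by hand with next()
it = re.finditer('o', 'hello,world')
first_o = next(it).span()

print(positions, first_o)
```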
