Regular expressions are not part of the Python language itself. They are a powerful tool for working with strings, with their own syntax and an independent processing engine. They may be less efficient than the built-in str methods, but they are far more expressive.
Because of this independence, regular expression syntax is essentially the same in every language that provides it; what differs between programming languages is how much of the syntax each one supports.
Roughly, matching works by comparing the expression against the text one character at a time: if every character matches, the match succeeds; as soon as one character fails to match, the match fails.
The following is a list of the regular expression metacharacters and syntax supported by Python.
Character
Character | Description | Example pattern | Example matches |
Ordinary characters | Match themselves | abc | abc |
. | Any character other than a line break | a.c | abc/acc/a2c |
\ | Escape character: changes the meaning of the character that follows it. For example, to match a literal *, write \* | a\*c | a*c |
[] | Character set: matches any one of the characters in the set. Ranges can be abbreviated, for example [0-9] or [a-z]. [^...] matches any character that is not in the set. Inside a set, most special characters lose their special meaning, so [*] matches a literal * | a[1-9e-g]c | a3c/agc |
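The basic metacharacters above can be tried out with a short script. This is a minimal sketch; the sample strings are made up for illustration and are not taken from the original article.

```python
import re

# Ordinary characters match themselves.
literal = re.findall(r"abc", "abc abd abc")

# "." matches any single character except a line break.
dot = re.findall(r"a.c", "abc acc a2c a\nc")

# A backslash removes the special meaning of "*".
escaped = re.findall(r"a\*c", "a*c abc ac")

# A character set matches exactly one of the listed characters or ranges.
char_set = re.findall(r"a[1-9e-g]c", "a3c agc a0c ahc")

# "^" as the first character of a set negates the set.
negated = re.findall(r"a[^0-9]c", "a3c agc a_c")

print(literal)   # ['abc', 'abc']
print(dot)       # ['abc', 'acc', 'a2c']
print(escaped)   # ['a*c']
print(char_set)  # ['a3c', 'agc']
print(negated)   # ['agc', 'a_c']
```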
Pre-defined character sets
Character | Description | Example pattern | Example matches |
\s | Whitespace character | a\sc | a c |
\S | Non-whitespace character | a\Sc | a1c or abc |
\d | Digit | a\dc | a2c |
\D | Non-digit character | a\Dc | adc |
\w | Letter, digit, or underscore | a\wc | a_c |
\W | Any character that is not a letter, digit, or underscore | a\Wc | a c |
\n | Line break | | |
\t | Tab | | |
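The lowercase and uppercase forms of each pre-defined set are complements of each other. The following sketch checks a handful of made-up sample strings against each set; the helper name `matching` is hypothetical and exists only for this illustration.

```python
import re

samples = ["a c", "a1c", "a_c", "a-c", "a\tc"]

def matching(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]

whitespace = matching(r"a\sc")      # space and tab both count as whitespace
non_whitespace = matching(r"a\Sc")
digit = matching(r"a\dc")
non_digit = matching(r"a\Dc")
word = matching(r"a\wc")            # letters, digits, underscore
non_word = matching(r"a\Wc")

print(whitespace)      # ['a c', 'a\tc']
print(non_whitespace)  # ['a1c', 'a_c', 'a-c']
print(digit)           # ['a1c']
print(non_digit)       # ['a c', 'a_c', 'a-c', 'a\tc']
print(word)            # ['a1c', 'a_c']
print(non_word)        # ['a c', 'a-c', 'a\tc']
```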
Quantifiers
Character | Description | Example pattern | Example matches |
* | Match the preceding element 0 or more times (after ".", this means any run of characters) | a.* | a/ab/abc... |
? | Match the preceding element 0 or 1 times | a.? | a/ab |
+ | Match the preceding element 1 or more times | a.+ | ab/abc... |
{m} | Match the preceding element exactly m times | a{3} | aaa |
{m,}/{,n} | Match the preceding element at least m times / at most n times | a{2,} | aa/aaa/aaaa... |
{m,n} | Match the preceding element m to n times | a{3,5} | aaa/aaaa/aaaaa |
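A quantifier applies to the element immediately before it, not to the whole pattern. The sketch below makes this visible by testing which of a few made-up candidate strings each pattern matches in full; the helper name `full` is hypothetical.

```python
import re

candidates = ["a", "ab", "abb", "abbb", "abbbb"]

def full(pattern):
    """Return the candidates that the pattern matches from start to end."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

star = full(r"ab*")         # "b" zero or more times
optional = full(r"ab?")     # "b" zero or one time
plus = full(r"ab+")         # "b" one or more times
exact = full(r"ab{3}")      # "b" exactly three times
at_least = full(r"ab{2,}")  # "b" at least twice
at_most = full(r"ab{,2}")   # "b" at most twice
between = full(r"ab{2,3}")  # "b" two to three times

print(star)      # ['a', 'ab', 'abb', 'abbb', 'abbbb']
print(optional)  # ['a', 'ab']
print(plus)      # ['ab', 'abb', 'abbb', 'abbbb']
print(exact)     # ['abbb']
print(at_least)  # ['abb', 'abbb', 'abbbb']
print(at_most)   # ['a', 'ab', 'abb']
print(between)   # ['abb', 'abbb']
```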
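Boundary matching
Character | Description | Example pattern | Example matches |
^ | Matches at the start of the string (and at the start of each line in multi-line mode) | | |
$ | Matches at the end of the string (and at the end of each line in multi-line mode) | | |
\A | Matches only at the start of the string | | |
\Z | Matches only at the end of the string | | |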
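The difference between ^/$ and \A/\Z only shows up in multi-line mode, which the re module enables with the re.MULTILINE flag. The following sketch uses a made-up two-line string to illustrate it.

```python
import re

text = "hello\nworld"

# Without flags, ^ and $ anchor to the whole string.
start_default = re.findall(r"^\w+", text)
end_default = re.findall(r"\w+$", text)

# With re.MULTILINE, ^ and $ also anchor at every line boundary.
start_multi = re.findall(r"^\w+", text, re.MULTILINE)
end_multi = re.findall(r"\w+$", text, re.MULTILINE)

# \A and \Z always anchor to the whole string, even with re.MULTILINE.
start_only = re.findall(r"\A\w+", text, re.MULTILINE)
end_only = re.findall(r"\w+\Z", text, re.MULTILINE)

print(start_default, end_default)  # ['hello'] ['world']
print(start_multi, end_multi)      # ['hello', 'world'] ['hello', 'world']
print(start_only, end_only)        # ['hello'] ['world']
```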
Commonly used functions in the re module
(1) findall: find all matches and return them as a list
>>> import re
>>> re.findall('o', 'hello,world')
['o', 'o']
>>> re.findall('l', 'hello,world')
['l', 'l', 'l']
(2) search: find the first match; use group() on the returned object to get the matched text
>>> re.search('l', 'hello,world')
<_sre.SRE_Match object; span=(2, 3), match='l'>
>>> ret = re.search('l', 'hello,world')
>>> ret.group()  # use group() to get the matched text
'l'
(3) match: match only at the start of the string; use group() on the returned object to get the matched text
>>> ret1 = re.match('l', 'hello,world')  # must match at the first character
>>> ret1.group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
>>> ret1 = re.match('h', 'hello,world')
>>> ret1.group()
'h'
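Because match and search return None when nothing matches, calling group() directly can raise the AttributeError shown above. One common way to guard against that is sketched below; the helper name `text_at_start` is hypothetical and not part of the re module.

```python
import re

def text_at_start(pattern, text):
    """Return the text matched at the start of the string, or None."""
    m = re.match(pattern, text)
    # A match object is truthy; None means there was no match.
    return m.group() if m else None

matched = text_at_start('h', 'hello,world')
missing = text_at_start('l', 'hello,world')

# search scans the whole string, so it finds the first 'l' at index 2.
found = re.search('l', 'hello,world')
found_span = found.span() if found else None

print(matched)     # h
print(missing)     # None
print(found_span)  # (2, 3)
```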
(4) split: split a string wherever the regular expression matches
>>> ret2 = re.split('[ac]', 'abcacd')
>>> ret2
['', 'b', '', '', 'd']
# Splitting 'abcacd' at 'a' gives '' and 'bcacd'.
# Splitting 'bcacd' at 'c' gives 'b' and 'acd'.
# Splitting 'acd' at 'a' gives an empty '' and 'cd'.
# Splitting 'cd' at 'c' gives an empty '' and 'd'; the remaining 'd' cannot be split.
Splitting a string with a regular expression is more flexible than splitting on a fixed character.
>>> re.split(r'\s', 'a b  c')
['a', 'b', '', 'c']
>>> re.split(r'\s+', 'a b  c')  # the "+" is needed to treat a run of spaces as one separator
['a', 'b', 'c']
>>> re.split(r'[\s,]', 'a, b,  c')
['a', '', 'b', '', '', 'c']
>>> re.split(r'[\s,]+', 'a, b,  c')  # the plus sign matches a run of commas and spaces
['a', 'b', 'c']
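(5) sub/subn: replace the text that the regular expression matches
>>> re.sub(r'\d', 'NUM', 'my old is 18')  # each digit is replaced separately
'my old is NUMNUM'
>>> re.subn(r'\d', 'NUM', 'my old is 18')  # subn also returns the number of replacements
('my old is NUMNUM', 2)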
(6) compile: compile a regular expression in advance into a pattern object
>>> obj = re.compile('123')
>>> ret4 = re.search(obj, 'abc123456cda')
>>> ret4.group()
'123'
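A compiled pattern object also exposes the module-level functions as methods, which is convenient when the same expression is reused several times. This is a minimal sketch; the pattern and sample strings are made up for illustration.

```python
import re

# Compile once, then reuse the pattern object.
digits = re.compile(r'\d+')

first = digits.search('abc123456cda').group()
all_runs = digits.findall('a1b22c333')
replaced = digits.sub('#', 'a1b22c333')
parts = digits.split('a1b22c333')

print(first)     # 123456
print(all_runs)  # ['1', '22', '333']
print(replaced)  # a#b#c#
print(parts)     # ['a', 'b', 'c', '']
```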
(7) finditer: return an iterator over the matches
>>> ret5 = re.finditer('l', 'hello,world')
>>> '__next__' in dir(ret5)
True
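Each item produced by the iterator is a match object, so the position of every match is available as well as its text. A short sketch using the same string as above:

```python
import re

# Collect the matched text together with its start and end index.
matches = [(m.group(), m.start(), m.end())
           for m in re.finditer('l', 'hello,world')]

print(matches)  # [('l', 2, 3), ('l', 3, 4), ('l', 9, 10)]
```

Unlike findall, which builds the whole list at once, finditer yields matches one at a time, which can help with large inputs.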