Python re module in detail

Last Update:2017-01-14 Source: Internet

Author: User




Common matching symbols
'.' By default matches any character except \n; if the DOTALL flag is specified, it matches any character, including a line break
'^' Matches the beginning of the string; with the MULTILINE flag it also matches at the start of each line, e.g. re.search(r"^a", "\nabc\neee", flags=re.MULTILINE)
'$' Matches the end of the string; with the MULTILINE flag it also matches at the end of each line, e.g. re.search("foo$", "bfoo\nsdfsf", flags=re.MULTILINE).group() returns 'foo'
'*' Matches the preceding character 0 or more times; re.findall("ab*", "cabb3abcbbac") returns ['abb', 'ab', 'a']
'+' Matches the preceding character 1 or more times; re.findall("ab+", "ab+cd+abb+bba") returns ['ab', 'abb']
'?' Matches the preceding character 0 or 1 times
'{m}' Matches the preceding character exactly m times
'{n,m}' Matches the preceding character n to m times; re.findall("ab{1,3}", "abb abc abbcbbb") returns ['abb', 'ab', 'abb']
'|' Matches the pattern to the left or to the right of |; re.search("abc|ABC", "ABCBabcCD").group() returns 'ABC'
'(...)' Group match; re.search("(abc){2}a(123|456)c", "abcabca456c").group() returns 'abcabca456c'

'\A' Matches only at the start of the string, like ^ without MULTILINE; re.search(r"\Aabc", "alexabc") finds no match
'\Z' Matches only at the very end of the string; unlike $, it does not match before a trailing newline
'\d' Matches a digit 0-9
'\D' Matches a non-digit
'\w' Matches a word character, [A-Za-z0-9_] (for str patterns, letters and digits from other scripts match as well)
'\W' Matches a non-word character, [^A-Za-z0-9_]
'\s' Matches a whitespace character such as a space, \t, \n, or \r; re.search(r"\s+", "ab\tc1\n3").group() returns '\t'
'(?P<name>...)' Named group match; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})", "371481199306143242").groupdict() returns {'province': '3714', 'city': '81', 'birthday': '1993'}
Note: ?P<name> is fixed syntax
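
The quantifier entries in the list above can be compared side by side. The following is a minimal sketch; the sample string is made up for illustration and does not come from the original article.

```python
import re

# Hypothetical sample text: 'a' followed by zero to four 'b' characters.
text = "a ab abb abbb abbbb"

results = {
    "ab*": re.findall(r"ab*", text),          # 'a' then zero or more 'b'
    "ab+": re.findall(r"ab+", text),          # 'a' then one or more 'b'
    "ab?": re.findall(r"ab?", text),          # 'a' then at most one 'b'
    "ab{2}": re.findall(r"ab{2}", text),      # 'a' then exactly two 'b'
    "ab{1,3}": re.findall(r"ab{1,3}", text),  # 'a' then one to three 'b'
}

for pattern, found in results.items():
    print(pattern, found)
```

Each quantifier applies only to the single character (or group) directly before it, which is why every result still starts with one 'a'.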



The re module offers several matching functions:
The match method matches only from the beginning of the string (used less often)
Example:

res = re.match('^chen', 'chenronghua123')  # syntax: pattern, string
print(res)
# Output: <re.Match object; span=(0, 4), match='chen'>
# (Python 3.6 and earlier print <_sre.SRE_Match object; span=(0, 4), match='chen'>)

# res = re.match('r.+', 'chen123ronghua123')   # returns None: match starts at the beginning of the string
# res = re.search('r.+', 'chen123ronghua123')  # search scans the entire text
# print(res.group())
# Result: ronghua123
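
The difference between match and search can be sketched as follows. fullmatch is not covered in the original text; it is included here only as an additional point of comparison, and the None check reflects the fact that a failed match returns None rather than raising an error.

```python
import re

text = "chen123ronghua123"

m_match = re.match(r"r.+", text)           # None: the text does not start with 'r'
m_search = re.search(r"r.+", text)         # scans forward and finds the first 'r'
m_full = re.fullmatch(r"[a-z0-9]+", text)  # succeeds only if the whole string matches

# Calling .group() on None raises AttributeError, so test the result first.
if m_match is None:
    print("match: no result")
print("search:", m_search.group(), m_search.span())
print("fullmatch:", m_full.group())
```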


The following four are the most commonly used:
1. search scans the entire text and returns the first match
2. findall scans the entire text with greedy matching and returns every match as a list; the result has no group method
3. split splits a string on the pattern
4. sub replaces matches of the pattern
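
The return types of search and findall differ, as this short sketch shows. finditer is not mentioned in the original text; it is added here as an assumption about what a reader may want when match positions are needed for every hit.

```python
import re

text = "abc12de3f45gh"

first = re.search(r"[0-9]+", text)      # Match object for the first hit only
all_hits = re.findall(r"[0-9]+", text)  # plain list of strings, no .group()

# With one capturing group, findall returns the group's text, not the whole match.
letters_before_digits = re.findall(r"([a-z]+)[0-9]", text)

# finditer yields Match objects, so spans are available for every hit.
spans = [m.span() for m in re.finditer(r"[0-9]+", text)]

print(first.group(), all_hits, letters_before_digits, spans)
```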

A few matching flags are worth knowing:
1. re.I (re.IGNORECASE): ignore case (the full name is in parentheses, same below)
2. re.M (re.MULTILINE): multiline mode, changes the behavior of '^' and '$' [rarely used]
3. re.S (re.DOTALL): dot-matches-all mode, changes the behavior of '.'




split method:
res = re.split('[0-9]+', 'abc12de3f45gh')
print(res)
Output: ['abc', 'de', 'f', 'gh']

sub method:
res = re.sub('[0-9]+', '|', 'abc12de3f45gh', count=2)
print(res)
Output: abc|de|f45gh
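
A few variations on split and sub are sketched below using the same sample string. maxsplit, capturing separators, subn, and the \1 backreference are standard re features that the original text does not cover, so they are offered as optional extras.

```python
import re

text = "abc12de3f45gh"

# maxsplit limits the number of splits, like count does for sub.
parts_limited = re.split(r"[0-9]+", text, maxsplit=1)

# A capturing group in the pattern keeps the separators in the result.
parts_keep = re.split(r"([0-9]+)", text)

# subn also reports how many replacements were made.
replaced, n = re.subn(r"[0-9]+", "|", text)

# \1 in the replacement refers to the text captured by group 1.
wrapped = re.sub(r"([0-9]+)", r"<\1>", text)

print(parts_limited, parts_keep, replaced, n, wrapped)
```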


1. re.I (re.IGNORECASE): ignore case
res = re.search('[a-z]+', 'abcGH', flags=re.I)
print(res.group())
Output: abcGH

2. re.M (re.MULTILINE): multiline mode, changes the behavior of '^' and '$'
res = re.search(r"^a", "\nabc\neee", flags=re.M)
print(res.group())
Output: a

3. re.S (re.DOTALL): dot-matches-all mode, changes the behavior of '.'
res = re.search(".+", "\nabc\neee", flags=re.S)
print(repr(res.group()))  # repr shows the newlines explicitly
Output: '\nabc\neee'
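
Flags can be combined. The sketch below assumes a made-up three-line sample string; the inline (?im) form is standard re syntax but is not mentioned in the original text.

```python
import re

# Hypothetical sample text with mixed case across three lines.
text = "First line\nsecond LINE\nthird line"

# Flags are combined with the bitwise OR operator.
line_ends = re.findall(r"line$", text, flags=re.I | re.M)

# Without re.S, '.' stops at the newline; with it, '.' crosses lines.
one_line = re.search(r"first.+", text, flags=re.I).group()
all_lines = re.search(r"first.+", text, flags=re.I | re.S).group()

# The same flags can be written inline at the start of the pattern.
first_words = re.findall(r"(?im)^\w+", text)

print(line_ends, repr(one_line), repr(all_lines), first_words)
```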


Examples:
'.' By default matches any character except \n; if the DOTALL flag is specified, it matches any character, including a line break

res = re.match('.+', 'chen123ronghua123')
print(res.group())
Output:
chen123ronghua123




'$' Matches the end of the string; with MULTILINE it also matches at the end of each line, so re.search("foo$", "bfoo\nsdfsf", flags=re.MULTILINE).group() returns 'foo'

res = re.match('r.+', 'chen123ronghua123')   # returns None: match starts at the beginning of the string
res = re.search('r.+', 'chen123ronghua123')  # search scans the entire text
print(res.group())
Result: ronghua123



'+' Matches the preceding character 1 or more times; re.findall("ab+", "ab+cd+abb+bba") returns ['ab', 'abb']
res = re.search('r[a-z]+a', 'chen123ronghua123')  # matches ronghua
print(res.group())
Result: ronghua

res = re.search('#.+#', '1123#hello#')
print(res.group())
Result: #hello#



'?' Matches the preceding character 0 or 1 times
res0 = re.search('aal?', 'aalex')
res1 = re.search('aal?', 'aaex')
print(res0.group())
print(res1.group())
Output:
aal
aa



'{m}' Matches the preceding character exactly m times
res = re.search('[0-9]{3}', 'aa1xe2pp345lex')  # match three consecutive digits
print(res.group())
Output: 345

'{n,m}' Matches the preceding character n to m times
res = re.search('[0-9]{1,3}', 'aa1xe2pp345lex')  # match one to three consecutive digits
print(res.group())
Output: 1

findall greedy match
res = re.findall('[0-9]{1,3}', 'aa1xe2pp345lex')  # findall matches greedily, one to three digits at a time
print(res)
Output: ['1', '2', '345']  # returned as a list
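
Greedy matching takes as much text as the pattern allows. Appending ? to a quantifier makes it non-greedy, which the original text does not cover; the sketch below contrasts the two on a made-up sample string.

```python
import re

# Hypothetical sample text with two '#...#' sections.
text = "#hello# and #world#"

greedy = re.findall(r"#.+#", text)  # runs from the first '#' to the last '#'
lazy = re.findall(r"#.+?#", text)   # stops at the nearest closing '#'

digits_greedy = re.search(r"[0-9]{1,3}", "pp345lex").group()  # takes up to three digits
digits_lazy = re.search(r"[0-9]{1,3}?", "pp345lex").group()   # takes as few as allowed

print(greedy, lazy, digits_greedy, digits_lazy)
```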


'|' Matches the pattern to the left or to the right of |; re.search("abc|ABC", "ABCBabcCD").group() returns 'ABC'

res = re.search('abc|ABC', 'ABCBabcCD')
print(res.group())
Output: ABC

res = re.findall('abc|ABC', 'ABCBabcCD')
print(res)
Output: ['ABC', 'abc']




'(...)' Group match; re.search("(abc){2}a(123|456)c", "abcabca456c").group() returns 'abcabca456c'

res = re.search('(abc){2}', 'alexabcabc')
print(res.group())
Output: abcabc

res = re.search(r'(abc){2}(\|\|=){2}', 'alexabcabc||=||=')  # match ||= twice; note that | must be escaped
print(res.group())
Output: abcabc||=||=
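
Besides group(), a Match object exposes the text captured by each pair of parentheses. The sketch below reuses the pattern from the group example; the non-capturing (?:...) form is standard re syntax that the original text does not cover.

```python
import re

m = re.search(r"(abc){2}a(123|456)c", "abcabca456c")

whole = m.group()     # the entire match
first = m.group(1)    # a repeated group keeps only its last repetition
second = m.group(2)   # whichever alternative matched
captured = m.groups() # all captured groups as a tuple

# (?:...) groups for repetition without capturing.
only_choice = re.search(r"(?:abc){2}a(123|456)c", "abcabca456c").groups()

print(whole, first, second, captured, only_choice)
```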






'\D' Matches a non-digit
res = re.search(r'\D+', '123$-a')
print(res.group())
Output: $-a



'\w' Matches a word character [A-Za-z0-9_], that is, anything except special characters

res = re.search(r'\w+', '123$-a')
print(res.group())
Output: 123

'\W' Matches a non-word character, that is, only special characters
res = re.search(r'\W+', '123$-...a')
print(res.group())
Output: $-...



'\s' Matches a whitespace character such as a space, \t, \n, or \r; re.search(r"\s+", "ab\tc1\n3").group() returns '\t'
res = re.findall(r'\s', '123$- \r\n\t...a')
print(res)
Output: [' ', '\r', '\n', '\t']

>>> re.search(r'\s+', '123$- \t\r\n')
<re.Match object; span=(5, 9), match=' \t\r\n'>
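
Because these classes are written with a backslash, it is safer to write patterns as raw strings (r'...'). The following sketch illustrates why, using \b, which the original text does not cover; re.escape is shown as one way to avoid escaping special characters by hand.

```python
import re

# In a normal string literal "\b" is a single backspace character.
# In a raw string it stays as backslash + 'b', which re reads as a word boundary.
plain = "\bcat\b"
raw = r"\bcat\b"

hits_plain = re.findall(plain, "cat concat cat.")  # looks for literal backspaces
hits_raw = re.findall(raw, "cat concat cat.")      # looks for the whole word 'cat'

# re.escape builds a pattern that matches the given text literally.
escaped_hit = re.search(re.escape("||="), "abc||=").group()

print(len(plain), len(raw), hits_plain, hits_raw, escaped_hit)
```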




'\A' Matches only at the start of the string, like ^ without MULTILINE; re.search(r"\Aabc", "alexabc") finds no match
'\Z' Matches only at the very end of the string; unlike $, it does not match before a trailing newline
'\d' Matches a digit 0-9

Example:
res = re.search(r'\A[0-9]+[a-z]\Z', '123a')
print(res.group())
Output: 123a
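
The anchors \A and \Z differ from ^ and $ once MULTILINE or a trailing newline is involved. A minimal sketch, using a made-up two-line string:

```python
import re

# Hypothetical sample text: two identical lines, ending with a newline.
text = "abc\nabc\n"

caret_default = re.findall(r"^abc", text)            # start of string only
caret_multi = re.findall(r"^abc", text, flags=re.M)  # start of every line
start_only = re.findall(r"\Aabc", text, flags=re.M)  # \A ignores MULTILINE

dollar = re.search(r"abc$", text)     # '$' also matches just before a final newline
end_only = re.search(r"abc\Z", text)  # '\Z' requires the very end of the string

print(caret_default, caret_multi, start_only, dollar.span(), end_only)
```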


*: 0 or more times
+: 1 or more times

res = re.match(r'^chen\d+', 'chen123ronghua123')
print(res)
print(res.group())  # view the matched text

Output: <re.Match object; span=(0, 7), match='chen123'>
chen123




'(?P<name>...)' Named group matching
res = re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})", "371481199306143242").groupdict()
print(res)
Result: {'province': '3714', 'city': '81', 'birthday': '1993'}
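
Named groups can also be read one at a time. The sketch below reuses the pattern above; the second pattern, with an optional group, is a made-up example showing that the argument to groupdict is only a default value for groups that did not take part in the match.

```python
import re

pattern = re.compile(
    r"(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})"
)
m = pattern.search("371481199306143242")

city_by_name = m.group("city")  # look up a group by its name
city_by_index = m.group(2)      # named groups are still numbered
fields = m.groupdict()          # all named groups as a dict

# Hypothetical pattern with an optional group that does not match here.
opt = re.search(r"(?P<word>[a-z]+)(?P<num>[0-9]+)?", "abc")
with_default = opt.groupdict("n/a")  # unmatched groups get the default value

print(city_by_name, city_by_index, fields, with_default)
```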
