Common matching syntax
'.' matches any character except \n by default; with the re.DOTALL flag it matches any character, including a line break
'^' matches the start of the string; with the re.MULTILINE flag it also matches at the start of each line, e.g. re.search(r"^a", "\nabc\neee", flags=re.MULTILINE)
'$' matches the end of the string; with re.MULTILINE it also matches at the end of each line, e.g. re.search("foo$", "bfoo\nsdfsf", flags=re.MULTILINE).group()
'*' matches the preceding character 0 or more times; re.findall("ab*", "cabb3abcbbac") returns ['abb', 'ab', 'a']
'+' matches the preceding character 1 or more times; re.findall("ab+", "ab+cd+abb+bba") returns ['ab', 'abb']
'?' matches the preceding character 0 or 1 time
'{m}' matches the preceding character exactly m times
'{n,m}' matches the preceding character n to m times; re.findall("ab{1,3}", "abb abc abbcbbb") returns ['abb', 'ab', 'abb']
'|' matches either the pattern on its left or the one on its right; re.search("abc|ABC", "ABCBabcCD").group() returns 'ABC'
'(...)' groups a sub-pattern; re.search("(abc){2}a(123|456)c", "abcabca456c").group() returns 'abcabca456c'
'\A' works like '^' but matches only at the start of the string; re.search(r"\Aabc", "alexabc") finds no match
'\Z' matches only at the end of the string, similar to '$'
'\d' matches a digit, 0-9
'\D' matches any non-digit character
'\w' matches a word character; for ASCII text this is [A-Za-z0-9_]
'\W' matches any character that is not a word character
'\s' matches whitespace characters such as space, \t, \n, \r; re.search(r"\s+", "ab\tc1\n3").group() returns '\t'
'(?P<name>...)' named group; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})", "371481199306143242").groupdict() returns {'province': '3714', 'city': '81', 'birthday': '1993'}
Note: ?P is fixed syntax that introduces the group name
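The entries above can be checked directly in an interpreter. The following is a minimal sketch, assuming Python 3 and only the standard re module; the variable names and the "a\nc" sample string are illustrative and not part of the original list.

```python
import re

# '.' stops at a newline unless re.DOTALL is set.
dot_default = re.search(r"a.c", "a\nc")                # None
dot_all = re.search(r"a.c", "a\nc", flags=re.DOTALL)   # matches 'a\nc'

# Quantifier examples taken from the list above.
star = re.findall(r"ab*", "cabb3abcbbac")              # ['abb', 'ab', 'a']
plus = re.findall(r"ab+", "ab+cd+abb+bba")             # ['ab', 'abb']
ranged = re.findall(r"ab{1,3}", "abb abc abbcbbb")     # ['abb', 'ab', 'abb']

print(dot_default, repr(dot_all.group()))
print(star, plus, ranged)
```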
The re module provides several matching methods:
match matches from the beginning of the string (rarely used)
Example:
res = re.match('^chen', 'chenronghua123')  # syntax: re.match(pattern, string)
print(res)
# Output: <_sre.SRE_Match object; span=(0, 4), match='chen'>
# res = re.match('r.+', 'chen123ronghua123')  # no match (returns None): match starts at the beginning of the string
# res = re.search('r.+', 'chen123ronghua123')  # search scans the entire text
# print(res.group())
# Result: ronghua123
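The difference between match and search is easiest to see side by side. This is a small sketch, assuming Python 3; the variable names are illustrative.

```python
import re

text = "chen123ronghua123"

# match only tries the pattern at position 0, so a pattern starting with 'r' fails.
at_start = re.match(r"r.+", text)
# search moves forward through the text until the pattern fits somewhere.
anywhere = re.search(r"r.+", text)

print(at_start)            # None
print(anywhere.group())    # ronghua123
print(anywhere.span())     # (7, 17)
```

Because '.+' is greedy, the match runs to the end of the string and includes the trailing digits.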
The following four are commonly used:
1. search scans the entire text and returns the first match
2. findall scans the entire text with greedy matching and returns every match as a list; the result has no group method
3. split splits the string at each match
4. sub replaces each match
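As a quick comparison, the four functions can be run against the same pattern and text. A minimal sketch, assuming Python 3; the variable names are illustrative.

```python
import re

text = "abc12de3f45gh"
digits = r"[0-9]+"

first = re.search(digits, text)       # match object for the first run of digits
every = re.findall(digits, text)      # plain list of strings, no group() method
pieces = re.split(digits, text)       # the text between the matches
masked = re.sub(digits, "|", text)    # every match replaced

print(first.group())   # 12
print(every)           # ['12', '3', '45']
print(pieces)          # ['abc', 'de', 'f', 'gh']
print(masked)          # abc|de|f|gh
```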
Only a few matching flags need to be known, and only briefly:
1. re.I (re.IGNORECASE): ignore case (the full name is in parentheses; the same applies below)
2. re.M (re.MULTILINE): multiline mode, changes the behavior of '^' and '$' (see above) [rarely used]
3. re.S (re.DOTALL): dot-matches-all mode, changes the behavior of '.'
The split method:
res = re.split('[0-9]+', 'abc12de3f45gh')
print(res)
Output: ['abc', 'de', 'f', 'gh']
The sub method:
res = re.sub('[0-9]+', '|', 'abc12de3f45gh', count=2)
print(res)
Output: abc|de|f45gh
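Both functions accept a limit on how many matches they act on: count for sub, as used above, and maxsplit for split. A short sketch, assuming Python 3; the variable names are illustrative.

```python
import re

text = "abc12de3f45gh"

limited_split = re.split(r"[0-9]+", text, maxsplit=1)   # split at the first match only
limited_sub = re.sub(r"[0-9]+", "|", text, count=2)     # replace the first two matches
full_sub = re.sub(r"[0-9]+", "|", text)                 # default count=0 replaces all

print(limited_split)   # ['abc', 'de3f45gh']
print(limited_sub)     # abc|de|f45gh
print(full_sub)        # abc|de|f|gh
```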
1. re.I (re.IGNORECASE): ignore case
res = re.search('[a-z]+', 'abcGH', flags=re.I)
print(res.group())
Output: abcGH
2. re.M (re.MULTILINE): multiline mode, changes the behavior of '^' and '$'
res = re.search(r"^a", "\nabc\neee", flags=re.M)
print(res.group())
Output: a
3. re.S (re.DOTALL): dot-matches-all mode, changes the behavior of '.'
res = re.search(".+", "\nabc\neee", flags=re.S)
print(res.group())
Output: the whole string, line breaks included (an empty line, then abc, then eee)
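To see what each flag changes, it helps to run the same pattern with and without it. The sketch below assumes Python 3 and uses a slightly modified sample string (upper-case EEE) so that re.I has something to do; flags can be combined with the | operator.

```python
import re

text = "\nabc\nEEE"

no_flag = re.search(r"^a", text)                  # None: '^' only matches at position 0
multiline = re.search(r"^a", text, flags=re.M)    # 'a' at the start of the second line
default_dot = re.search(r".+", text)              # 'abc': '.' stops at the newline
dotall = re.search(r".+", text, flags=re.S)       # the whole string, newlines included
combined = re.search(r"^e+$", text, flags=re.I | re.M)   # 'EEE'

print(no_flag)
print(multiline.group(), default_dot.group(), combined.group())
print(repr(dotall.group()))
```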
Examples:
'.' matches any character except \n by default; with the re.DOTALL flag it matches any character, including a line break
res = re.match('.+', 'chen123ronghua123')
print(res.group())
Output:
chen123ronghua123
'$' matches the end of the string; with re.MULTILINE it also matches at the end of each line, e.g. re.search("foo$", "bfoo\nsdfsf", flags=re.MULTILINE).group()
res = re.match('r.+', 'chen123ronghua123')  # returns None: match starts at the beginning of the string
res = re.search('r.+', 'chen123ronghua123')  # search scans the entire text
print(res.group())
Result: ronghua123
'+' matches the preceding character 1 or more times; re.findall("ab+", "ab+cd+abb+bba") returns ['ab', 'abb']
res = re.search('r[a-z]+a', 'chen123ronghua123')  # matches ronghua
print(res.group())
Result: ronghua
res = re.search('#.+#', '1123#hello#')
print(res.group())
Result: #hello#
'?' matches the preceding character 0 or 1 time
res0 = re.search('aal?', 'aalex')
res1 = re.search('aal?', 'aaex')
print(res0.group())
print(res1.group())
Output:
aal
aa
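Since search returns None when nothing matches, calling .group() directly can raise an AttributeError. A small sketch of the same 'aal?' pattern with that check added, assuming Python 3; the "alex" input and the variable names are illustrative additions.

```python
import re

pattern = re.compile(r"aal?")   # the trailing 'l' is optional

# Both inputs match; the second simply lacks the optional 'l'.
results = [pattern.search(s).group() for s in ("aalex", "aaex")]
print(results)                  # ['aal', 'aa']

# "alex" has no 'aa' at all, so search returns None.
missing = pattern.search("alex")
print(missing.group() if missing else "no match")   # no match
```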
'{m}' matches the preceding character exactly m times
res = re.search('[0-9]{3}', 'aa1xe2pp345lex')  # match three digits in a row
print(res.group())
Output: 345
'{n,m}' matches the preceding character n to m times
res = re.search('[0-9]{1,3}', 'aa1xe2pp345lex')  # match 1 to 3 digits in a row
print(res.group())
Output: 1
findall with greedy matching
res = re.findall('[0-9]{1,3}', 'aa1xe2pp345lex')  # findall is greedy: each match takes 1 to 3 digits
print(res)
Output: ['1', '2', '345']  # returned as a list
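Greedy matching means each match takes as many characters as the quantifier allows. The sketch below, assuming Python 3, repeats the examples above and adds one illustrative input ("1234567") to show how a long run of digits is cut into chunks of at most three.

```python
import re

text = "aa1xe2pp345lex"

exactly_three = re.search(r"[0-9]{3}", text).group()   # skips '1' and '2', finds '345'
first_run = re.search(r"[0-9]{1,3}", text).group()     # the first digit run is just '1'
all_runs = re.findall(r"[0-9]{1,3}", text)

# On a longer run, each match greedily takes three digits before the next begins.
chunks = re.findall(r"[0-9]{1,3}", "1234567")

print(exactly_three, first_run)   # 345 1
print(all_runs)                   # ['1', '2', '345']
print(chunks)                     # ['123', '456', '7']
```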
'|' matches either the pattern on its left or the one on its right; re.search("abc|ABC", "ABCBabcCD").group() returns 'ABC'
res = re.search('abc|ABC', 'ABCBabcCD')
print(res.group())
Output: ABC
res = re.findall('abc|ABC', 'ABCBabcCD')
print(res)
Output: ['ABC', 'abc']
'(...)' groups a sub-pattern; re.search("(abc){2}a(123|456)c", "abcabca456c").group() returns 'abcabca456c'
res = re.search('(abc){2}', 'alexabcabc')
print(res.group())
Output: abcabc
res = re.search(r'(abc){2}(\|\|=){2}', 'alexabcabc||=||=')  # match ||= twice; note that | must be escaped
print(res.group())
Output: abcabc||=||=
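Besides the full match, each parenthesized group can be read back on its own. A minimal sketch, assuming Python 3; the variable name m is illustrative.

```python
import re

m = re.search(r"(abc){2}a(123|456)c", "abcabca456c")

print(m.group())     # abcabca456c   (the whole match, same as group(0))
print(m.group(1))    # abc           (a repeated group keeps its last repetition)
print(m.group(2))    # 456
print(m.groups())    # ('abc', '456')
```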
'\D' matches any non-digit character
res = re.search(r'\D+', '123$-a')
print(res.group())
Output: $-a
'\w' matches [A-Za-z0-9_], that is, everything except special characters
res = re.search(r'\w+', '123$-a')
print(res.group())
Output: 123
'\W' matches anything outside [A-Za-z0-9_], that is, only special characters
res = re.search(r'\W+', '123$-...a')
print(res.group())
Output: $-...
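The four character classes can be compared on a single string. The sketch below assumes Python 3; the sample text adds an underscore (an illustrative change) to show that the underscore counts as a word character.

```python
import re

text = "123$-...a_b"

digits = re.findall(r"\d+", text)       # runs of digits
non_digits = re.findall(r"\D+", text)   # everything that is not a digit
words = re.findall(r"\w+", text)        # letters, digits, and underscore
non_words = re.findall(r"\W+", text)    # the remaining special characters

print(digits)       # ['123']
print(non_digits)   # ['$-...a_b']
print(words)        # ['123', 'a_b']
print(non_words)    # ['$-...']
```

Raw strings (the r prefix) keep Python from interpreting the backslash before the regex engine sees it.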
'\s' matches whitespace characters such as space, \t, \n, \r; re.search(r"\s+", "ab\tc1\n3").group() returns '\t'
res = re.findall(r'\s', '123$- \r\n\t...a')
print(res)
Output: [' ', '\r', '\n', '\t']
>>> re.search(r'\s+', '123$- \t\r\n')
<_sre.SRE_Match object; span=(5, 9), match=' \t\r\n'>
'\A' works like '^' but matches only at the start of the string; re.search(r"\Aabc", "alexabc") finds no match
'\Z' matches only at the end of the string, similar to '$'
'\d' matches a digit, 0-9
Example:
res = re.search(r'\A[0-9]+[a-z]\Z', '123a')
print(res.group())
Output: 123a
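The anchors differ from '^' and '$' once re.MULTILINE is involved: '\A' and '\Z' keep referring to the whole string. A short sketch, assuming Python 3; the two-line sample text and the variable names are illustrative.

```python
import re

text = "abc\n123a"

# With re.M, '^' and '$' also match at line boundaries, so the second line matches.
per_line = re.search(r"^[0-9]+[a-z]$", text, flags=re.M)
# '\A' and '\Z' ignore re.M; the string starts with 'abc', so this fails.
whole_string = re.search(r"\A[0-9]+[a-z]\Z", text, flags=re.M)

print(per_line.group())   # 123a
print(whole_string)       # None
```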
*: zero or more times
+: one or more times
res = re.match(r'^chen\d+', 'chen123ronghua123')
print(res)
print(res.group())  # show the matched text
Output: <_sre.SRE_Match object; span=(0, 7), match='chen123'>
chen123
'(?P<name>...)' named group matching
res = re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})", "371481199306143242").groupdict()
print(res)
Result: {'province': '3714', 'city': '81', 'birthday': '1993'}
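Named groups can be read individually as well as through groupdict. The sketch below assumes Python 3. The first pattern is the one from the example; the second pattern with an optional group is a made-up illustration of what the argument to groupdict does: it is the default value for groups that did not take part in the match, not a key to look up.

```python
import re

id_pattern = r"(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})"
m = re.search(id_pattern, "371481199306143242")

fields = m.groupdict()
city = m.group("city")          # a single group by name

# Hypothetical pattern: group 'b' is optional and absent from the input.
partial = re.search(r"(?P<a>x)(?P<b>y)?", "x").groupdict("n/a")

print(fields)    # {'province': '3714', 'city': '81', 'birthday': '1993'}
print(city)      # 81
print(partial)   # {'a': 'x', 'b': 'n/a'}
```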