Regular expressions can match data and extract it.
Matching is greedy by default: a quantifier matches as many characters as it can.
>>> import re
>>> s = "I am 19 years old"
>>> re.search(r"\d+", s)
<_sre.SRE_Match object at 0x0000000002f63ed0>
>>> re.search(r"\d+", s).group()
'19'
>>>
>>> s = "I am 19 years 20 old 30"
>>> re.findall(r"\d", s)
['1', '9', '2', '0', '3', '0']
>>> re.findall(r"\d+", s)
['19', '20', '30']
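As a small illustration of the difference seen above, the following sketch pulls every number out of a sentence. The helper name `extract_numbers` and the sample sentence are made up for this example.

```python
import re

def extract_numbers(text):
    """Return every run of consecutive digits in text as an int."""
    # \d+ is greedy, so "19" comes back whole rather than as "1" and "9".
    return [int(run) for run in re.findall(r"\d+", text)]

sample = "I am 19 years old, my brother is 20 and my sister is 30"
numbers = extract_numbers(sample)
print(numbers)
```

With r"\d" instead of r"\d+" the same call would return the six single digits, which is rarely what is wanted when extracting numbers.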
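Regular expressions can match data, extract data, and determine whether a string contains a given pattern.
In re.match(r"\d+", "123abc") it is best to write the pattern with the r prefix (a raw string), so that Python does not interpret the backslashes as escape characters.
r"\d" matches a digit.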
>>> re.match(r"\d+", "123abc")
<_sre.SRE_Match object at 0x0000000002f63ed0>
>>> re.match("\d+", "123abc")
<_sre.SRE_Match object at 0x000000000306b030>
>>> re.match("\d+", "a123abc")
>>> re.match("\d+", "123abc").group()
'123'
>>> re.match("\d+", "123abc 1sd").group()
'123'
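The session above works even without the r prefix because Python leaves unrecognized escapes such as \d untouched (recent Python 3 versions warn about them). The sketch below shows a case where the prefix matters, using \b, which Python itself interprets as a backspace character. The sample text is an assumption chosen for illustration.

```python
import re

# In a normal string "\b" is the single backspace character (ASCII 8).
plain = "\bcat\b"
# In a raw string the backslash survives, so re sees the word-boundary token \b.
raw = r"\bcat\b"

print(len(plain), len(raw))

text = "a cat sat"
plain_hit = re.search(plain, text)   # looks for backspace + "cat" + backspace
raw_hit = re.search(raw, text)       # looks for the whole word "cat"
print(plain_hit, raw_hit.group())
```

Writing every pattern as a raw string avoids having to remember which escapes Python would otherwise consume.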
re.search(r"\d+", s) returns the first match found at any position in the string.
re.findall(r"\d+", s) returns every match, at any position.
re.match(r"\d+", s) only matches starting from the first position of the string.
>>> re.match("\d+", "123abc 1sd").group()
'123'
>>> re.search("\d+", "123abc 1sd").group()
'123'
>>> re.search("\d+", "a123abc 1sd").group()
'123'
>>> re.match("\d+", "a123abc 1sd").group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
>>> re.match("\d+", "a123abc 1sd")
>>> re.findall(r"\d+", "a1b2c3")
['1', '2', '3']
>>> re.match(r"\d+", "a1b2c3")
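The AttributeError above comes from calling group() on None, which is what match and search return when nothing matches. A minimal sketch of one way to guard against that follows; the function name `first_number` and its `anchored` parameter are hypothetical, introduced only for this example.

```python
import re

def first_number(text, anchored=False):
    """Return the first run of digits, or None when nothing matches.

    anchored=True uses re.match (start of string only);
    anchored=False uses re.search (any position).
    """
    finder = re.match if anchored else re.search
    hit = finder(r"\d+", text)
    # Both functions return None on failure, so test before calling group().
    return hit.group() if hit else None

print(first_number("a123abc 1sd"))
print(first_number("a123abc 1sd", anchored=True))
print(re.findall(r"\d+", "a123abc 1sd"))
```

findall needs no such check: when nothing matches it simply returns an empty list.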
r"\D" matches a non-digit character. re.match(r"\D+", "a1b2c3") matches the leading run of non-digits.
>>> re.match(r"\D+", "a1b2c3").group()
'a'
>>> re.match(r"\D+", "abc1b2c3").group()
'abc'
re.match(r"\D+\d+", "ab12 bb").group() is applied to "ab12 bb" and returns "ab12": a run of non-digits followed by a run of digits.
>>> re.match(r"\D+\d+", "ab12 bb").group()
'ab12'
r"\s" matches a whitespace character. re.match(r"\s+", s) matches leading whitespace only, because match starts from the first character of the string.
>>> re.match(r"\s+", "ab12 bb").group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
>>> re.match(r"\s+", " bb").group()
' '
r"\S" matches a non-whitespace character. re.match(r"\S+", "eee bb") matches non-whitespace characters until whitespace appears.
>>> re.match(r"\S+", "eee bb").group()
'eee'
>>> print('a')  # print uses str()
a
>>> 'a'  # the interactive prompt uses repr()
'a'
r"\w+" matches a run of word characters (letters, digits, and underscore). re.findall(r"\w+", s) returns every such run.
>>> re.findall(r"\w+", "sdf")
['sdf']
>>> re.findall(r"\w+", "a sdf")
['a', 'sdf']
>>> re.findall(r"\w+", "a sdf we")
['a', 'sdf', 'we']
r"\W+" matches a run of non-word characters, i.e. anything that is not a letter, digit, or underscore. re.match(r"\W+", s).group() returns the leading run of such characters.
>>> re.match(r"\W+", " sf fd").group()
' '
>>> re.match(r"\W+", " sf $% @# fd").group()
' '
>>> re.match(r"\W+", "$#% sf $% @# fd").group()
'$#% '
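The lowercase and uppercase classes seen so far come in complementary pairs: any single character matches exactly one class of each pair. The sketch below checks this for a few sample characters; the helper `classify` and the sample list are assumptions made for the example.

```python
import re

PAIRS = [(r"\d", r"\D"), (r"\s", r"\S"), (r"\w", r"\W")]

def classify(ch):
    """Return, for one character, which class of each pair matches it."""
    result = []
    for lower, upper in PAIRS:
        # The two classes in a pair are complements: exactly one matches.
        result.append(lower if re.match(lower, ch) else upper)
    return result

for ch in ["7", "x", " ", "$", "_"]:
    print(repr(ch), classify(ch))
```

Note that the underscore counts as a word character, so it falls under \w rather than \W.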
Quantifiers
r"\w\w" takes two word characters.
>>> re.match(r"\w\w", "ww we").group()
'ww'
>>> re.match(r"\w\w", "12 we").group()
'12'
r"\w{2}" also takes exactly two.
>>> re.match(r"\w{2}", "12 we").group()
'12'
>>> re.match(r"\w{2}", "123 we").group()
'12'
>>> re.match(r"\w{2}", "1").group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
>>> re.match(r"\w{2,4}", "1").group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
r"\w{2,4}" takes 2 to 4 characters and, being greedy, takes as many as it can.
>>> re.match(r"\w{2,4}", "123 we").group()
'123'
>>> re.match(r"\w{2,4}", "12 we").group()
'12'
>>> re.match(r"\w{2,4}", "1235 we").group()
'1235'
>>> re.match(r"\w{2,4}", "12435 we").group()
'1243'
>>> re.match(r"\w{5}", "12435 we").group()
'12435'
>>> re.match(r"\w{5}", "srsdf we").group()
'srsdf'
r"\w{2,4}?" suppresses greediness: the trailing ? makes the quantifier take as few characters as it can.
>>> re.match(r"\w{2,4}?", "12435 we").group()
'12'
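To put the greedy and non-greedy forms side by side, here is a short sketch. The first part repeats the example above; the second part applies the same trailing ? to the + quantifier, using a quoted-text sample that is an assumption made for this illustration.

```python
import re

text = "12435 we"

greedy = re.match(r"\w{2,4}", text).group()   # takes as many as allowed
lazy = re.match(r"\w{2,4}?", text).group()    # takes as few as allowed
print(greedy, lazy)

# The same idea with + : lazy matching stops at the first closing quote.
quoted = 'say "one" and "two"'
greedy_quotes = re.findall(r'".+"', quoted)
lazy_quotes = re.findall(r'".+?"', quoted)
print(greedy_quotes)
print(lazy_quotes)
```

The greedy pattern swallows everything from the first quote to the last one, while the lazy pattern returns each quoted word separately.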
r"\w?" matches 0 or 1 times; if no word character is present, it returns an empty string.
Zero occurrences still count as a match.
>>> re.match(r"\w?", "12435 we").group()
'1'
>>> re.match(r"\w?", " 12435 we").group()
''
re.findall(r"\w?", s) also reports the zero-length matches, including the one at the very end of the string, where nothing is left to match.
>>> re.findall(r"\w?", " 12 we")
['', '1', '2', '', 'w', 'e', '']
re.findall(r"\w", s) matches exactly one word character each time, so no empty strings appear.
>>> re.findall(r"\w", " 12 we")
['1', '2', 'w', 'e']
r"\w*" matches 0 or more times.
>>> re.match(r"\w*", " 12435 we").group()
''
>>> re.match(r"\w*", "12435 we").group()
'12435'
>>> re.match(r"\w*", " 12435 we")
<_sre.SRE_Match object at 0x0000000002f63ed0>
>>> re.match(r"\w*", "12435 we")
<_sre.SRE_Match object at 0x000000000306c030>
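Because ? and * accept zero occurrences, they return a match object even when no word character is present, whereas + does not. The sketch below compares the three on a string that starts with a space; the input string is an assumption chosen to trigger the zero-length case.

```python
import re

text = " 12435 we"   # note the leading space

results = {}
for pattern in (r"\w?", r"\w*", r"\w+"):
    hit = re.match(pattern, text)
    # ? and * accept zero characters, so they succeed with an empty string;
    # + needs at least one word character and therefore fails here.
    results[pattern] = hit.group() if hit else None
    print(pattern, repr(results[pattern]))
```

In practice this means that a successful match with ? or * says little by itself; checking whether group() is non-empty is one way to tell whether anything was actually consumed.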
r"a.b" matches any single character except a newline between a and b.
To match a literal "a.b", escape the dot: r"a\.b".
>>> re.match(r"a.b", "axb")
<_sre.SRE_Match object at 0x0000000001dcc510>
>>> re.match(r"a.b", "a\nb")
>>> re.match(r"a\.b", "a.b")
<_sre.SRE_Match object at 0x00000000022be4a8>
>>> re.match(r"a\.b", "axb")
>>> re.match(r".", "axb").group()
'a'
>>> re.match(r"a.b", "axb").group()
'axb'
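The following sketch tabulates the two patterns against three candidate strings, and also shows re.escape, a standard-library helper that produces the escaped form of a plain string. The candidate list is made up for this example.

```python
import re

candidates = ["a.b", "axb", "a\nb"]

table = {}
for text in candidates:
    any_char = bool(re.match(r"a.b", text))     # . = any character but newline
    literal = bool(re.match(r"a\.b", text))     # \. = a real dot only
    table[text] = (any_char, literal)
    print(repr(text), any_char, literal)

# re.escape builds the escaped pattern from a plain string.
escaped = re.escape("a.b")
print(escaped)
```

re.escape is convenient when the text to search for comes from user input and may contain characters that have a special meaning in a pattern.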
Exercise: match an IP address. The rule is that it has 3 dots and 4 segments, each a number from 0 to 255.
>>> s = "I find a ip:1.2.22.123! yes"
>>> re.search(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", s).group()
'1.2.22.123'
re.search(r"(\d{1,3}\.){3}\d{1,3}", s) does the same with a group: the parenthesized "digits plus dot" part is repeated three times.
>>> re.search(r"(\d{1,3}\.){3}\d{1,3}", s).group()
'1.2.22.123'
Note that \d{1,3} only checks the number of digits, not the 0 to 255 range.
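The patterns in the exercise accept any one to three digits per segment, so a value such as 999 would pass. One possible way to enforce the 0 to 255 rule is to capture the four segments and check them in Python, as sketched below. The names `IP_PATTERN` and `find_ip` are hypothetical, and this is a sketch rather than a complete validator: for instance, it does not guard against extra digits directly adjoining the match.

```python
import re

# Four groups of 1-3 digits separated by literal dots; each group is captured.
IP_PATTERN = re.compile(r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})")

def find_ip(text):
    """Return the first dotted quad in text whose parts are all 0-255, else None."""
    for hit in IP_PATTERN.finditer(text):
        # \d{1,3} also accepts values such as 999, so check the range in Python.
        if all(0 <= int(part) <= 255 for part in hit.groups()):
            return hit.group()
    return None

print(find_ip("I find a ip:1.2.22.123! yes"))
print(find_ip("bad one: 999.1.1.1"))
```

Doing the range check in Python keeps the pattern short; expressing 0 to 255 purely as a regular expression is possible but considerably harder to read.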
Formats for matching multiple characters
Character   Function
*           Matches the previous character 0 or more times (possibly not at all)
+           Matches the previous character 1 or more times (at least once)
?           Matches the previous character 0 or 1 times (either once or not at all)
{m}         Matches the previous character exactly m times
{m,}        Matches the previous character at least m times
{m,n}       Matches the previous character from m to n times
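As a quick check of the table above, the sketch below applies each quantifier to the letter a and matches it against the same four-character string. The sample string and the pattern list are assumptions made for the example.

```python
import re

text = "aaaa"
patterns = [r"a*", r"a+", r"a?", r"a{2}", r"a{2,}", r"a{1,3}"]

taken = {}
for pattern in patterns:
    # re.match anchors at the start, so group() shows how much each quantifier takes.
    taken[pattern] = re.match(pattern, text).group()
    print(pattern.ljust(8), repr(taken[pattern]))
```

All of these quantifiers are greedy, which is why the open-ended forms consume the whole string.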
Character   Function
.           Matches any single character (except \n)
[]          Matches any one of the characters listed inside the brackets
\d          Matches a digit, i.e. 0-9
\D          Matches a non-digit
\s          Matches whitespace, e.g. space or tab
\S          Matches non-whitespace
\w          Matches a word character, i.e. a-z, A-Z, 0-9, _
\W          Matches a non-word character