* Regular expression: a way to quickly find a desired substring in a large body of text. Function + expression = quickly find the substring.
In a regular expression, a character given literally matches exactly that character. \d matches a digit and \w matches a letter, digit, or underscore, so:
'00\d' can match '007', but cannot match '00A';
'\d\d\d' can match '010';
'\w\w\d' can match 'py3';
. matches any character except a line break, so 'py.' can match 'pyc', 'pyo', 'py!', and so on.
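The rules above can be checked with a short sketch using Python's standard re module. The helper name matches is hypothetical and only there to keep the example compact; it uses re.fullmatch so that the whole string has to match the pattern.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches(r'00\d', '007'))    # True: \d matches the digit 7
print(matches(r'00\d', '00A'))    # False: A is not a digit
print(matches(r'\d\d\d', '010'))  # True: three digits
print(matches(r'\w\w\d', 'py3'))  # True: two word characters, then a digit
print(matches(r'py.', 'py!'))     # True: . matches any single character
print(matches(r'py.', 'py\n'))    # False: by default . does not match a line break
```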
To match a variable number of characters, a regular expression uses * for any number of characters (including 0), + for at least one character, ? for 0 or 1 characters, {n} for exactly n characters, and {n,m} for n to m characters.
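A minimal sketch of these quantifiers follows. The patterns and test strings are made up for illustration, and re.fullmatch is used so that each result reflects the whole string.

```python
import re

# (pattern, text) pairs chosen only to illustrate each quantifier
cases = [
    (r'ab*c', 'ac'),        # * : zero or more b, so this matches
    (r'ab+c', 'ac'),        # + : needs at least one b, so no match
    (r'ab?c', 'abc'),       # ? : zero or one b, so this matches
    (r'ab?c', 'abbc'),      # two b's are too many for ?
    (r'\d{3}', '010'),      # {n} : exactly 3 digits
    (r'\d{3,8}', '12345'),  # {n,m} : between 3 and 8 digits
    (r'\d{3,8}', '12'),     # only 2 digits, so no match
]

results = [re.fullmatch(p, s) is not None for p, s in cases]
for (p, s), ok in zip(cases, results):
    print(p, repr(s), ok)
```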
Greedy match: quantifiers in a regular expression are greedy by default, that is, they match as many characters as possible.
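The effect of greedy matching is easiest to see with two adjacent groups. In the sketch below the input '102300' is an arbitrary example. The second pattern adds ? after the quantifier, which is the standard way in Python's re module to make it non-greedy, and is shown only for contrast.

```python
import re

# Greedy: \d+ takes every digit, leaving nothing for 0*
greedy = re.match(r'^(\d+)(0*)$', '102300').groups()

# Non-greedy: \d+? takes as few digits as possible, so 0* gets the trailing zeros
lazy = re.match(r'^(\d+?)(0*)$', '102300').groups()

print(greedy)  # ('102300', '')
print(lazy)    # ('1023', '00')
```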
1. re.match() function: prototype ==> match(regular expression, string to match, flags=0)
Function: tries to match the pattern at the starting position of the string, and returns None if it does not match there.
2. re.search() function: prototype ==> search(regular expression, string to match, flags=0)
Function: scans the entire string and returns the first successful match, or None if there is none.
3. re.findall() function: prototype ==> findall(regular expression, string to match, flags=0)
Function: scans the entire string and returns a list of all non-overlapping matches.
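The difference between the three functions shows up when the pattern occurs in the middle of the string. The sample text below is invented for illustration.

```python
import re

text = 'tel 010-12345 or 027-6789'
pattern = r'\d{3}-\d{3,8}'

at_start = re.match(pattern, text)     # None: the text does not start with a number
first = re.search(pattern, text)       # match object for the first occurrence
all_found = re.findall(pattern, text)  # list of every occurrence

print(at_start)                     # None
print(first.group(), first.span())  # 010-12345 (4, 13)
print(all_found)                    # ['010-12345', '027-6789']
```

Because match and search return None on failure, check the result before calling group() on it.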
flags: flag bits that control how the regular expression is matched. The following 6 values are available:
re.I: ignore case
re.L: make matching locale-dependent (rarely used)
re.M: multiline matching, affects ^ and $
re.S: make . match every character, including line breaks
re.U: interpret characters according to the Unicode character set, affects \w \W \b \B
re.X: allow a more flexible (verbose) pattern layout, with whitespace and comments (also not commonly used)
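A rough sketch of the more commonly used flags is given below. The sample strings are invented, and the verbose pattern reuses the telephone-number form that appears elsewhere in these notes.

```python
import re

# re.I: case-insensitive matching
ignore_case = re.match(r'python', 'PYTHON', re.I)

# re.M: ^ and $ also match at the start and end of each line
lines = 'first\nsecond'
without_m = re.findall(r'^\w+$', lines)
with_m = re.findall(r'^\w+$', lines, re.M)

# re.S: . also matches a line break
without_s = re.match(r'a.b', 'a\nb')
with_s = re.match(r'a.b', 'a\nb', re.S)

# re.X: whitespace and comments inside the pattern are ignored
verbose = re.compile(r'''
    (\d{3})    # area code
    -          # separator
    (\d{3,8})  # local number
''', re.X)

print(ignore_case is not None)              # True
print(without_m, with_m)                    # [] ['first', 'second']
print(without_s is None, with_s is not None)  # True True
print(verbose.match('010-12345').groups())  # ('010', '12345')
```

Several flags can be combined with |, for example re.I | re.M.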
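^: anchors the match at the start of the string; ^a checks whether the string starts with a, and if not the result is None
$: anchors the match at the end of the string; a$ checks whether the string ends with a, and if not the result is None
[34578]: matches any one of the characters 3, 4, 5, 7, 8.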
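The anchors and the character class can be tried out as follows. The strings are arbitrary examples, and re.search is used so that only the anchors decide where a match may occur.

```python
import re

starts = re.search(r'^a', 'abc')     # match: the string starts with a
not_start = re.search(r'^a', 'bca')  # None: a is not at the start
ends = re.search(r'a$', 'bca')       # match: the string ends with a
not_end = re.search(r'a$', 'abc')    # None: a is not at the end

# [34578] matches one character that is 3, 4, 5, 7 or 8
digits = re.findall(r'[34578]', '13456789')

print(starts is not None, not_start is None)  # True True
print(ends is not None, not_end is None)      # True True
print(digits)                                 # ['3', '4', '5', '7', '8']
```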
==> Grouping: besides simply testing whether a string matches, regular expressions can also extract substrings. Each group to be extracted is wrapped in (). For example:
>>> import re
>>> m = re.match(r'^(\d{3})-(\d{3,8})$', '010-12345')
>>> m
<re.Match object; span=(0, 9), match='010-12345'>
>>> m.group(0)
'010-12345'
>>> m.group(1)
'010'
>>> m.group(2)
'12345'
>>> m.groups()
('010', '12345')
==> compile: if a regular expression will be reused thousands of times, we can precompile it for efficiency; later uses then skip the compile step and match directly.
>>> re_telephone = re.compile(r'^(\d{3})-(\d{3,8})$')
>>> re_telephone.match('010-12345').group(1)
'010'
>>> re_telephone.match('027-12345').group(1)
'027'