No1:
In a regular expression, a literal character matches exactly that character. \d matches a digit and \w matches a letter, digit, or underscore, so:
'00\d' matches '007', but not '00A';
'\d\d\d' matches '010';
'\w\w\d' matches 'py3';
. matches any single character, so:
'py.' matches 'pyc', 'pyo', 'py!', and so on.
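The patterns above can be tried out with Python's re module. This is a minimal sketch: the helper name matches is made up for illustration, and it uses re.fullmatch so that the whole string has to match the pattern.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches(r'00\d', '007'))    # True
print(matches(r'00\d', '00A'))    # False: 'A' is not a digit
print(matches(r'\d\d\d', '010'))  # True
print(matches(r'\w\w\d', 'py3'))  # True
print(matches(r'py.', 'py!'))     # True: . accepts any single character
```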
No2:
To match a variable number of characters in a regular expression, use * for any number of characters (including 0), + for at least one character, ? for 0 or 1 character, {n} for exactly n characters, and {n,m} for n to m characters.
Take a look at a more complex example: \d{3}\s+\d{3,8} .
Let's read from left to right:
\d{3} matches exactly 3 digits, for example '010';
\s matches a whitespace character (a space, a tab, and so on), so \s+ means at least one whitespace character, such as ' ', '  ', etc.;
\d{3,8} matches 3 to 8 digits, for example '1234567'.
Together, the above regular expression matches a telephone number with an area code, where the two parts are separated by any amount of whitespace.
What if you want to match a number like '010-12345'? Because '-' is a special character inside a [] range, it is escaped with '\' here (outside [] the escape is harmless but not required), so the pattern becomes \d{3}\-\d{3,8} .
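The two telephone patterns can be compared side by side. The following is only a sketch; the names SPACED, DASHED, and whole_match are invented for this example, and the sample numbers are arbitrary.

```python
import re

SPACED = r'\d{3}\s+\d{3,8}'   # 3 digits, whitespace, 3 to 8 digits
DASHED = r'\d{3}\-\d{3,8}'    # 3 digits, a literal hyphen, 3 to 8 digits

def whole_match(pattern, text):
    """Hypothetical helper: True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

print(whole_match(SPACED, '010 1234567'))  # True
print(whole_match(SPACED, '010\t 12345'))  # True: \s+ accepts a tab plus a space
print(whole_match(SPACED, '010-12345'))    # False: a hyphen is not whitespace
print(whole_match(DASHED, '010-12345'))    # True
print(whole_match(DASHED, '010-12'))       # False: fewer than 3 digits after the hyphen
```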
No3:
To make a more precise match, you can use [] to specify a range, such as:
[0-9a-zA-Z\_] matches a digit, letter, or underscore;
[0-9a-zA-Z\_]+ matches a string of at least one digit, letter, or underscore, for example 'a100', '0_Z', 'Py3000', and so on;
[a-zA-Z\_][0-9a-zA-Z\_]* matches a string that starts with a letter or underscore, followed by any number of digits, letters, or underscores, which is a valid Python variable name;
[a-zA-Z\_][0-9a-zA-Z\_]{0,19} limits the length of the variable name more precisely to 1-20 characters (1 character at the front plus at most 19 characters after it). Note that there must be no space inside {0,19}.
A|B matches A or B, so (P|p)ython matches 'Python' or 'python'.
^ represents the beginning of a line; ^\d means the line must start with a digit.
$ represents the end of a line; \d$ means the line must end with a digit.
You may have noticed that py also matches 'python', but with ^py$ it becomes a whole-line match that only matches 'py'.
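A short sketch of ranges, alternation, and anchors in Python. The names IDENT and is_identifier are made up for illustration, and the pattern only covers ASCII names, so it is narrower than what Python itself accepts as an identifier.

```python
import re

# ASCII-only variable-name pattern from the text, limited to 1-20 characters.
IDENT = re.compile(r'[a-zA-Z_][0-9a-zA-Z_]{0,19}')

def is_identifier(name):
    """Hypothetical helper: True if name fits the 1-20 character pattern."""
    return IDENT.fullmatch(name) is not None

print(is_identifier('Py3000'))   # True
print(is_identifier('_tmp'))     # True
print(is_identifier('3py'))      # False: starts with a digit
print(is_identifier('a' * 21))   # False: 21 characters is too long

# Alternation: either capitalisation is accepted.
print(re.match(r'(P|p)ython', 'Python') is not None)  # True
print(re.match(r'(P|p)ython', 'python') is not None)  # True

# Anchors: without them 'py' matches the start of 'python'.
print(re.match(r'py', 'python') is not None)    # True
print(re.match(r'^py$', 'python') is not None)  # False
print(re.match(r'^py$', 'py') is not None)      # True
```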
No4:
>>> import re
>>> re.match(r'^\d{3}\-\d{3,8}$', '010-12345')
<_sre.SRE_Match object; span=(0, 9), match='010-12345'>
>>> re.match(r'^\d{3}\-\d{3,8}$', '010 12345')
>>>
The match() method determines whether the match succeeds: it returns a Match object if it does, and None otherwise.
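Because None is falsy and a Match object is truthy, the result of match() is usually tested directly in an if statement. A minimal sketch, with check_phone as a hypothetical helper name:

```python
import re

def check_phone(text):
    """Hypothetical helper: report whether text looks like '010-12345'."""
    if re.match(r'^\d{3}\-\d{3,8}$', text):
        return 'ok'
    return 'failed'

print(check_phone('010-12345'))  # ok
print(check_phone('010 12345'))  # failed
```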
No5:
Splitting a string
>>> 'a b   c'.split(' ')
['a', 'b', '', '', 'c']
>>> re.split(r'\s+', 'a b   c')
['a', 'b', 'c']
>>> re.split(r'[\s\,]+', 'a,b, c  d')
['a', 'b', 'c', 'd']
>>> re.split(r'[\s\,\;]+', 'a,b;; c  d')
['a', 'b', 'c', 'd']
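The same comparison as a small script. The function name split_fields is an assumption made for this sketch; inside [] the comma and semicolon do not need a backslash, so the pattern is written without one.

```python
import re

def split_fields(text):
    """Hypothetical helper: split on runs of whitespace, commas, or semicolons."""
    return re.split(r'[\s,;]+', text)

# A plain str.split keeps an empty string for every extra space.
print('a b   c'.split(' '))        # ['a', 'b', '', '', 'c']

# The regular expression treats a whole run of separators as one.
print(split_fields('a b   c'))     # ['a', 'b', 'c']
print(split_fields('a,b;; c  d'))  # ['a', 'b', 'c', 'd']
```

If the input starts or ends with a separator, re.split still returns an empty string at that end, so user input may need a strip() first.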
No6:
Groups
Identifying a valid time:
>>> t = '19:05:30'
>>> m = re.match(r'^(0[0-9]|1[0-9]|2[0-3]|[0-9])\:(0[0-9]|1[0-9]|2[0-9]|3[0-9]|4[0-9]|5[0-9]|[0-9])\:(0[0-9]|1[0-9]|2[0-9]|3[0-9]|4[0-9]|5[0-9]|[0-9])$', t)
>>> m.groups()
('19', '05', '30')
For invalid dates such as '2-30' or '4-31', a regular expression either cannot recognize them or becomes very hard to write, so the check has to be completed in program code.
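One way to combine the two is to let a regular expression pull out the numbers with groups and let datetime.date decide whether the day exists. This is only a sketch: the names DATE and is_valid_date and the default year 2023 (a non-leap year) are assumptions made for illustration.

```python
import re
from datetime import date

# Month and day as 1 or 2 digits each, separated by a hyphen.
DATE = re.compile(r'^(\d{1,2})-(\d{1,2})$')

def is_valid_date(text, year=2023):
    """Hypothetical helper: True if text is a real month-day in the given year."""
    m = DATE.match(text)
    if m is None:
        return False
    month, day = int(m.group(1)), int(m.group(2))
    try:
        date(year, month, day)   # raises ValueError for impossible dates
    except ValueError:
        return False
    return True

print(is_valid_date('2-28'))  # True
print(is_valid_date('2-30'))  # False: February never has 30 days
print(is_valid_date('4-31'))  # False: April has 30 days
```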
No7:
Greedy match
Non-greedy match
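A short sketch of the difference: by default a quantifier is greedy and takes as many characters as it can, while adding ? after it makes it non-greedy so it takes as few as possible. The sample string '102300' is an arbitrary choice for illustration.

```python
import re

# Greedy: \d+ swallows every digit, leaving nothing for 0* to capture.
greedy = re.match(r'^(\d+)(0*)$', '102300').groups()
print(greedy)      # ('102300', '')

# Non-greedy: \d+? stops as early as it can, so 0* captures the trailing zeros.
non_greedy = re.match(r'^(\d+?)(0*)$', '102300').groups()
print(non_greedy)  # ('1023', '00')
```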
No8:
Precompiled Regular Expressions
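When the same pattern is used many times, it can be compiled once with re.compile and the resulting object reused, so the pattern string no longer has to be passed on every call. A minimal sketch; the name PHONE and the sample numbers are assumptions for illustration.

```python
import re

# Compile once; the pattern object offers match(), fullmatch(), split(), etc.
PHONE = re.compile(r'^(\d{3})-(\d{3,8})$')

print(PHONE.match('010-12345').groups())  # ('010', '12345')
print(PHONE.match('010-8086').groups())   # ('010', '8086')
print(PHONE.match('010 12345'))           # None
```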
"Python" Regular expression