A regular expression is a concise way to describe a set of strings. This article walks through the details of regular expressions in Python; we hope it is helpful.
Operator | Description | Example
. | Any single character (except a newline by default) |
[ ] | Character set: the range of values a single character may take | [abc] matches a, b, or c; [a-z] matches a single character from a to z
[^ ] | Negated character set: the values a single character may not take | [^abc] matches a single character that is not a, b, or c
* | Zero or more repetitions of the previous character | abc* matches ab, abc, abcc, abccc...
+ | One or more repetitions of the previous character | abc+ matches abc, abcc, abccc...
? | Zero or one repetition of the previous character | abc? matches ab, abc
| | Alternation: either the left or the right expression | abc|def matches abc or def
{m} | Exactly m repetitions of the previous character | ab{2}c matches abbc
{m,n} | From m to n repetitions of the previous character (n included) | ab{1,2}c matches abc, abbc
^ | Matches the start of the string | ^abc matches abc at the beginning of a string
$ | Matches the end of the string | abc$ matches abc at the end of a string
( ) | Grouping; the | operator inside a group separates alternatives | (abc|def) matches abc or def
\d | Digit; equivalent to [0-9] for ASCII text |
\w | Word character; equivalent to [A-Za-z0-9_] for ASCII text |
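To make the table concrete, here is a minimal sketch that checks several of the operators with `re.fullmatch`. The sample strings and the helper name `check` are illustrative assumptions, not part of the original article.

```python
import re

# Each entry pairs a pattern from the table with strings it should
# and should not match in full. The sample strings are illustrative.
cases = [
    (r"abc*", ["ab", "abc", "abccc"], ["a", "abd"]),
    (r"abc+", ["abc", "abcc"], ["ab"]),
    (r"abc?", ["ab", "abc"], ["abcc"]),
    (r"abc|def", ["abc", "def"], ["abd"]),
    (r"ab{2}c", ["abbc"], ["abc", "abbbc"]),
    (r"ab{1,2}c", ["abc", "abbc"], ["ac", "abbbc"]),
    (r"[^abc]", ["d", "1"], ["a", "b"]),
]


def check(pattern, should_match, should_fail):
    """Return True when the pattern behaves as the table describes."""
    ok = all(re.fullmatch(pattern, s) for s in should_match)
    bad = any(re.fullmatch(pattern, s) for s in should_fail)
    return ok and not bad


results = {pattern: check(pattern, yes, no) for pattern, yes, no in cases}
for pattern, passed in results.items():
    print(pattern, "OK" if passed else "MISMATCH")
```

`re.fullmatch` is used here so that each sample string has to match the pattern from its first to its last character.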
If you are familiar with the operators above, the following examples should not be difficult.
1. Digits only: ^[0-9]*$
2. Exactly n digits: ^\d{n}$
3. At least n digits: ^\d{n,}$
4. Between m and n digits: ^\d{m,n}$
5. Zero, or a number that does not start with 0: ^(0|[1-9][0-9]*)$
6. A positive real number with an optional two-digit decimal part: ^[0-9]+(\.[0-9]{2})?$
7. A positive real number with an optional decimal part of 1 to 3 digits: ^[0-9]+(\.[0-9]{1,3})?$
8. A non-zero positive integer: ^\+?[1-9][0-9]*$
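The patterns in this list can be tried out in Python with a short sketch like the one below. The helper names (`exact_digits`, `digit_range`, `is_valid`) and the constant names are hypothetical, and the dot and plus sign are written escaped (`\.` and `\+`) so that they are treated as literal characters.

```python
import re


def exact_digits(n):
    """Pattern 2: exactly n digits."""
    return re.compile(r"^\d{%d}$" % n)


def digit_range(m, n):
    """Pattern 4: between m and n digits, inclusive."""
    return re.compile(r"^\d{%d,%d}$" % (m, n))


# Patterns 5, 6 and 8, with the dot and the plus sign escaped.
NO_LEADING_ZERO = re.compile(r"^(0|[1-9][0-9]*)$")
TWO_DECIMALS = re.compile(r"^[0-9]+(\.[0-9]{2})?$")
POSITIVE_INT = re.compile(r"^\+?[1-9][0-9]*$")


def is_valid(pattern, text):
    """Return True if the text satisfies the compiled pattern."""
    return pattern.match(text) is not None


print(is_valid(exact_digits(6), "100081"))   # True
print(is_valid(digit_range(2, 4), "12345"))  # False
print(is_valid(NO_LEADING_ZERO, "007"))      # False
print(is_valid(TWO_DECIMALS, "3.14"))        # True
print(is_valid(TWO_DECIMALS, "3x14"))        # False
print(is_valid(POSITIVE_INT, "+42"))         # True
```

One caveat: in Python, `$` also matches just before a trailing newline, so `re.fullmatch` is the stricter choice when input must not end in "\n".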
Python 3 regular expressions
Function | Description
re.match() | Matches the pattern at the start of the string; if there is no match at the start position, match() returns None.
re.search() | Scans the string and returns the first successful match.
re.sub() | Replaces every substring that matches the regular expression and returns the resulting string.
re.findall() | Searches the string and returns all matching substrings as a list.
re.split() | Splits the string at each regular expression match and returns a list.
re.finditer() | Searches the string and returns an iterator over the matches; each element is a match object.
>>> import re
>>> match = re.findall(r'[1-9]\d{5}', '100081BIT BIT10008676')
>>> print(match)
['100081', '100086']
>>> match = re.split(r'[1-9]\d{5}', '100081BIT BIT10008676')
>>> match
['', 'BIT BIT', '76']
>>> match = re.split(r'[1-9]\d{5}', '100081BIT BIT10008676', maxsplit=1)
>>> match
['', 'BIT BIT10008676']
>>> for m in re.finditer(r'[1-9]\d{5}', '100081BIT BIT10008676'):
...     if m:
...         print(m.group(0))
...
100081
100086
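The session above exercises findall, split, and finditer. As a complement, here is a minimal sketch of re.sub and re.search on the same input; the replacement text `<zip>` and the variable names are illustrative assumptions.

```python
import re

text = "100081BIT BIT10008676"
postcode = r"[1-9]\d{5}"

# re.sub replaces every non-overlapping match and returns a new string.
masked = re.sub(postcode, "<zip>", text)
print(masked)       # <zip>BIT BIT<zip>76

# The count argument limits how many replacements are made.
first_only = re.sub(postcode, "<zip>", text, count=1)
print(first_only)   # <zip>BIT BIT10008676

# re.search returns a match object for the first match, or None.
m = re.search(postcode, text)
if m:
    print(m.group(0), m.span())   # 100081 (0, 6)
```

The original string is left unchanged, because Python strings are immutable and re.sub always returns a new one.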
The difference between re.match and re.search
re.match matches only at the beginning of the string: if the start of the string does not conform to the regular expression, the match fails and the function returns None. re.search scans the whole string until it finds a match.
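The difference can be seen in a small sketch like this one, where the sample string "BIT100081" is an assumption chosen so that the digits do not appear at the start.

```python
import re

text = "BIT100081"
pattern = r"[1-9]\d{5}"

# The string starts with "BIT", so matching at position 0 fails.
at_start = re.match(pattern, text)
# re.search keeps scanning and finds the digits further along.
anywhere = re.search(pattern, text)

print(at_start)            # None
print(anywhere.group(0))   # 100081
print(anywhere.start())    # 3

# re.match succeeds once the digits sit at the start of the string.
moved = re.match(pattern, "100081BIT")
print(moved.group(0))      # 100081
```

Because both functions return None on failure, it is safest to test the result before calling group() on it.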