Regular expression patterns:
Except for the metacharacters (+ ? . * ^ $ ( ) [ ] { } | \), every character matches itself. To match a metacharacter literally, escape it with a preceding backslash.
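As a small illustration of escaping, the sketch below compares an unescaped "." with an escaped one and uses the standard-library helper `re.escape()` to escape a whole literal string. The sample strings are made up for the example.

```python
import re

text = "price: $5.00 (approx.)"

# An unescaped "." matches any character, so this pattern also matches "5x00".
loose = re.compile(r"5.00")
# A preceding backslash makes the "." literal.
strict = re.compile(r"5\.00")

loose_hits = [bool(loose.search(s)) for s in ("5.00", "5x00")]
strict_hits = [bool(strict.search(s)) for s in ("5.00", "5x00")]

# re.escape() backslash-escapes the metacharacters in a literal string.
literal = re.escape("$5.00 (approx.)")
found = re.search(literal, text) is not None

print(loose_hits, strict_hits, found)
```

Using raw strings (the `r"..."` prefix) keeps Python itself from interpreting the backslashes before the regex engine sees them.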
The following table lists the available regular expression syntax in Python:
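Pattern |
Description |
^ |
Matches the beginning of the string (and of each line with re.MULTILINE). |
$ |
Matches the end of the string (and of each line with re.MULTILINE). |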
. |
Matches any single character except a newline. With the re.DOTALL flag it matches a newline as well. |
[...] |
Matches any single character in brackets. |
[^...] |
Matches any single character not in brackets. |
re* |
Matches 0 or more occurrences of the preceding expression. |
re+ |
Matches 1 or more occurrences of the preceding expression. |
re? |
Matches 0 or 1 occurrence of the preceding expression. |
re{n} |
Matches exactly n occurrences of the preceding expression. |
re{n,} |
Matches n or more occurrences of the preceding expression. |
re{n,m} |
Matches at least n and at most m occurrences of the preceding expression. |
a|b |
Matches either a or b. |
(re) |
Groups regular expressions and remembers the matched text. |
(?imx) |
Turns on the i, m, or x option. In Python this form applies to the whole pattern and must appear at its start (enforced since Python 3.11). |
(?-imx) |
Turns off the i, m, or x option. Python accepts only the scoped form (?-imx:re). |
(?:re) |
Groups regular expressions without remembering the matched text. |
(?imx:re) |
Turns on the i, m, or x option only within the parentheses. |
(?-imx:re) |
Turns off the i, m, or x option only within the parentheses. |
(?#...) |
Comment. |
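(?=re) |
Specifies a position using a pattern (lookahead). The assertion has zero width: it consumes no characters. |
(?!re) |
Specifies a position using pattern negation (negative lookahead). The assertion has zero width. |
(?>re) |
Matches an independent pattern without backtracking (available in Python 3.11 and later). |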
\w |
Matches a word character. |
\W |
Matches a nonword character. |
\s |
Matches whitespace. In ASCII mode, equivalent to [ \t\n\r\f\v]. |
\S |
Matches nonwhitespace. |
\d |
Matches a digit. In ASCII mode, equivalent to [0-9]. |
\D |
Matches a nondigit. |
\A |
Matches the beginning of the string. |
\Z |
Matches the end of the string. In Perl and Ruby it also matches just before a trailing newline; in Python it matches only at the very end. |
\z |
Matches the end of the string. Python versions before 3.14 reject this escape; use \Z there. |
\G |
Matches the point where the last match finished. Python's re module does not support it. |
\b |
Matches a word boundary when outside brackets. Matches a backspace (0x08) when inside brackets. |
\B |
Matches a nonword boundary. |
\n, \t, etc. |
Matches newlines, carriage returns, tabs, etc. |
\1...\9 |
Matches the nth grouped subexpression. |
\10 |
Matches the nth grouped subexpression if it has already matched. Otherwise refers to the octal representation of a character code. |
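A few rows of the table are easiest to check by running them. The sketch below, using made-up sample strings, shows how "." treats newlines with and without `re.DOTALL`, and how `$` and `\Z` differ in Python when the string ends with a newline.

```python
import re

text = "one\ntwo"

# By default "." stops at a newline, so each line is matched separately.
default_dot = re.findall(r".+", text)
# With re.DOTALL, "." also matches the newline.
dotall = re.findall(r".+", text, flags=re.DOTALL)

trailing = "one\ntwo\n"
# "$" also matches just before a trailing newline.
dollar_ok = re.search(r"two$", trailing) is not None
# In Python, "\Z" matches only at the very end of the string.
z_fails = re.search(r"two\Z", trailing) is None

print(default_dot, dotall, dollar_ok, z_fails)
```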
Literal character examples:
Example |
Description |
Python |
Matches "Python". |
Character classes:
Example |
Description |
[Pp]ython |
Matches "Python" or "python" |
rub[ye] |
Matches "ruby" or "rube" |
[aeiou] |
Matches any one lowercase vowel |
[0-9] |
Matches any digit; same as [0123456789] |
[a-z] |
Matches any lowercase ASCII letter |
[A-Z] |
Matches any uppercase ASCII letter |
[a-zA-Z0-9] |
Matches any of the above |
[^aeiou] |
Matches anything other than a lowercase vowel |
[^0-9] |
Matches anything other than a digit |
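The character classes above can be tried directly with `re.findall()` and `re.fullmatch()`. This is a minimal sketch; the input strings are invented for the example.

```python
import re

# [aeiou] matches lowercase vowels only, so the capital "E" is skipped.
vowels = re.findall(r"[aeiou]", "Regular Expression")

# [^0-9] matches every character that is not a digit.
non_digits = re.findall(r"[^0-9]", "a1b22c")

# [Pp] accepts either case for the first letter only.
names = re.findall(r"[Pp]ython", "Python and python, not PYTHON")

# Ranges can be combined inside one pair of brackets.
is_alnum = re.fullmatch(r"[a-zA-Z0-9]+", "abc123XYZ") is not None

print(vowels, non_digits, names, is_alnum)
```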
Special character classes:
Example |
Description |
. |
Matches any character except a newline |
\d |
Matches a digit: [0-9] |
\D |
Matches a nondigit: [^0-9] |
\s |
Matches a whitespace character: [ \t\r\n\f\v] |
\S |
Matches nonwhitespace: [^ \t\r\n\f\v] |
\w |
Matches a single word character: [A-Za-z0-9_] |
\W |
Matches a nonword character: [^A-Za-z0-9_] |
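The shorthand classes can be checked the same way. The sketch below uses an invented sample string. Note that for `str` patterns in Python 3, `\d`, `\w` and `\s` also match non-ASCII Unicode digits, letters and whitespace unless the `re.ASCII` flag is passed, so the bracket equivalents in the table describe the ASCII-only behaviour.

```python
import re

sample = "Room 42,\tfloor_3!"

digits = re.findall(r"\d", sample)      # single digits
words = re.findall(r"\w+", sample)      # runs of word characters
spaces = re.findall(r"\s", sample)      # whitespace characters
non_words = re.findall(r"\W", sample)   # everything that is not a word character

print(digits, words, spaces, non_words)
```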
Repetition cases:
Example |
Description |
ruby? |
Matches "rub" or "ruby": the y is optional |
ruby* |
Matches "rub" plus 0 or more ys |
ruby+ |
Matches "rub" plus 1 or more ys |
\d{3} |
Matches exactly 3 digits |
\d{3,} |
Matches 3 or more digits |
\d{3,5} |
Matches 3, 4, or 5 digits |
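To see how the quantifiers differ, the following sketch tests the same three candidate strings against `?`, `*` and `+`, then applies the counted forms to an invented digit string. `re.fullmatch()` is used so that the whole candidate has to match.

```python
import re

candidates = ("rub", "ruby", "rubyy")

optional = [bool(re.fullmatch(r"ruby?", s)) for s in candidates]  # 0 or 1 "y"
star = [bool(re.fullmatch(r"ruby*", s)) for s in candidates]      # 0 or more "y"
plus = [bool(re.fullmatch(r"ruby+", s)) for s in candidates]      # 1 or more "y"

numbers = "12 123 1234567"
# {3,5} is greedy: it takes up to 5 digits and leaves "67", which is too short.
runs = re.findall(r"\d{3,5}", numbers)
# Word boundaries restrict {3} to runs of exactly three digits.
exact = re.findall(r"\b\d{3}\b", numbers)

print(optional, star, plus, runs, exact)
```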
Non-greedy repetition:
This matches the smallest number of repetitions:
Example |
Description |
<.*> |
Greedy repetition: matches "<python>perl>" |
<.*?> |
Non-greedy: matches "<python>" in "<python>perl>" |
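The two patterns above can be compared on the same input string:

```python
import re

text = "<python>perl>"

# Greedy: ".*" grabs as much as possible, up to the last ">".
greedy = re.search(r"<.*>", text).group()
# Non-greedy: ".*?" stops at the first ">" that lets the match succeed.
lazy = re.search(r"<.*?>", text).group()

print(greedy)
print(lazy)
```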
Grouping with parentheses:
Example |
Description |
\d\d+ |
No group: + repeats \d |
(\d\d)+ |
Grouped: + repeats the \d\d pair |
([Pp]ython(, )?)+ |
Matches "Python", "Python, python, python", and so on. |
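The effect of grouping on a quantifier shows up when the digit count is odd. A minimal sketch with invented inputs:

```python
import re

# Without a group, "+" repeats only the second \d: any run of 2 or more digits.
ungrouped = re.fullmatch(r"\d\d+", "12345") is not None

# With a group, "+" repeats the whole pair, so the digit count must be even.
grouped_odd = re.fullmatch(r"(\d\d)+", "12345") is not None
grouped_even = re.fullmatch(r"(\d\d)+", "1234")

# A repeated group remembers only its last repetition.
last_pair = grouped_even.group(1)

listing = re.fullmatch(r"([Pp]ython(, )?)+", "Python, python, python")

print(ungrouped, grouped_odd, last_pair, listing is not None)
```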
Backreferences:
This matches a previously matched group again:
Example |
Description |
([Pp])ython&\1ails |
Matches python&pails or Python&Pails |
(['"])[^\1]*\1 |
Single- or double-quoted string. \1 matches whatever the 1st group matched, \2 matches whatever the 2nd group matched, and so on. |
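The sketch below exercises both backreference examples. One caveat for Python: inside a character class, `\1` is read as a character code rather than a backreference, so the quoted-string variant here uses a non-greedy `.*?` instead of `[^\1]*`. That substitution is an assumption made for this example, not part of the table above.

```python
import re

pattern = re.compile(r"([Pp])ython&\1ails")
# \1 must repeat exactly what group 1 captured, including its case.
results = [bool(pattern.fullmatch(s))
           for s in ("Python&Pails", "python&pails", "Python&pails")]

# Group 1 captures the opening quote; \1 requires the same quote to close it.
quoted = re.compile(r"""(['"])(.*?)\1""")
contents = [m.group(2) for m in quoted.finditer("""say "hi" or 'it"s'""")]

print(results, contents)
```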
Alternatives:
Example |
Description |
python|perl |
Matches "python" or "perl" |
rub(y|le) |
Matches "ruby" or "ruble" |
Python(!+|\?) |
"Python" followed by one or more ! or one ? |
Anchors:
These specify the position at which a match must occur.
Example |
Description |
^Python |
Matches "Python" at the start of a string or, with re.MULTILINE, of an internal line |
Python$ |
Matches "Python" at the end of a string or, with re.MULTILINE, of a line |
\APython |
Matches "Python" at the start of the string |
Python\Z |
Matches "Python" at the end of the string |
\bPython\b |
Matches "Python" at a word boundary |
\brub\B |
\B is a nonword boundary: matches "rub" in "rube" and "ruby" but not on its own |
Python(?=!) |
Matches "Python" if followed by an exclamation point |
Python(?!!) |
Matches "Python" if not followed by an exclamation point |
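The anchors and lookaheads above can be checked on a short two-line string. The sample text is invented for this sketch.

```python
import re

text = "Python rocks\nPython!"

# With re.MULTILINE, "^" matches at the start of every line.
line_starts = re.findall(r"^Python", text, flags=re.MULTILINE)
# "\A" matches only at the start of the whole string, whatever the flags.
string_start = re.findall(r"\APython", text, flags=re.MULTILINE)

# "\b" requires a word boundary on both sides, so "Pythonic" is skipped.
whole_word = re.findall(r"\bPython\b", "Python Pythonic")
# "\B" requires that "rub" be followed by another word character.
inside = re.findall(r"\brub\B", "rub ruby rube")

# Lookaheads test what follows without consuming it.
followed = [m.start() for m in re.finditer(r"Python(?=!)", text)]
not_followed = [m.start() for m in re.finditer(r"Python(?!!)", text)]

print(len(line_starts), len(string_start), whole_word, inside, followed, not_followed)
```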
Special syntax with parentheses:
Example |
Description |
R(?#comment) |
Matches "R". All the rest is a comment. |
R(?i)uby |
Case-insensitive while matching "uby". Python 3.11 and later reject a global flag in mid-pattern; use the scoped form below. |
R(?i:uby) |
Case-insensitive while matching "uby"; the flag applies only inside the group |
rub(?:y|le) |
Groups only, without creating a \1 backreference |
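A short sketch of the parenthesised forms as Python handles them. The scoped flag syntax `(?i:...)` assumes Python 3.6 or later, and the comment text inside `(?#...)` is arbitrary.

```python
import re

# (?#...) is ignored by the regex engine.
commented = re.fullmatch(r"R(?#this text is ignored)uby", "Ruby") is not None

# (?i:...) makes only "uby" case-insensitive; the leading "R" stays case-sensitive.
scoped = [bool(re.fullmatch(r"R(?i:uby)", s)) for s in ("RUBY", "Ruby", "ruby")]

# A capturing group records what it matched; a non-capturing group does not.
capturing = re.fullmatch(r"rub(y|le)", "ruble").groups()
non_capturing = re.fullmatch(r"rub(?:y|le)", "ruble").groups()

print(commented, scoped, capturing, non_capturing)
```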
Python Learning Note 21 (regular expression)