I. INTRODUCTION
A regular expression (or RE) is a small, highly specialized programming language embedded in Python and made available through the re module. A regular expression pattern is compiled into a series of bytecodes, which are then executed by a matching engine written in C.
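As a minimal sketch of that workflow, the snippet below compiles a pattern with `re.compile` and then runs it against a string. The pattern `ab*` and the sample strings are illustrative assumptions, not taken from the original text.

```python
import re

# Compile the pattern once; the compiled object can be reused for many matches.
pattern = re.compile(r"ab*")

# match() only succeeds if the pattern matches at the start of the string.
hit = pattern.match("abbbc")
miss = pattern.match("cab")

print(hit.group())  # abbb
print(miss)         # None
```

Compiling up front is mainly a convenience when the same pattern is used repeatedly; module-level helpers such as `re.match` accept the pattern string directly.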
II. COMMON OPERATIONS
1. Character Matching
| Metacharacter | Meaning |
| --- | --- |
| . | Matches any single character except a newline |
| ^ | Matches the beginning of the string (and of each line in multiline mode) |
| $ | Matches the end of the string (and of each line in multiline mode) |
| * | Repeats the preceding item 0 or more times; greedy |
| + | Repeats the preceding item 1 or more times, so at least one match is required; greedy |
| ? | Matches the preceding item 0 or 1 time, making it optional |
| {} | Sets an exact number of repetitions or a range, such as {3} or {2,5} |
| [] | Defines a character set, such as [ab], which matches a or b; other metacharacters lose their special meaning inside [], except "-" (range), "^" (negation when placed first) and "\" (escape) |
| \| | Alternation (or) |
| () | Grouping; group numbers start at 1 |
| \ | Escape character |
| \d | Matches any decimal digit; equivalent to the class [0-9] |
| \D | Matches any non-digit character; equivalent to the class [^0-9] |
| \s | Matches any whitespace character; equivalent to the class [ \t\n\r\f\v] |
| \S | Matches any non-whitespace character; equivalent to the class [^ \t\n\r\f\v] |
| \w | Matches any alphanumeric character or underscore; equivalent to the class [a-zA-Z0-9_] |
| \W | Matches any character that is not alphanumeric or an underscore; equivalent to the class [^a-zA-Z0-9_] |
| \b | Matches a word boundary, that is, the position between a word character and a non-word character (or the start or end of the string) |
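The entries in the table can be tried out with a short script. This is only a sketch: every pattern and sample string below is an illustrative assumption chosen to show one metacharacter at a time, and the variable names are hypothetical.

```python
import re

# "." matches any single character except a newline.
dot = re.findall(r"a.c", "abc a-c a\nc")            # ['abc', 'a-c']

# "^" and "$" anchor the pattern to the start and end of the string.
anchored = bool(re.search(r"^py.*on$", "python"))   # True

# Repetition: "*" (0 or more), "+" (1 or more), "?" (0 or 1), "{m,n}" (range).
star = re.findall(r"ab*", "a ab abbb")              # ['a', 'ab', 'abbb']
plus = re.findall(r"ab+", "a ab abbb")              # ['ab', 'abbb']
optional = re.findall(r"colou?r", "color colour")   # ['color', 'colour']
counted = re.findall(r"\d{2,3}", "1 22 333 4444")   # ['22', '333', '444']

# "[]" defines a character set, "|" is alternation, "\" escapes a metacharacter.
charset = re.findall(r"[ab]", "cabbage")            # ['a', 'b', 'b', 'a']
either = re.findall(r"cat|dog", "cat dog bird")     # ['cat', 'dog']
escaped = re.findall(r"\.", "a.b.c")                # ['.', '.']

# "()" creates groups, numbered from 1.
m = re.match(r"(\d+)-(\d+)", "10-20")
first, second = m.group(1), m.group(2)              # '10', '20'

# Predefined classes and their negations.
digits = re.findall(r"\d", "a1 b2")                 # ['1', '2']
non_digits = re.findall(r"\D", "a1 b2")             # ['a', ' ', 'b']
spaces = re.findall(r"\s", "a b\tc")                # [' ', '\t']
words = re.findall(r"\w+", "hi, there_1!")          # ['hi', 'there_1']
non_words = re.findall(r"\W", "hi, there_1!")       # [',', ' ', '!']

# "\b" matches only at a word boundary, so "concat" and "cats" are skipped.
whole_word = re.findall(r"\bcat\b", "cat concat cats")  # ['cat']
```

Raw strings (the `r"..."` prefix) are used throughout so that backslashes reach the re module unchanged instead of being interpreted by Python's string literal rules.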
The regular expression of Python