Official reference: Regular expression operations
re: regular expression, abbreviated: regex
- Regular expression rules: Deerchao's tutorial, version v2.3.5 (2017-6-12); http://deerchao.net/tutorials/regex/regex.htm
-------------------------------------------------------------------------------------
- The function of regular expressions: the primary function of a regular expression is to find what you are looking for in a string by describing it with a specific pattern.
-------------------------------------------------------------------------------------
re
Common functions:
re.compile(pattern, flags=0)
Compile a regular expression pattern into a regular expression object, which can be used for matching using its match(), search() and other methods, described below.
prog = re.compile(pattern)
result = prog.match(string)
is equivalent to
result = re.match(pattern, string)
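The equivalence above can be checked with a small sketch. The pattern and the input string here are hypothetical values chosen only for illustration:

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
pattern = r"\d+"
string = "42 apples"

# Compile once, then reuse the resulting pattern object.
prog = re.compile(pattern)
compiled_result = prog.match(string)

# Module-level call: the re module compiles the pattern internally.
direct_result = re.match(pattern, string)

print(compiled_result.group(0))  # 42
print(direct_result.group(0))    # 42
```

Compiling explicitly is mainly worthwhile when the same pattern is used many times, for example inside a loop.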
-------------------------------------------------------------------------------------
re.search(pattern, string, flags=0)
Finds the first occurrence of the pattern in string.
Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding match object. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.
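The difference between "no match" and "a zero-length match" is easy to miss, so here is a minimal sketch with made-up input strings:

```python
import re

text = "abc"

# No position in the string contains a digit, so search() returns None.
no_match = re.search(r"\d+", text)

# \d* may match zero digits, so search() succeeds at position 0
# with an empty (zero-length) match instead of returning None.
empty_match = re.search(r"\d*", text)

# search() reports the first location where the pattern matches.
first = re.search(r"b", "abcabc")

print(no_match)              # None
print(empty_match.span())    # (0, 0)
print(first.start())         # 1
```

Because a zero-length match is still a match object, test the result with `is None` when you need to tell the two cases apart.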
-------------------------------------------------------------------------------------
re.match(pattern, string, flags=0)
Matches the pattern only at the beginning of the string, unlike search(), which matches anywhere.
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object. Return None if the string does not match the pattern; note that this is different from a zero-length match.
Note that even in MULTILINE mode, re.match() will only match at the beginning of the string and not at the beginning of each line.
If you want to locate a match anywhere in string, use search() instead (see also search() vs. match()).
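A short sketch of the MULTILINE behaviour described above, using a made-up two-line string:

```python
import re

# Hypothetical two-line input, chosen only for illustration.
text = "first line\nsecond line"

# match() only tries the very beginning of the string, even with MULTILINE.
at_start = re.match(r"first", text, flags=re.MULTILINE)
second_with_match = re.match(r"second", text, flags=re.MULTILINE)

# search() with ^ in MULTILINE mode can match at the start of any line.
second_with_search = re.search(r"^second", text, flags=re.MULTILINE)

print(at_start.group(0))           # first
print(second_with_match)           # None
print(second_with_search.span())   # (11, 17)
```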
-------------------------------------------------------------------------------------
re.split(pattern, string, maxsplit=0, flags=0)
Splits string at each occurrence of pattern. If the pattern contains capturing parentheses, the text matched by the groups is also returned in the result list.
In the examples below, '\W+' matches one or more characters that are not letters, digits, or underscores, so each comma together with the space after it acts as a separator. In the second example the pattern is wrapped in parentheses, so the matched commas and spaces are returned as well. In the third example maxsplit is 1, so at most one split is made and the rest of the string is returned as the last element.
>>> re.split(r'\W+', 'Words, words, words.')
['Words', 'words', 'words', '']
>>> re.split(r'(\W+)', 'Words, words, words.')
['Words', ', ', 'words', ', ', 'words', '.', '']
>>> re.split(r'\W+', 'Words, words, words.', maxsplit=1)
['Words', 'words, words.']
>>> re.split('[a-f]+', '0a3B9', flags=re.IGNORECASE)
['0', '3', '9']
If there are capturing groups in the separator and it matches at the start of the string, the result will start with an empty string. The same holds for the end of the string:
>>> re.split(r'(\W+)', '...words, words...')
['', '...', 'words', ', ', 'words', '...', '']
-------------------------------------------------------------------------------------
re.sub(pattern, repl, string, count=0, flags=0)
Replaces the non-overlapping matches of pattern in string with repl:
Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl. If the pattern isn't found, string is returned unchanged. repl can be a string or a function; if it is a string, any backslash escapes in it are processed. That is, \n is converted to a single newline character, \r is converted to a carriage return, and so forth. Unknown escapes such as \& are left alone. Backreferences, such as \6, are replaced with the substring matched by group 6 in the pattern. For example:
>>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
...        r'static PyObject*\npy_\1(void)\n{',
...        'def myfunc():')
'static PyObject*\npy_myfunc(void)\n{'
Here the whole of def myfunc(): is matched, and because ([a-zA-Z_][a-zA-Z_0-9]*) is in parentheses, the matched name myfunc becomes group 1. Since the entire string is matched, all of it is replaced by repl, with \1 in repl replaced by the contents of group 1.
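The same backreference mechanism can be seen in a smaller sketch. The inputs are made up for illustration; only documented re.sub() behaviour is used:

```python
import re

# \1 and \2 in repl refer to the first and second capturing groups.
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Doe, John")

# If the pattern is not found, the string is returned unchanged.
unchanged = re.sub(r"\d+", "#", "no digits here")

# count limits how many of the leftmost occurrences are replaced.
limited = re.sub(r"a", "A", "banana", count=2)

print(swapped)    # John Doe
print(unchanged)  # no digits here
print(limited)    # bAnAna
```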
When repl is a function:
If repl is a function, it's called for every non-overlapping occurrence of pattern. The function takes a single Match object argument, and returns the replacement string. For example:
>>> def dashrepl(matchobj):
...     if matchobj.group(0) == '-': return ' '
...     else: return '-'
>>> re.sub('-{1,2}', dashrepl, 'pro----gram-files')
'pro--gram files'
>>> re.sub(r'\sAND\s', ' & ', 'Baked Beans And Spam', flags=re.IGNORECASE)
'Baked Beans & Spam'
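As a further sketch of a function used as repl, the hypothetical helper double_number below computes each replacement from the matched text. The helper name and the input string are made up for illustration:

```python
import re

def double_number(matchobj):
    """Hypothetical helper: return the matched integer doubled, as a string."""
    return str(int(matchobj.group(0)) * 2)

# The function is called once per non-overlapping match of \d+.
doubled = re.sub(r"\d+", double_number, "3 apples and 10 pears")

print(doubled)  # 6 apples and 20 pears
```

A function is the natural choice when the replacement depends on the matched text in a way a plain replacement string cannot express.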
Regular Expressions re-pack 2018-10-02