The re module provides support for regular expressions. A regular expression is a pattern that matches a fragment of text; the simplest regular expression is an ordinary string, which matches itself.
An introduction to the metacharacters used in regular expressions:
. Wildcard; matches any single character except a newline
^ Caret; matches the start of the string
$ Matches the end of the string
* Matches the preceding expression zero or more times
+ Matches the preceding expression one or more times
? Matches the preceding expression zero or one time
{n} Matches the preceding expression exactly n times
{m,n} Matches the preceding expression between m and n times
[...] Character set; matches any one of the characters listed inside the brackets
[^...] Negated character set; matches any one character not listed inside the brackets
[x-y] Range; matches any one character from x to y
re1|re2 Alternation; matches either the expression to the left of | or the one to the right
(...) Grouping; marks a subpattern
\d and \D \d matches any digit 0-9, the same as [0-9] for ASCII text; \D is the opposite and matches any non-digit
\w and \W \w matches any letter, digit, or underscore, the same as [0-9a-zA-Z_] for ASCII text; \W is the opposite
\b and \B \b matches a word boundary; \B is the opposite
\A and \Z \A matches the start of the string; \Z matches the end of the string
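The metacharacters listed above can be tried out one at a time. The following is a minimal sketch: the sample strings and the variable names (examples, matches, grouped, inside, whole) are invented for illustration and are not part of the re module.

```python
import re

# (pattern, sample text) pairs; every sample string is made up for this sketch.
examples = [
    (r'b.t', 'bat bit b\nt'),            # . matches any character except a newline
    (r'^\w+', 'hello world'),            # ^ anchors the match at the start
    (r'\w+$', 'hello world'),            # $ anchors the match at the end
    (r'ab*', 'a ab abbb'),               # * allows zero or more b's
    (r'ab+', 'a ab abbb'),               # + requires at least one b
    (r'colou?r', 'color colour'),        # ? makes the u optional
    (r'\d{3}', '12 345 6789'),           # {n} requires exactly three digits
    (r'\d{2,3}', '1 22 333 4444'),       # {m,n} allows two or three digits
    (r'[aeiou]', 'regex'),               # character set
    (r'[^aeiou]', 'regex'),              # negated character set
    (r'[a-c]', 'abcdef'),                # range
    (r'cat|dog', 'cat dog bird'),        # alternation
    (r'\D+', 'a1b22c'),                  # runs of non-digits
    (r'\W+', 'hi, there!'),              # runs of non-word characters
    (r'\bcat\b', 'cat concat cats'),     # cat as a whole word only
]

# Collect every non-overlapping match for each pattern and print it.
matches = {pattern: re.findall(pattern, text) for pattern, text in examples}
for pattern, found in matches.items():
    print(pattern, found)

# Grouping lets a quantifier apply to a whole subpattern.
grouped = re.search(r'(ab)+', 'ababab').group(0)
# \B matches only where there is no word boundary, so the standalone cat is skipped.
inside = re.search(r'\Bcat', 'cat concat').start()
# \A and \Z pin the pattern to the very start and end of the string.
whole = re.search(r'\Aone two\Z', 'one two') is not None
```

re.findall() is used for most rows because it shows every match at once; the grouping example uses re.search() instead, since findall() returns the group contents rather than the whole match when a pattern contains a group.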
Functions of the re module:
compile(pattern[, flags]) Creates a pattern object from a string containing a regular expression
search(pattern, string[, flags]) Searches the string for the pattern
match(pattern, string[, flags]) Matches the pattern at the beginning of the string
split(pattern, string[, maxsplit=0]) Splits the string at each match of the pattern
findall(pattern, string[, flags]) Returns a list of all occurrences of the pattern in the string
sub(pattern, repl, string[, count=0]) Replaces every part of the string that matches the pattern with repl
escape(string) Escapes all special characters in the string
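Three of the entries above are easy to overlook: escape(), and the optional maxsplit and count arguments. A short sketch follows, using made-up sample strings; the variable names are illustrative only.

```python
import re

# re.escape() makes a literal string safe to use as a pattern.
literal = 'www.python.org'            # sample value chosen for this sketch
escaped = re.escape(literal)

# Unescaped, each '.' is a wildcard, so an unrelated string also matches.
loose = re.search(literal, 'wwwxpythonyorg') is not None
# Escaped, only the literal dots match.
strict = re.search(escaped, 'wwwxpythonyorg') is not None

# maxsplit limits the number of splits; the rest of the string is left intact.
parts = re.split(r',\s*', 'a, b, c, d', maxsplit=2)

# count limits the number of replacements made by sub().
replaced = re.sub(r'\d', '#', 'a1b2c3', count=2)

print(loose, strict)   # True False
print(parts)           # ['a', 'b', 'c, d']
print(replaced)        # a#b#c3
```

Passing maxsplit and count by keyword keeps the call readable and avoids confusing them with the flags argument.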
Note in particular that re.compile() converts a regular expression string into a pattern object, which makes matching more efficient. When you call search or match with a string pattern, they convert the string into a pattern object internally. If you call compile() once and reuse the resulting object, that conversion does not have to be repeated for every match, so matching is faster.
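As a rough illustration of reusing a compiled pattern, the sketch below compiles once and applies the pattern object to several strings. The sample lines and the names word, lines, and counts are assumptions made for the example. The re module also keeps a cache of recently compiled patterns, so in small scripts the speed difference may be slight; the compiled object is still convenient when one pattern is used in many places.

```python
import re

# Compile the pattern once, outside the loop.
word = re.compile(r'[a-zA-Z]+')

lines = ['alpha beta', '123', 'gamma']   # made-up input

# Pattern objects expose the same operations as methods.
counts = [len(word.findall(line)) for line in lines]

# The module-level function accepts the string pattern and gives the same result.
same = [len(re.findall(r'[a-zA-Z]+', line)) for line in lines]

print(counts)          # [2, 0, 1]
print(counts == same)  # True
```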
re.search() returns a match object if the pattern is found anywhere in the string and None otherwise, so its result can be used directly as the condition of an if statement. re.match() matches only at the beginning of the string and likewise returns a match object or None.
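The difference between the two functions, and their use as conditions, can be sketched as follows. The helper describe() and the sample strings are hypothetical, written only for this example.

```python
import re

def describe(pattern, text):
    """Report where, if anywhere, pattern occurs in text (illustrative helper)."""
    if re.match(pattern, text):       # None is falsy, a match object is truthy
        return 'matches at the start'
    if re.search(pattern, text):
        return 'found later in the string'
    return 'not found'

print(describe(r'\d+', '42 apples'))   # matches at the start
print(describe(r'\d+', 'apples 42'))   # found later in the string
print(describe(r'\d+', 'apples'))      # not found

# The match object carries the matched text and its position.
m = re.search(r'\d+', 'apples 42')
print(m.group(), m.start())            # 42 7
```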
Examples:
import re

# Use re.split() to split a string into words
some_text = 'alpha, beta,,,,gama delta'
print(re.split(r'[, ]+', some_text))
# Result: ['alpha', 'beta', 'gama', 'delta']

# re.findall() returns all occurrences of the pattern as a list
pat = re.compile(r'[a-zA-Z]+')
text = '"Hm... Err -- are you sure?" he said, sounding insecure'
print(re.findall(pat, text))
# Result: ['Hm', 'Err', 'are', 'you', 'sure', 'he', 'said', 'sounding', 'insecure']

# re.findall() returns only the punctuation in the string as a list
pat = re.compile(r'[?\-.,"]+')  # note that - is escaped
text = '"Hm... Err -- are you sure?" he said, sounding insecure'
print(re.findall(pat, text))
# Result: ['"', '...', '--', '?"', ',']

# re.sub() replaces every match of the pattern
pat = re.compile(r'\{name\}')  # note that { and } are escaped
text = 'Dear {name}'
print(re.sub(pat, 'Mr. Tange', text))
# Result: Dear Mr. Tange