import re -- all regular expression functions live in the re module
re.sub() -- string substitution
>>> import re
>>> s = 'BROAD ROAD'
>>> re.sub('ROAD$', 'RD.', s)
'BROAD RD.'
>>> s = 'BROAD'
>>> re.sub('\\bROAD$', 'RD.', s)
'BROAD'
>>> s = 'BROAD ROAD APT. 3'
>>> re.sub(r'\bROAD$', 'RD.', s)
'BROAD ROAD APT. 3'
>>> re.sub(r'\bROAD\b', 'RD.', s)
'BROAD RD. APT. 3'
Note:
1) \b matches a word boundary. In '\bROAD' there must be a boundary immediately to the left of ROAD, so the ROAD inside BROAD is not matched.
2) The r prefix in front of the pattern makes it a raw string: Python does not treat backslashes in it as escape characters. E.g., '\t' is a tab, while r'\t' is the character '\' immediately followed by the character 't'.
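The two notes above can be checked with a short sketch. The helper name abbreviate_road is made up for illustration; only re.sub() comes from the re module.

```python
import re

def abbreviate_road(address):
    """Replace the standalone word ROAD with RD. (hypothetical helper)."""
    # The raw string keeps the backslashes, so the regex engine sees \b
    # (word boundary) on both sides of ROAD.
    return re.sub(r'\bROAD\b', 'RD.', address)

print(abbreviate_road('BROAD ROAD'))         # BROAD RD.
print(abbreviate_road('BROAD ROAD APT. 3'))  # BROAD RD. APT. 3

# Without the r prefix, Python itself interprets the escapes first:
print(len('\t'), len(r'\t'))   # 1 2
print('\b' == '\x08')          # True: a plain '\b' is a backspace character
```

Because a plain '\b' is a backspace, a pattern written without the r prefix has to double the backslash ('\\bROAD') to reach the regex engine intact.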
re.search() -- searches a string for a match of a regular expression; returns a match object if the match succeeds, or None if there is no match
>>> import re
>>> pattern = '^M?M?M?$'
>>> re.search(pattern, 'M')
<_sre.SRE_Match object; span=(0, 1), match='M'>
>>> re.search(pattern, 'MM')
<_sre.SRE_Match object; span=(0, 2), match='MM'>
>>> re.search(pattern, 'MMM')
<_sre.SRE_Match object; span=(0, 3), match='MMM'>
>>> re.search(pattern, 'MMMMM')
>>> re.search(pattern, '')
<_sre.SRE_Match object; span=(0, 0), match=''>
Note:
1) ? -- makes the preceding character optional (it matches 0 or 1 times)
2) M{0,3} -- matches the character M 0 to 3 times, so '^M{0,3}$' is equivalent to '^M?M?M?$'
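As a quick check of the two notes above, the following sketch compares both spellings of the pattern on a few sample strings. The variable names are chosen here for illustration.

```python
import re

optional = re.compile(r'^M?M?M?$')  # three optional Ms in a row
counted = re.compile(r'^M{0,3}$')   # the same idea with a repeat count

samples = ['', 'M', 'MM', 'MMM', 'MMMM']
# For each sample, record whether each pattern matched.
results = [(bool(optional.search(s)), bool(counted.search(s))) for s in samples]

for s, (a, b) in zip(samples, results):
    print(repr(s), a, b)
# Both patterns accept '', 'M', 'MM', 'MMM' and reject 'MMMM'.
```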
Verbose regular expressions:
1. Whitespace is ignored. Spaces, tabs, and carriage returns in the pattern do not match anything. To match one of these characters, escape it with a backslash '\'.
2. Comments (from # to the end of the line) are ignored.
3. To use a verbose regular expression, you need to pass the re.VERBOSE flag.
>>> pattern = '''
...     ^                   # beginning of string
...     M{0,3}              # thousands - 0 to 3 Ms
...     (CM|CD|D?C{0,3})    # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 Cs),
...                         #            or 500-800 (D, followed by 0 to 3 Cs)
...     (XC|XL|L?X{0,3})    # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 Xs),
...                         #        or 50-80 (L, followed by 0 to 3 Xs)
...     (IX|IV|V?I{0,3})    # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 Is),
...                         #        or 5-8 (V, followed by 0 to 3 Is)
...     $                   # end of string
...     '''
>>> re.search(pattern, 'M', re.VERBOSE)
<_sre.SRE_Match object; span=(0, 1), match='M'>
>>> re.search(pattern, 'MCMLXXXIX', re.VERBOSE)
<_sre.SRE_Match object; span=(0, 9), match='MCMLXXXIX'>
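The whitespace rule is easy to get wrong, so here is a minimal sketch of it. The pattern below (capital letters, one space, digits) is a made-up example and not part of the Roman numeral case.

```python
import re

# Hypothetical pattern for illustration: a word, one literal space, a number.
pattern = re.compile(r'''
    ^ [A-Z]+    # one or more capital letters
    \           # an escaped space: matched literally even in verbose mode
    \d+ $       # one or more digits, then end of string
''', re.VERBOSE)

print(bool(pattern.search('APT 3')))  # True: the escaped space must be present
print(bool(pattern.search('APT3')))   # False: no space between word and number
```

All the other spaces and line breaks in the pattern are only layout. Only the space that follows the backslash takes part in the match.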
Case study: matching phone numbers
\d -- matches any digit (0-9)
\D -- matches any character except a digit
+ -- matches one or more times
* -- matches 0 or more times
>>> phonePattern = re.compile(r'(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')
>>> phonePattern.search('work 1-(800) 555.1212 #1234').groups()
('800', '555', '1212', '1234')
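To see how the \D* separators let one pattern cover several notations, here is a small sketch that reuses the pattern above. The wrapper parse_phone and the sample numbers are assumptions added for illustration.

```python
import re

phone_pattern = re.compile(r'(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')

def parse_phone(text):
    """Return (area code, trunk, rest, extension) or None (hypothetical helper)."""
    match = phone_pattern.search(text)
    # search() returns None when nothing matches, so guard before groups().
    return match.groups() if match else None

print(parse_phone('800-555-1212'))             # ('800', '555', '1212', '')
print(parse_phone('800.555.1212 ext. 1234'))   # ('800', '555', '1212', '1234')
print(parse_phone('555-1212'))                 # None: too few digits
```

The last group uses \d*, so the extension is optional and comes back as an empty string when it is missing.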
Regular expression symbols and their meanings:
$ -- end of string
^ -- start of string
x? -- matches the character x 0 or 1 times
x+ -- matches the character x one or more times
x* -- matches the character x 0 or more times
x{m,n} -- matches the character x between m and n times
x{n} -- matches the character x exactly n times
(a|b|c) -- matches a or b or c
(x) -- a group: the text it matches is remembered and can be retrieved with the groups() method of the match object returned by re.search()
\d -- matches any digit (0-9)
\D -- matches any character except a digit
\b -- matches a word boundary
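Several of the symbols in this table can be combined in one pattern. The sketch below uses a made-up time format (two digits, colon, two digits, am or pm) purely as an illustration.

```python
import re

# Hypothetical format: exactly two digits, a colon, two digits, then am or pm.
pattern = re.compile(r'^(\d{2}):(\d{2})(am|pm)$')

m = pattern.search('09:30pm')
print(m.groups())                 # ('09', '30', 'pm')

# x{n} is exact: a single-digit hour does not satisfy \d{2}.
print(pattern.search('9:30pm'))   # None
```

Each pair of parentheses contributes one entry to the tuple returned by groups(), in the order the groups open in the pattern.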
Python Learning Note 4-Regular expressions