**Reprinted from: http://www.cnblogs.com/alex3714/articles/5161349.html**
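Common methods of the re module
- re.match(pattern, string): Matches from the beginning of the string by default, so a leading '^' is redundant in this mode.
- re.search(): Scans the string and returns the first match found anywhere in it.
- re.findall(): Returns all matches as the elements of a list; the result is a plain list, so there is no group() method.
- re.split(): Splits the string wherever the pattern matches.
- re.sub(): Finds matches and replaces them.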
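The differences between these methods are easiest to see side by side. The following is a minimal sketch; the sample string and the variable names are illustrative assumptions, not taken from the original article.

```python
import re

# Illustrative sample text (an assumption for this sketch).
text = "alex 22, jack 31"

# match() only succeeds if the pattern matches at position 0.
at_start = re.match(r"\d+", text)              # None: the text starts with a letter
first_word = re.match(r"[a-z]+", text).group() # 'alex'

# search() scans forward and returns the first match anywhere.
first_number = re.search(r"\d+", text).group() # '22'

# findall() returns every non-overlapping match as a list of strings.
all_numbers = re.findall(r"\d+", text)         # ['22', '31']

# split() cuts the string wherever the pattern matches.
parts = re.split(r",\s*", text)                # ['alex 22', 'jack 31']

# sub() replaces each match with the replacement string.
masked = re.sub(r"\d+", "#", text)             # 'alex #, jack #'

print(at_start, first_word, first_number, all_numbers, parts, masked)
```

Note that match() and search() return a match object (or None), while findall() and split() return lists and sub() returns a new string.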
Common regular expression syntax:
- '.'  Matches any character except \n by default; if the DOTALL flag is specified, it matches any character, including line breaks.
- '^'  Matches the beginning of the string; with the MULTILINE flag it also matches at the start of each line, so re.search(r"^a", "\nabc\neee", flags=re.MULTILINE) matches.
- '$'  Matches the end of the string; with the MULTILINE flag it also matches at the end of each line, so re.search("foo$", "bfoo\nsdfsf", flags=re.MULTILINE).group() matches as well.
- '*'  Matches the character before the * zero or more times; re.findall("ab*", "cabb3abcbbac") gives ['abb', 'ab', 'a'].
- '+'  Matches the previous character one or more times; re.findall("ab+", "ab+cd+abb+bba") gives ['ab', 'abb'].
- '?'  Matches the previous character one or zero times.
- '{m}'  Matches the previous character exactly m times.
- '{n,m}'  Matches the previous character n to m times; re.findall("ab{1,3}", "abb abc abbcbbb") gives ['abb', 'ab', 'abb'].
- '|'  Matches the expression on the left or on the right of the |; re.search("abc|ABC", "ABCBabcCD").group() gives 'ABC'.
- '(...)'  Group matching; re.search("(abc){2}a(123|456)c", "abcabca456c").group() gives 'abcabca456c'.
- '\A'  Matches only at the beginning of the string; re.search(r"\Aabc", "alexabc") finds no match.
- '\Z'  Matches at the end of the string; similar to '$', except that '$' also matches just before a trailing newline.
- '\d'  Matches a digit, 0-9.
- '\D'  Matches a non-digit.
- '\w'  Matches a word character, [A-Za-z0-9_].
- '\W'  Matches a non-word character, anything outside [A-Za-z0-9_].
- '\s'  Matches whitespace characters such as \t, \n and \r; re.search(r"\s+", "ab\tc1\n3").group() gives '\t'.
- '(?P<name>...)'  Named group matching; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})", "371481199306143242").groupdict("city") gives {'province': '3714', 'city': '81', 'birthday': '1993'}.
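Two of the entries above are worth trying out directly: named groups, and the difference between '$' and '\Z'. The sketch below reuses the ID-number example from the list; the variable names are illustrative assumptions.

```python
import re

id_number = "371481199306143242"
pattern = r"(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})"

m = re.search(pattern, id_number)

# groupdict() maps each group name to the text it captured.
fields = m.groupdict()
print(fields)            # {'province': '3714', 'city': '81', 'birthday': '1993'}

# A single named group can also be read by name.
city = m.group("city")
print(city)              # 81

# '$' also matches just before a trailing newline; '\Z' does not.
dollar_matches = re.search(r"foo$", "foo\n") is not None
z_matches = re.search(r"foo\Z", "foo\n") is not None
print(dollar_matches, z_matches)   # True False
```

The argument passed to groupdict() in the list above is only a default value for groups that did not take part in the match, so it does not change the result here.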
The backslash plague
As in most programming languages, regular expressions use "\" as the escape character, which can lead to a plague of backslashes. To match a literal "\" in text, a regular expression written as an ordinary string literal needs four backslashes, "\\\\": the programming language first turns each pair into a single backslash, leaving two, and the regular expression engine then reads those two as one literal backslash. Python's raw strings solve this problem neatly: the regular expression in this example can be written as r"\\". Similarly, "\\d", which matches a digit, can be written as r"\d". With raw strings you no longer have to worry about dropping a backslash, and the expressions read more intuitively.
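The two levels of escaping can be checked in a few lines. This is a small sketch; the sample path and the variable names are assumptions made for illustration.

```python
import re

# The text to search contains ONE backslash: C:\temp
path = "C:\\temp"

# Ordinary string literal: four backslashes in source become two characters.
plain_pattern = "\\\\"
# Raw string literal: two backslashes in source stay two characters.
raw_pattern = r"\\"

# Both spellings produce the same two-character pattern.
same_pattern = plain_pattern == raw_pattern
print(same_pattern, len(raw_pattern))      # True 2

# The regex engine reads those two characters as one literal backslash.
found = re.findall(raw_pattern, path)
print(len(found), found[0] == "\\")        # 1 True

# The same idea for digits: r"\d" instead of "\\d".
digits = re.findall(r"\d+", "room 42")
print(digits)                              # ['42']
```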
Common matching modes
- M (MULTILINE): Multi-line mode; changes the behavior of '^' and '$'.
- S (DOTALL): Dot-matches-all mode; changes the behavior of '.'.
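The effect of the two flags is easy to see on a two-line string. The following is a minimal sketch with an assumed sample string; re.M and re.S are the short aliases of re.MULTILINE and re.DOTALL.

```python
import re

text = "first\nsecond"

# Without flags, '^' anchors only at the very start of the string.
no_flag = re.findall(r"^\w+", text)
# With MULTILINE, '^' also anchors at the start of each line.
multi_line = re.findall(r"^\w+", text, flags=re.M)
print(no_flag, multi_line)     # ['first'] ['first', 'second']

# Without flags, '.' does not match the newline between the two words.
dot_default = re.search(r"first.second", text)
# With DOTALL, '.' matches any character, including the newline.
dot_all = re.search(r"first.second", text, flags=re.S)
print(dot_default, dot_all.group() == text)   # None True
```

Several flags can be combined with the | operator, for example flags=re.M | re.S.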