The Python module re

Source: Internet
Author: User

Common Regular Expression symbols

'.'  Matches any character except \n by default; with the DOTALL flag it matches any character, including a newline.
'^'  Matches the start of the string; with the MULTILINE flag it also matches at the start of each line, so re.search(r"^a", "\nabc\neee", flags=re.MULTILINE) finds a match.
'$'  Matches the end of the string (or just before a trailing newline); with MULTILINE it also matches at the end of each line, so re.search("foo$", "bfoo\nsdfsf", flags=re.MULTILINE).group() returns 'foo'.
'*'  Matches the preceding character 0 or more times. re.findall("ab*", "cabb3abcbbac") returns ['abb', 'ab', 'a'].
'+'  Matches the preceding character 1 or more times. re.findall("ab+", "ab+cd+abb+bba") returns ['ab', 'abb'].
'?'  Matches the preceding character 0 or 1 times.
'{m}'  Matches the preceding character exactly m times.
'{n,m}'  Matches the preceding character n to m times. re.findall("ab{1,3}", "abb abc abbcbbb") returns ['abb', 'ab', 'abb'].
'|'  Matches the expression on the left or on the right of the |. re.search("abc|ABC", "ABCBabcCD").group() returns 'ABC'.
'(...)'  Grouping. re.search("(abc){2}a(123|456)c", "abcabca456c").group() returns 'abcabca456c'.
'\A'  Matches only at the start of the string. re.search(r"\Aabc", "alexabc") finds no match.
'\Z'  Matches only at the very end of the string. Unlike '$', it is not affected by MULTILINE and does not match before a trailing newline.
'\d'  Matches a digit, 0-9.
'\D'  Matches a non-digit.
'\w'  Matches a word character: [A-Za-z0-9_] (plus other Unicode word characters for str patterns by default).
'\W'  Matches a non-word character.
'\s'  Matches whitespace such as a space, \t, \n, \r. re.search(r"\s+", "ab\tc1\n3").group() returns '\t'.
'(?P<name>...)'  Named group. re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})", "371481199306143242").groupdict() returns {'province': '3714', 'city': '81', 'birthday': '1993'}.
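The quantifier and group entries can be checked directly in an interpreter. The following is a minimal sketch that reuses the sample strings from the list; the variable names are only illustrative, and the last two lines assume a test string "foo\n" that is not in the original.

```python
import re

# Quantifiers apply to the single character (or group) just before them.
star = re.findall(r"ab*", "cabb3abcbbac")        # 'a' then zero or more 'b'
plus = re.findall(r"ab+", "ab+cd+abb+bba")       # 'a' then one or more 'b'
bounded = re.findall(r"ab{1,3}", "abb abc abbcbbb")

# Named groups: groupdict() maps each group name to the text it captured.
id_pattern = r"(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})"
fields = re.search(id_pattern, "371481199306143242").groupdict()

# '$' also matches just before a trailing newline; '\Z' does not.
dollar = re.search(r"foo$", "foo\n")
strict_end = re.search(r"foo\Z", "foo\n")

print(star, plus, bounded)
print(fields)
print(dollar is not None, strict_end is None)
```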

The most commonly used match syntax

1 re.match    matches only at the beginning of the string
2 re.search   finds the first match anywhere in the string
3 re.findall  returns all matches as the elements of a list
4 re.split    splits the string into a list, using the matched text as the separator
5 re.sub      finds matches and replaces them
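A short sketch of the five functions side by side may make the differences clearer. The sample string and variable names here are assumptions chosen for illustration, not taken from the original.

```python
import re

text = "alex22jack33rain44"  # hypothetical sample input

at_start = re.match(r"[a-z]+", text)   # anchored at position 0
no_match = re.match(r"\d+", text)      # the string does not start with digits
anywhere = re.search(r"\d+", text)     # first match at any position
all_numbers = re.findall(r"\d+", text) # every non-overlapping match
pieces = re.split(r"\d+", text)        # matched text acts as the separator
masked = re.sub(r"\d+", "#", text)     # each match replaced by '#'

print(at_start.group(), no_match, anywhere.group())
print(all_numbers, pieces, masked)
```

Note that re.match and re.search return a match object (or None), so the matched text is read with .group(), while re.findall, re.split and re.sub return plain lists or strings.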

The backslash plague
As in most programming languages, "\" is the escape character in regular expressions, which leads to a pile-up of backslashes. To match a literal "\" in the text, the regular expression written as an ordinary string literal needs 4 backslashes, "\\\\": the string literal escapes each pair into one backslash, leaving two backslashes, and the regular expression engine then reads those two as one literal backslash. Python's raw strings solve this problem neatly: the regular expression in this example can be written as r"\\". Similarly, "\\d", which matches a digit, can be written as r"\d". With raw strings you no longer have to worry about a missing backslash, and the expression is easier to read.
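The equivalence of the two spellings can be verified with a small sketch. The sample string is an assumption; it contains exactly one backslash character.

```python
import re

text = "sdyfjd\\c"  # the 8 characters s d y f j d \ c

# Ordinary literal: 4 backslashes in source -> 2 in the pattern -> 1 literal backslash.
plain = re.findall("\\\\", text)
# Raw literal: 2 backslashes in source reach the regex engine unchanged.
raw = re.findall(r"\\", text)

same_pattern = ("\\\\" == r"\\")
same_digit_pattern = ("\\d" == r"\d")

print(len(text), plain, raw, same_pattern, same_digit_pattern)
```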

A few matching modes worth knowing

re.I (re.IGNORECASE): ignore case (the full name is given in parentheses, likewise below)
re.M (re.MULTILINE): multiline mode; changes the behavior of '^' and '$' (see the symbol list above)
re.S (re.DOTALL): dot-matches-all mode; changes the behavior of '.'
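The effect of each flag is easiest to see by running the same pattern with and without it. This is a minimal sketch; the sample strings and variable names are assumptions made for illustration.

```python
import re

# re.I: case is ignored when comparing letters.
ignore_case = re.findall(r"alex", "Alex ALEX alex", flags=re.I)

# re.M: '^' matches at the start of every line, not only the start of the string.
first_line_only = re.findall(r"^\w+", "abc\neee")
every_line = re.findall(r"^\w+", "abc\neee", flags=re.M)

# re.S: '.' also matches a newline.
dot_default = re.search(r"a.b", "a\nb")
dot_all = re.search(r"a.b", "a\nb", flags=re.S)

# Flags are combined with the bitwise OR operator.
combined = re.findall(r"^a.", "A1\na2", flags=re.I | re.M)

print(ignore_case, first_line_only, every_line)
print(dot_default, dot_all.group() == "a\nb", combined)
```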
import re

# Plain string methods, for comparison
s = 'hello world'
print(s.find('ll'))
ret = s.replace('ll', 'xx')
print(ret)
print(s.split('w'))

ret = re.findall(r"w\w{2}l", 'hello world')
print(ret)
ret = re.findall("alex", 'hiudfgiusiohalexlkshd')
print(ret)

# .  wildcard: matches any character except a newline
ret = re.findall("w..l", 'hello world')
print(ret)

# ^  caret: matches only at the start
ret = re.findall('^h...o', 'hjasdflhello')
print(ret)

# $  matches only at the end
ret = re.findall('h...o$', 'hjasdflhello')
print(ret)

# *  repeats the preceding item, range [0, +oo]
ret = re.findall('a.*li', 'husihfiosalexlihuidh')
print(ret)

# +  range [1, +oo]
ret = re.findall('a.+li', 'husihfiosalexlihuidh')
print(ret)

# ?  range [0, 1]
ret = re.findall('a.?li', 'husihfiosalexlihuidh')
print(ret)

# {}  repeat a chosen number of times; {1,3} matches one to three times
ret = re.findall('a{5}b', 'aaaaab')
print(ret)
# * is equivalent to {0,}
# + is equivalent to {1,}
# ? is equivalent to {0,1}

# Character sets
# []  an "or" relationship: exactly one of the listed characters is chosen.
# No separator is needed; a comma inside [] is matched literally.
ret = re.findall('a[cd]x', 'acx')
print(ret)

# Inside [] metacharacters lose their special meaning (except \ ^ -)
ret = re.findall('a[c*]x', 'a*x')
print(ret)

# ^ placed first inside []: negation
ret = re.findall('[^45]', 'ysdgufi4x245df')
print(ret)

# \  a backslash followed by a metacharacter removes its special meaning;
#    a backslash followed by certain ordinary characters gives them a special meaning
# \d  matches any decimal digit; equivalent to [0-9]
# \D  matches any non-digit character; equivalent to [^0-9]
# \s  matches any whitespace character; equivalent to [ \t\n\r\f\v]
# \S  matches any non-whitespace character; equivalent to [^ \t\n\r\f\v]
# \w  matches any alphanumeric character or underscore; equivalent to [a-zA-Z0-9_]
# \W  matches any other character; equivalent to [^a-zA-Z0-9_]
# \b  matches a word boundary: the position between a word character and a non-word character
print(re.findall(r'\d{10}', '9074892365982475896245692835'))
print(re.findall(r'\sasd', 'fak asd'))
print(re.findall(r'\w', 'fak asd'))
print(re.findall(r'I\b', 'I am a LIST'))

# search() returns the first result that satisfies the pattern
ret = re.search('sb', 'shukdsbjfhsb')
print(ret.group())

ret = re.findall(r"\\", "sdyfjd\\c")
print(ret)

# ()  |  grouping and alternation
ret = re.search('(as)+', 'sdfghjasas').group()
print(ret)
print(re.search('(as)|3', 'as').group())

# Methods of regular expressions
# 1 findall()  returns all results in a list
# 2 search()   returns the first match object; call group() on it to get the text
# 3 match()    matches only at the start of the string and returns a match object; call group() on it
# 4 split()    with the pattern '[ab]', splits on every a and every b
# 5 sub()      takes three arguments: the pattern to find, the replacement, and the string to process
# 6 compile()  creates a regular expression object from a pattern: obj = re.compile(...), then obj.split(...)
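The notes on character sets and compile() can be combined in one small sketch. The sample strings and the name separator are assumptions for illustration. It also shows that a comma inside [] is an ordinary member of the set, not a separator.

```python
import re

# compile() builds a reusable pattern object; its methods mirror the module-level functions.
separator = re.compile(r"[ab]")
parts = separator.split("1a2b3")       # split on every 'a' and every 'b'
replaced = separator.sub("-", "1a2b3") # replace every 'a' and 'b'
found = separator.findall("xaybz")

# A comma written inside [] is matched literally.
with_comma = re.findall(r"a[c,d]x", "a,x acx")
without_comma = re.findall(r"a[cd]x", "a,x acx")

print(parts, replaced, found)
print(with_comma, without_comma)
```

Compiling is mainly a convenience when the same pattern is used many times; the module-level functions accept the same pattern string directly.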