Common regular expressions for Python strings

Last Update:2018-01-10 Source: Internet

Author: User

Tags lua


First, the earlier article covered simple string operations with the re module; combined with regular expressions, the re module becomes far more powerful.

Start with the common regular expression symbols:

Then review the basics of the re module:

import re
text = 'c++ python2 python3 perl ruby lua java javascript php4 php5 c'
# match, search, findall, split, sub
re.match(r'java', text)  # matches only at the start of the string; text does not start with 'java', so this returns None
re.search(r'java', text)  # scans the whole string and returns the first match
# <_sre.SRE_Match object; span=(34, 38), match='java'>
re.match(r'c\++', text), re.match(r'c\+\+', text)  # same effect
# both return <_sre.SRE_Match object; span=(0, 3), match='c++'>
re.findall(r'python', text)  # returns every occurrence of 'python'
# ['python', 'python']
re.split(r'perl', text)  # splits the string at the pattern
# ['c++ python2 python3 ', ' ruby lua java javascript php4 php5 c']
re.sub(r'ruby', 'fortran', text)  # replaces the pattern
# 'c++ python2 python3 perl fortran lua java javascript php4 php5 c'
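
The difference between re.match and re.search is easy to miss, so here is a minimal sketch on the same sample string. The variable names at_start and anywhere are illustrative only and do not come from the original article.

```python
import re

text = 'c++ python2 python3 perl ruby lua java javascript php4 php5 c'

# re.match only tries the pattern at position 0, so 'java' is not found.
at_start = re.match(r'java', text)

# re.search scans forward and stops at the first occurrence.
anywhere = re.search(r'java', text)

print(at_start)          # None
print(anywhere.span())   # (34, 38)
print(anywhere.group())  # java
```

Always check the result for None before calling .group() on it, because a failed match returns None rather than raising an error.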

Second, the commonly used regular expression syntax

text = 'c++ python2 python3 perl ruby lua java javascript php4 php5 c'

1  ^  start: matches at the beginning of the string
Example: re.findall(r'^c..', text)
Output: # ['c++']


2  .  any character: matches every character except the line break \n
re.findall(r'^c', text)
# ['c']
re.findall(r'^c.', text)
# ['c+']
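
To make the line-break exception concrete, here is a small sketch. The two-line sample string is an assumption made up for illustration; it compares the default behaviour of . with the re.DOTALL flag from the standard re module.

```python
import re

two_lines = 'c++\npython'  # hypothetical two-line sample

# By default '.' stops at the line break.
default = re.findall(r'c.+', two_lines)

# With re.DOTALL, '.' also matches '\n'.
dotall = re.findall(r'c.+', two_lines, re.DOTALL)

print(default)  # ['c++']
print(dotall)   # ['c++\npython']
```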


3  +  1 to inf: matches one or more repetitions of the preceding item
re.findall(r'c+', text)
# ['c', 'c', 'c']
re.findall(r'c\++', text)
# ['c++']

4  $  end: matches at the end of the string
re.findall(r'c$', text)
# ['c']
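
By default ^ and $ anchor to the whole string. The sketch below, on a made-up multi-line sample, suggests how the re.MULTILINE flag changes that so both anchors apply to every line.

```python
import re

lines = 'c++\njava\nc'  # hypothetical multi-line sample

# Without a flag the pattern must span the entire string, so nothing matches.
whole = re.findall(r'^c.*$', lines)

# With re.MULTILINE, ^ and $ match at the start and end of each line.
per_line = re.findall(r'^c.*$', lines, re.MULTILINE)

print(whole)     # []
print(per_line)  # ['c++', 'c']
```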


5  []  character set: matches any one of the characters inside the brackets
re.findall(r'p[a-zA-Z]+', text)  # p followed by letters a-z or A-Z; + is the same as {1,}, one to infinity
# ['python', 'python', 'perl', 'pt', 'php', 'php']

6  *  0 to inf: matches zero or more repetitions of the preceding item
re.findall(r'p[a-zA-Z]*', text)
# ['python', 'python', 'perl', 'pt', 'php', 'php']

7  ?  0 to 1: matches zero or one repetition of the preceding item
re.findall(r'p[a-zA-Z]?', text)
# ['py', 'py', 'pe', 'pt', 'ph', 'p', 'ph', 'p']
re.findall(r'p[a-zA-Z0-9]{3,}', text)  # {3,} means three or more repetitions
# ['python2', 'python3', 'perl', 'php4', 'php5']

re.findall(r'c[a-zA-Z]*', text)
# ['c', 'cript', 'c']
re.findall(r'c[^a-zA-Z]*', text)  # inside brackets a leading ^ negates the set, so this matches non-letter characters (including the space)
# ['c++ ', 'c', 'c']
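
A short sketch of two character-set details that follow from the examples above: most metacharacters lose their special meaning inside brackets, and a negated set matches anything not listed. The variable names are illustrative.

```python
import re

text = 'c++ python2 python3 perl ruby lua java javascript php4 php5 c'

# Escaping '+' and putting it in a set are two ways to match it literally.
escaped = re.findall(r'c\++', text)
in_set = re.findall(r'c[+]+', text)

# A negated set: runs of characters that are not letters, digits, or spaces.
symbols = re.findall(r'[^a-zA-Z0-9 ]+', text)

print(escaped)  # ['c++']
print(in_set)   # ['c++']
print(symbols)  # ['++']
```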

8  |  or: alternation; compare how it differs from []
re.findall(r'[pj][a-zA-Z]+', text)  # + is {1,inf}
# ['python', 'python', 'perl', 'java', 'javascript', 'php', 'php']
Rewrite the pattern above with |:
re.findall(r'p|j[a-zA-Z]+', text)  # | separates everything before it from everything after it, so the pattern has to be adjusted
# ['p', 'p', 'p', 'java', 'javascript', 'p', 'p', 'p', 'p']
re.findall(r'p[a-zA-Z]+|j[a-zA-Z]+', text)  # equivalent to [pj][a-zA-Z]+ written out as two alternatives
re.findall(r'p[^0-9]+|j[a-zA-Z]+', text)  # note that a space also counts as a non-digit
# ['python', 'python', 'perl ruby lua java javascript php', 'php']
re.findall(r'p[^0-9 ]+|j[a-zA-Z]+', text)  # excluding the space as well stops each match at the word
# ['python', 'python', 'perl', 'java', 'javascript', 'php', 'php']
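
Another way to limit the scope of | is to wrap the alternatives in a group. The following is a sketch using the standard non-capturing group syntax (?:...); it also shows that a capturing group changes what re.findall returns.

```python
import re

text = 'c++ python2 python3 perl ruby lua java javascript php4 php5 c'

# (?:p|j) limits the alternation to the first letter only.
grouped = re.findall(r'(?:p|j)[a-zA-Z]+', text)
char_set = re.findall(r'[pj][a-zA-Z]+', text)

# With a capturing group, findall returns the group, not the whole match.
captured = re.findall(r'(p|j)[a-zA-Z]+', text)

print(grouped == char_set)  # True
print(captured)             # ['p', 'p', 'p', 'j', 'j', 'p', 'p']
```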

9  \w  [a-zA-Z0-9_]: matches any lowercase or uppercase letter, digit, or underscore; \W is the negation of \w

re.findall(r'p\w+', text)

# ['python2', 'python3', 'perl', 'pt', 'php4', 'php5']

10  \d  [0-9]: matches any digit; \D is the negation of \d
re.findall(r'p\w+\d', text)
re.findall(r'p\w+[0-9]', text)
# both return ['python2', 'python3', 'php4', 'php5']
re.findall(r'p\w{5,9}', text)  # p followed by 5 to 9 word characters
# ['python2', 'python3']
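
As a small sketch of \d used on its own, the code below pulls the version digits out of the sample string and then strips them with re.sub. The variable names are illustrative.

```python
import re

text = 'c++ python2 python3 perl ruby lua java javascript php4 php5 c'

# Every single digit in the string.
digits = re.findall(r'\d', text)

# Remove the digits, leaving only the language names.
names_only = re.sub(r'\d', '', text)

print(digits)      # ['2', '3', '4', '5']
print(names_only)  # c++ python python perl ruby lua java javascript php php c
```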

11  \s  [ \t\n\r\f\v]: matches any whitespace character; \S is the negation of \s
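
The article gives no example for \s, so here is a minimal sketch. The sample string with mixed whitespace is made up for illustration.

```python
import re

messy = 'c++\tpython2  perl\nruby'  # hypothetical sample with tab, spaces, newline

# Split on any run of whitespace.
tokens = re.split(r'\s+', messy)

# The same tokens, found as runs of non-whitespace.
non_space = re.findall(r'\S+', messy)

print(tokens)     # ['c++', 'python2', 'perl', 'ruby']
print(non_space)  # ['c++', 'python2', 'perl', 'ruby']
```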



12  \b  word boundary: matches the position where a word begins or ends
re.findall(r'\bp[^0-9]', text)
# ['py', 'py', 'pe', 'ph', 'ph']
re.findall(r'p[^0-9]\b', text)
# ['pt']
\B  the negation of \b
13  \A  start of input, like ^
\Z  end of input, like $
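
The sketch below illustrates these anchors on the sample string. The two-line string used for \A is an assumption added for illustration; it shows that \A, unlike ^, is not affected by re.MULTILINE.

```python
import re

text = 'c++ python2 python3 perl ruby lua java javascript php4 php5 c'

# \b on both sides keeps 'java' but rejects the 'java' inside 'javascript'.
whole_word = re.findall(r'\bjava\b', text)

# \B requires that the 'p' is NOT at a word boundary.
inner_p = re.findall(r'\Bp\w', text)

lines = 'c++\nc'  # hypothetical two-line sample
caret = re.findall(r'^c', lines, re.MULTILINE)
start_only = re.findall(r'\Ac', lines, re.MULTILINE)

print(whole_word)  # ['java']
print(inner_p)     # ['pt', 'p4', 'p5']
print(caret)       # ['c', 'c']
print(start_only)  # ['c']
```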


14  Greedy and non-greedy matching

*   0 to inf, greedy: matches as much as possible
*?  0 to inf, non-greedy: matches as little as possible
+?  1 to inf, non-greedy: matches as little as possible
re.findall(r'p[a-z]*', text)
# ['python', 'python', 'perl', 'pt', 'php', 'php']
re.findall(r'p[a-z]*?', text)
# ['p', 'p', 'p', 'p', 'p', 'p', 'p', 'p']

re.findall(r'p[a-z]+?\b', text)
# ['perl', 'pt']
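
Greedy versus non-greedy matching matters most when a pattern has a closing delimiter. The following sketch uses a made-up string with tags; it is only an illustration, not a recommended way to parse HTML.

```python
import re

html = '<b>python</b> and <b>perl</b>'  # hypothetical sample

# Greedy: .* runs to the LAST closing tag.
greedy = re.findall(r'<b>.*</b>', html)

# Non-greedy: .*? stops at the FIRST closing tag it can reach.
lazy = re.findall(r'<b>.*?</b>', html)

print(greedy)  # ['<b>python</b> and <b>perl</b>']
print(lazy)    # ['<b>python</b>', '<b>perl</b>']
```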
 
15  Groups

(?P<name>pattern)
a = re.search(r'(p[a-zA-Z]+)([0-9])', 'python2', re.X)  # re.X is optional here; it is the verbose flag, which allows comments inside the pattern
a.group(1), a.group(2)
# 'python'
# '2'

a = re.search(r'(?P<name>p[a-zA-Z]+)(?P<version>[0-9])', 'python2')  # named groups can be returned as a dictionary
a.group('name'), a.group('version')
a.groupdict()
# {'name': 'python', 'version': '2'}
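
A brief sketch of a few other ways to read groups from a match object, plus a named backreference in re.sub. The replacement format 'name-version' and the input 'python2 php4' are assumptions chosen for illustration.

```python
import re

pattern = r'(?P<name>p[a-zA-Z]+)(?P<version>[0-9])'
m = re.search(pattern, 'python2')

whole = m.group(0)               # the entire match
both = m.groups()                # all groups as a tuple
version_span = m.span('version') # position of the named group

# \g<name> refers to a named group inside the replacement string.
dashed = re.sub(pattern, r'\g<name>-\g<version>', 'python2 php4')

print(whole)         # python2
print(both)          # ('python', '2')
print(version_span)  # (6, 7)
print(dashed)        # python-2 php-4
```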

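16  Putting it together

pattern = re.compile(r'(?P<name>p[a-zA-Z]+)(?P<version>[0-9])')  # compile the pattern once
results = pattern.search('python2')  # apply it to a string
print(results.groupdict())
# {'name': 'python', 'version': '2'}
results = pattern.search('python3')
print(results.groupdict())
# {'name': 'python', 'version': '3'}
results = pattern.search('php4')
print(results.groupdict())
# {'name': 'php', 'version': '4'}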

17  Looping over words and printing dictionaries

text = 'c++ python2 python3 perl ruby lua java javascript php4 php5 c'

pattern = re.compile(r'(?P<name>p[a-zA-Z]+)(?P<version>[0-9])')  # the pattern
for t in text.split(' '):
    results = pattern.search(t)
    if results:
        print(results.groupdict())
# {'name': 'python', 'version': '2'}
# {'name': 'python', 'version': '3'}
# {'name': 'php', 'version': '4'}
# {'name': 'php', 'version': '5'}
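
The loop above splits the text first and searches each word. As an alternative sketch, pattern.finditer can scan the whole string directly and yields one match object per hit; the list name found is illustrative.

```python
import re

text = 'c++ python2 python3 perl ruby lua java javascript php4 php5 c'
pattern = re.compile(r'(?P<name>p[a-zA-Z]+)(?P<version>[0-9])')

# finditer walks the string left to right without splitting it first.
found = [m.groupdict() for m in pattern.finditer(text)]

for item in found:
    print(item)
# {'name': 'python', 'version': '2'}
# {'name': 'python', 'version': '3'}
# {'name': 'php', 'version': '4'}
# {'name': 'php', 'version': '5'}
```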

18  The re.X (verbose) compile flag
a = re.compile(r"""\d +  # integer part
                   \.    # decimal point
                   \d *  # fractional digits
                """, re.X)  # can be collapsed into a single line, as below
b = re.compile(r"\d+\.\d*")
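
To check that the verbose and one-line forms really behave the same, here is a small sketch. The sample sentence is made up for illustration.

```python
import re

# Verbose form: whitespace and '#' comments inside the pattern are ignored.
verbose = re.compile(r"""
    \d+   # integer part
    \.    # decimal point
    \d*   # fractional digits (may be empty)
""", re.X)

compact = re.compile(r"\d+\.\d*")

sample = 'pi is 3.14 and e is 2.718, version 10.'  # hypothetical sample

print(verbose.findall(sample))  # ['3.14', '2.718', '10.']
print(verbose.findall(sample) == compact.findall(sample))  # True
```

Because verbose mode ignores whitespace in the pattern, a literal space has to be written as an escaped space or inside a character set such as [ ].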


