Python expert path [5]: Python regular expressions

Source: Internet
Author: User

The following lists the regular expression metacharacters and syntax that Python supports:

Character .: matches any single character (except a newline by default)

import re
st = 'python'
result = re.findall('p.t', st)
print(result)

Character ^: matches at the start of the string

import re
st = 'python'
result = re.findall('^py', st)
print(result)

Character $: matches at the end of the string

import re
st = 'python'
result = re.findall('n$', st)
print(result)

Character *: matches the preceding character any number of times, including zero

import re
st = 'I looooooove python'
# The character o may be absent or repeated any number of times; both cases match
result = re.findall('lo*ve', st)
print(result)

Character +: matches the preceding character one or more times

import re
st = 'I looooooove python'
# No match if the character o is absent
result = re.findall('lo+ve', st)
print(result)

Character ?: matches the preceding character zero times or once

import re
st = 'I love python'
# Also matches when the character o is absent
result = re.findall('lo?ve', st)
print(result)

{m}: matches the preceding character exactly m times

import re
st = 'I loooove python'
# Match exactly 3 o characters
result = re.findall('o{3}', st)
print(result)

{m,n}: matches the preceding character between m and n times

import re
st = 'I loooove python'
result = re.findall('lo{1,4}ve', st)
print(result)
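
The quantifiers above can be compared side by side. The following is a minimal sketch: the sample words and the helper name `matches` are made up for illustration, and `re.fullmatch` is used so that each pattern has to cover the whole word.

```python
import re

# Hypothetical sample words containing 0, 1, 4 and 6 "o" characters.
words = ['l' + 'o' * n + 've' for n in (0, 1, 4, 6)]

def matches(pattern):
    """Return the sample words that the pattern matches in full."""
    return [w for w in words if re.fullmatch(pattern, w)]

star = matches(r'lo*ve')        # zero or more o
plus = matches(r'lo+ve')        # one or more o
optional = matches(r'lo?ve')    # zero or one o
exact = matches(r'lo{4}ve')     # exactly four o
ranged = matches(r'lo{1,4}ve')  # between one and four o

print(star, plus, optional, exact, ranged, sep='\n')
```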

[abc] or [a-c]: matches any one character listed in []

import re
st = 'I loooove python'
result = re.findall('l[0-z]*e', st)
print(result)

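[a|b]: matches character a or character b. Inside [] the | is an ordinary character, so this set also matches a literal |; [ab] is the usual way to write it.

import re
st = 'I lbve python'
result = re.findall('l[a|b]ve', st)
print(result)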
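
Because `|` has no special meaning inside a character set, `[a|b]` and `[ab]` are not quite the same. The sketch below, using a made-up sample string, shows the difference and the equivalent alternation written outside a set.

```python
import re

# Hypothetical sample text containing a, b, a literal "|" and c after the l.
text = 'lave lbve l|ve lcve'

# Inside [], "|" is an ordinary character, so "l|ve" is matched too.
with_pipe = re.findall(r'l[a|b]ve', text)

# [ab] lists only the two letters.
letters_only = re.findall(r'l[ab]ve', text)

# Outside a set, "|" means alternation; (?:...) groups without capturing.
alternation = re.findall(r'l(?:a|b)ve', text)

print(with_pipe)
print(letters_only)
print(alternation)
```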

[^0-9]: a ^ placed first inside [] negates the set, so it matches any character that is not listed; here ^ does not mean "start of string"

import re
st = 'I lb2ve python6'
result = re.findall('[^0-9]', st)
print(result)
##########################################
# ['I', ' ', 'l', 'b', 'v', 'e', ' ', 'p', 'y', 't', 'h', 'o', 'n']

\:

  • A backslash followed by a metacharacter removes the metacharacter's special meaning
  • A backslash followed by certain ordinary characters gives them a special meaning (for example, \d matches a digit)
  • A backslash followed by a number refers to the string matched by the group with that number
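
The three uses of the backslash listed above can be sketched as follows. The sample strings are invented for illustration.

```python
import re

# 1. Escaping: "\." matches a literal dot instead of "any character".
escaped = re.findall(r'3\.14', '3.14 3x14')

# 2. Special sequence: "\d" matches a digit.
digits = re.findall(r'\d+', 'abc 123 def 45')

# 3. Backreference: "\1" repeats whatever group 1 matched.
#    With one group in the pattern, findall returns that group's text.
doubled = re.findall(r'\b(\w+) \1\b', 'this is is a test test')

print(escaped, digits, doubled)
```

Raw strings (the `r'...'` prefix) keep Python itself from interpreting the backslashes before the `re` module sees them.
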
Greedy and non-greedy modes of quantifiers

Regular expressions are usually used to search text for matching strings. In Python, quantifiers are greedy by default (in a few languages they may be non-greedy by default) and always try to match as many characters as possible; non-greedy quantifiers do the opposite and match as few characters as possible. For example, searching "abbbc" with the regular expression "ab*" finds "abbb", while the non-greedy form "ab*?" finds "a".

import re
result = re.findall(r'ab*', 'abbbc')
print(result)
##########################################
# ['abbb']

import re
result = re.findall(r'ab*?', 'abbbc')  # the trailing ? turns off greedy mode
print(result)
##########################################
# ['a']
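
The difference matters most when a pattern has a closing delimiter. As a sketch with a made-up HTML-like string, a greedy `.*` runs to the last `>` while a non-greedy `.*?` stops at the first one.

```python
import re

# Hypothetical sample text with four tags.
html = '<b>bold</b> and <i>italic</i>'

# Greedy: ".*" extends to the last ">" in the string.
greedy = re.findall(r'<.*>', html)

# Non-greedy: ".*?" stops at the first ">" it reaches.
lazy = re.findall(r'<.*?>', html)

print(greedy)
print(lazy)
```
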
re.match(): matches from the start of the string

import re
origin = "hello poe bcd jet who are you 20"
r = re.match(r"h\w+", origin)
print(r.group())      # the whole match
print(r.groups())     # all captured groups
print(r.groupdict())  # all named groups
##########################################
# hello
# ()
# {}

r = re.match(r"(h)(\w+)", origin)
print(r.group())      # the whole match
print(r.groups())     # all captured groups
print(r.groupdict())  # all named groups
##########################################
# hello
# ('h', 'ello')
# {}

# ?P<n1>: stores the matched group in the dictionary under the key n1; (?P<...>) is fixed syntax
r = re.match(r"(?P<n1>h)(?P<n2>\w+)", origin)
print(r.group())      # the whole match
print(r.groups())     # all captured groups
print(r.groupdict())  # all named groups
##########################################
# hello
# ('h', 'ello')
# {'n1': 'h', 'n2': 'ello'}
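
Individual groups can also be read one at a time, by number or by name. The sketch below reuses the named-group pattern and adds a `None` check, since `re.match` returns `None` when the pattern does not fit at the start of the string.

```python
import re

origin = "hello poe bcd jet who are you 20"
r = re.match(r"(?P<n1>h)(?P<n2>\w+)", origin)

if r is not None:            # re.match returns None when nothing matches
    whole = r.group(0)       # same as r.group()
    by_index = r.group(2)    # second group; groups are counted from 1
    by_name = r.group('n2')  # the same group, looked up by name
    print(whole, by_index, by_name)
```
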
re.search(): scans the whole string and returns the first match

Similar to re.match(), except that the match may start anywhere in the string:

import re
origin = "hello poe bcd jet poe who are you 20"
# ?P<name>: stores the matched group in the dictionary under the key name; (?P<...>) is fixed syntax
r = re.search(r"p(\w+).*(?P<name>\d)$", origin)
print(r.group())      # the whole match
print(r.groups())     # all captured groups
print(r.groupdict())  # all named groups
##########################################
# poe bcd jet poe who are you 20
# ('oe', '0')
# {'name': '0'}
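
To make the contrast with re.match() concrete, the sketch below runs the same pattern through both functions. The variable names `at_start` and `anywhere` are chosen here for illustration.

```python
import re

origin = "hello poe bcd jet poe who are you 20"

at_start = re.match(r"p\w+", origin)   # None: the string starts with "hello"
anywhere = re.search(r"p\w+", origin)  # finds the first "poe"

print(at_start)
print(anywhere.group(), anywhere.span())
```
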
re.findall(): returns all matches in a list

Note: empty matches are also included in the result, for example:

import re
result = re.findall("", "a2b3c4d5")
print(result)
print(len(result))
##########################################
# ['', '', '', '', '', '', '', '', '']
# 9

How re.findall() treats groups:

import re
origin = "hello poe bcd jet poe who are you 20"

# Without a group, the whole match is returned
r = re.findall(r"p\w+", origin)
print(r)
##########################################
# ['poe', 'poe']

# With a group, only the text matched by the group is placed in the result list
r = re.findall(r"p(\w+)", origin)
print(r)
##########################################
# ['oe', 'oe']
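
With more than one group, each result becomes a tuple, and a non-capturing group `(?:...)` can be used when grouping is needed but the whole match should still be returned. A short sketch on the same string:

```python
import re

origin = "hello poe bcd jet poe who are you 20"

# Two groups: each result is a tuple with one entry per group.
pairs = re.findall(r"(p)(\w+)", origin)

# (?:...) groups without capturing, so the whole match is returned again.
whole = re.findall(r"(?:p)\w+", origin)

print(pairs)
print(whole)
```
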
re.finditer(): returns an iterator over match objects

import re
origin = "hello poe bcd jet poe who are you 20"
r = re.finditer(r"(p)(\w+(e))", origin)
for i in r:
    print(i.group())
    print(i.groups())
    print(i.groupdict())
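
Because each item is a match object rather than a plain string, positions are available as well. A minimal sketch, with a simpler pattern than the one above, that collects each match together with its start and end index:

```python
import re

origin = "hello poe bcd jet poe who are you 20"

# start() and end() give the slice of origin that the match covers.
positions = [(m.group(), m.start(), m.end())
             for m in re.finditer(r"p\w+", origin)]

print(positions)
```
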
re.split(): splits a string at each match

Without a group, the matched separator does not appear in the result:

import re
origin = "hello poe bcd jet poe who are you 20"
r = re.split(r"a\w+", origin, maxsplit=1)
print(r)
##########################################
# ['hello poe bcd jet poe who ', ' you 20']

With a group, the text matched by the group also appears in the result:

import re
origin = "hello poe bcd jet poe who are you 20"
r = re.split(r"a(\w+)", origin, maxsplit=1)
print(r)
##########################################
# ['hello poe bcd jet poe who ', 're', ' you 20']
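
A common use of re.split() is splitting on several different delimiters at once. The sketch below uses a made-up record string; the second call shows that capturing the separator keeps it in the result, as in the group example above.

```python
import re

# Hypothetical input that mixes commas, semicolons and runs of spaces.
record = "alpha, beta;gamma   delta"

# Split on a comma or semicolon with optional spaces around it,
# or on a run of whitespace.
fields = re.split(r"\s*[,;]\s*|\s+", record)

# Capturing the separator keeps it in the result list.
with_seps = re.split(r"([,;])", "a,b;c")

print(fields)
print(with_seps)
```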

 

re.sub(): regular expression replacement

import re
origin = "1yiuoosfd234kuiuadf789v, xznfa978"
# count=1 replaces only the first match; count=2 would replace the first two
new_str = re.sub(r"\d+", "KKK", origin, count=1)
print(new_str)
##########################################
# KKKyiuoosfd234kuiuadf789v, xznfa978
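
The replacement does not have to be a fixed string. As a sketch on the same input, it can refer back to the matched text with `\g<0>`, or it can be a function that receives each match object and returns the replacement.

```python
import re

origin = "1yiuoosfd234kuiuadf789v, xznfa978"

# "\g<0>" in the replacement stands for the whole match.
bracketed = re.sub(r"\d+", r"[\g<0>]", origin)

# A function is called with each match object and returns the new text.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), origin, count=1)

print(bracketed)
print(doubled)
```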

re.subn() works like re.sub() but also returns the number of replacements made, for example:

import re
origin = "1yiuoosfd234kuiuadf789v, xznfa978"
# No count is given, so every match is replaced
new_str, count = re.subn(r"\d+", "KKK", origin)
print(new_str, count)
##########################################
# KKKyiuoosfdKKKkuiuadfKKKv, xznfaKKK 4

The 4 indicates that four replacements were made.
