Python (3) Regular Expression

Source: Internet
Author: User

Regular expression syntax rules are widely documented online, so they are skipped here.

For beginners, running Python inside Visual Studio 2010 (vs2010) is recommended: the debugger is very useful and makes every value easy to inspect.

 

The difference between greedy mode and non-greedy mode can be hard to understand at first.

Sample Code:

block = re.sub(r'(.+?)', r'hello\1_void()', r'*ABC.EFG*')  # non-greedy mode: match as few characters as possible

# Output: block = 'hello*_void()helloA_void()helloB_void()helloC_void()hello._void()helloE_void()helloF_void()helloG_void()hello*_void()'

block = re.sub(r'(.+)', r'hello\1_void()', r'*ABC.EFG*')  # greedy mode: match as many characters as possible

# Output: block = 'hello*ABC.EFG*_void()'

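The two results show the difference clearly: in non-greedy mode every single character is matched and replaced on its own, while in greedy mode the whole string is matched once.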

 

Code:

line = re.sub(r'\*(.+?)\*', r'<em>\1</em>', 'adfdf*i am a worker.*dfd')

This matches the text between two * characters and wraps it in <em> tags, so the result is 'adfdf<em>i am a worker.</em>dfd'.
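To see why the non-greedy form matters for this kind of replacement, here is a minimal sketch with two starred spans in one string. The sample text is made up for illustration; only re.sub itself comes from the example above.

```python
import re

text = 'a *first* and *second* b'  # hypothetical sample input

# Non-greedy: each pair of asterisks is handled separately.
lazy = re.sub(r'\*(.+?)\*', r'<em>\1</em>', text)

# Greedy: .+ runs on to the last asterisk, so both spans merge into one.
greedy = re.sub(r'\*(.+)\*', r'<em>\1</em>', text)

print(lazy)    # a <em>first</em> and <em>second</em> b
print(greedy)  # a <em>first* and *second</em> b
```

With only one starred span in the input, as in the example above, both forms would give the same result.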

Looking back, a few examples are still the most helpful way to learn. The following code gives a rough overview of the attributes and methods of a match object:

import re

m = re.match(r'(\w+) (\w+)(?P<sign>.*)', 'hello world!')

print "m.string:", m.string

print "m.re:", m.re

print "m.pos:", m.pos

print "m.endpos:", m.endpos

print "m.lastindex:", m.lastindex

print "m.lastgroup:", m.lastgroup 

print "m.group(1,2):", m.group(1, 2)

print "m.groups():", m.groups()

print "m.groupdict():", m.groupdict()

print "m.start(2):", m.start(2)

print "m.end(2):", m.end(2)

print "m.span(2):", m.span(2)

print r"m.expand(r'\2 \1\3'):", m.expand(r'\2 \1\3')

Output:

m.string: hello world!
m.re: <_sre.SRE_Pattern object at 0x024d93d8>
m.pos: 0
m.endpos: 12
m.lastindex: 3
m.lastgroup: sign
m.group(1,2): ('hello', 'world')
m.groups(): ('hello', 'world', '!')
m.groupdict(): {'sign': '!'}
m.start(2): 6
m.end(2): 11
m.span(2): (6, 11)
m.expand(r'\2 \1\3'): world hello!

 

To precompile a regular expression into a pattern object, use re.compile:

 

import re

p = re.compile(r'(\w+) (\w+)(?P<sign>.*)', re.DOTALL)

 

print "p.pattern:", p.pattern

print "p.flags:", p.flags

print "p.groups:", p.groups

print "p.groupindex:", p.groupindex

Output:

p.pattern: (\w+) (\w+)(?P<sign>.*)
p.flags: 16
p.groups: 3
p.groupindex: {'sign': 3}
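The value 16 printed for p.flags is re.DOTALL. The raw number can differ between Python versions (Python 3 adds re.UNICODE for text patterns), so the sketch below tests individual flags with a bitwise AND instead. The second input string is made up to show the effect of re.DOTALL.

```python
import re

p = re.compile(r'(\w+) (\w+)(?P<sign>.*)', re.DOTALL)

# Test a single flag with a bitwise AND rather than comparing the raw number.
has_dotall = bool(p.flags & re.DOTALL)          # True
has_ignorecase = bool(p.flags & re.IGNORECASE)  # False

# The compiled object can be reused for many inputs.
# Because of re.DOTALL, '.*' also captures the trailing newline.
signs = [p.match(s).group('sign') for s in ('hello world!', 'good bye?\n')]

print(signs)  # ['!', '?\n']
```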

 

match(string[, pos[, endpos]]) | re.match(pattern, string[, flags]):
This method tries to match pattern starting at index pos of string. If the whole pattern can be matched, a match object is returned; if the pattern fails partway through, or endpos is reached before the match is complete, None is returned.
The default values of pos and endpos are 0 and len(string), respectively. re.match() does not accept these two parameters; its flags parameter specifies the matching mode used when compiling pattern.
Note: this method does not require a full match. If characters remain in string after the pattern has been matched, the operation still succeeds. To require a full match, add the boundary anchor '$' at the end of the expression.
For an example, see the re.match() listing above.
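The following sketch illustrates the pos and endpos parameters and the effect of a trailing '$'. The pattern and sample strings are chosen only for illustration.

```python
import re

p = re.compile(r'\d+')

# match() is anchored at pos, so it fails here: index 0 holds a letter.
m0 = p.match('ab123cd')         # None

# Starting at pos=2 the digits are found; the trailing 'cd' is ignored.
m1 = p.match('ab123cd', 2)      # matches '123'

# endpos=4 makes the string appear to end after index 3.
m2 = p.match('ab123cd', 2, 4)   # matches '12'

# With '$' appended, leftover characters make the match fail.
full = re.compile(r'\d+$')
m3 = full.match('123cd')        # None
m4 = full.match('123')          # matches '123'

print(m1.group())
print(m2.group())
```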

search(string[, pos[, endpos]]) | re.search(pattern, string[, flags]):
This method searches string for a substring that matches. It tries to match pattern starting at index pos of string; if the whole pattern can be matched, a match object is returned. If it cannot be matched, pos is increased by 1 and the attempt is repeated; if there is still no match when pos reaches endpos, None is returned.
The default values of pos and endpos are 0 and len(string), respectively. re.search() does not accept these two parameters; its flags parameter specifies the matching mode used when compiling pattern.

# encoding: UTF-8 

import re  

# Compile a regular expression into a pattern object

pattern = re.compile(r'world')  

# Use search() to look for a matching substring; it returns None if there is none.

# match() would fail in this example, because 'world' is not at the start of the string.

match = pattern.search('hello world!')  

if match:     

    print match.group() 

Output:

world
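As a small sketch of the contrast mentioned in the comments, the same compiled pattern can be tried with both methods. The endpos value of 8 is an arbitrary choice for illustration.

```python
import re

pattern = re.compile(r'world')
text = 'hello world!'

anchored = pattern.match(text)   # None: 'world' does not start at index 0
scanned = pattern.search(text)   # found further along the string

# On a compiled pattern, search() also accepts pos and endpos.
limited = pattern.search(text, 0, 8)   # None: only 'hello wo' is visible

print(scanned.span())  # (6, 11)
```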

 

split(string[, maxsplit]) | re.split(pattern, string[, maxsplit]):
Splits string at each matching substring and returns the pieces as a list. maxsplit specifies the maximum number of splits; if it is not given, the string is split at every match.

import re 

p = re.compile(r'\d+')

print p.split('one1two2three3four4')

Output:

['one', 'two', 'three', 'four', '']
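A short sketch of maxsplit, reusing the same pattern and string. Filtering out the trailing empty string is an optional extra step, not something the original example does.

```python
import re

p = re.compile(r'\d+')
s = 'one1two2three3four4'

all_parts = p.split(s)       # split at every match
two_splits = p.split(s, 2)   # stop after two splits

# The final match sits at the end of the string, which leaves an empty
# last element; drop it if it is not wanted.
non_empty = [part for part in p.split(s) if part]

print(two_splits)  # ['one', 'two', 'three3four4']
print(non_empty)   # ['one', 'two', 'three', 'four']
```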

 

findall(string[, pos[, endpos]]) | re.findall(pattern, string[, flags]):
Searches string and returns all matching substrings as a list.

import re 

p = re.compile(r'\d+')

print p.findall('one1two2three3four4')

['1', '2', '3', '4']
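Two further cases may be worth sketching: limiting the scan with pos and endpos, and using a pattern that contains groups, in which case findall returns the groups rather than the whole match. The index values and the second pattern are chosen only for illustration.

```python
import re

p = re.compile(r'\d+')
s = 'one1two2three3four4'

# Restrict the scan to s[4:14], that is 'two2three3'.
window = p.findall(s, 4, 14)

# With more than one group in the pattern, each result is a tuple of groups.
pairs = re.findall(r'([a-z]+)(\d)', s)

print(window)  # ['2', '3']
print(pairs)   # [('one', '1'), ('two', '2'), ('three', '3'), ('four', '4')]
```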

 

finditer(string[, pos[, endpos]]) | re.finditer(pattern, string[, flags]):
Searches string and returns an iterator that yields each match result (a match object) in order.

import re

p = re.compile(r'\d+')

for m in p.finditer('one1two2three3four4'):   

    print m.group(),

1 2 3 4
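Because finditer yields match objects rather than plain strings, positions are available too. A minimal sketch with the same pattern and string:

```python
import re

p = re.compile(r'\d+')
s = 'one1two2three3four4'

# Collect the matched text together with its (start, end) span.
found = [(m.group(), m.span()) for m in p.finditer(s)]

print(found)  # [('1', (3, 4)), ('2', (7, 8)), ('3', (13, 14)), ('4', (18, 19))]
```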

 

sub(repl, string[, count]) | re.sub(pattern, repl, string[, count]):
Replaces each matching substring in string with repl and returns the resulting string.
When repl is a string, groups can be referenced with \id, \g<id>, or \g<name>; the form \0 is not available, so use \g<0> to refer to the whole match.
When repl is a function, it must accept a single argument (the match object) and return the replacement string; group references inside the returned string are not expanded.
count specifies the maximum number of replacements; if it is not given, every match is replaced.

import re 

p = re.compile(r'(\w+) (\w+)')

s = 'i say, hello world!' 

print p.sub(r'\2 \1', s) 

def func(m):    

    return m.group(1).title() + ' ' + m.group(2).title() 

 

print p.sub(func, s)

Output:

say i, world hello!
I Say, Hello World!
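The sketch below covers named group references, the whole-match reference \g<0>, and count. The group names first and second are made up for illustration; the input string is the one used above.

```python
import re

p = re.compile(r'(?P<first>\w+) (?P<second>\w+)')
s = 'i say, hello world!'

# Named groups are referenced with \g<name>.
swapped = p.sub(r'\g<second> \g<first>', s)

# count=1 replaces only the first match.
once = p.sub(r'\g<second> \g<first>', s, count=1)

# \g<0> stands for the whole match.
bracketed = p.sub(r'[\g<0>]', s)

print(swapped)    # say i, world hello!
print(once)       # say i, hello world!
print(bracketed)  # [i say], [hello world]!
```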

 

subn(repl, string[, count]) | re.subn(pattern, repl, string[, count]):
Returns the tuple (sub(repl, string[, count]), number of replacements).

import re 

p = re.compile(r'(\w+) (\w+)')

s = 'i say, hello world!' 

print p.subn(r'\2 \1', s) 

def func(m):    

    return m.group(1).title() + ' ' + m.group(2).title() 

 

print p.subn(func, s)

Output:

('say i, world hello!', 2)
('I Say, Hello World!', 2)
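One possible use of the replacement count is to learn whether a substitution changed anything. The helper name normalize and its whitespace pattern are hypothetical, introduced only for this sketch.

```python
import re

# Hypothetical helper: collapse runs of two or more whitespace characters.
_runs = re.compile(r'\s{2,}')

def normalize(text):
    """Return the cleaned text and whether any replacement was made."""
    result, n = _runs.subn(' ', text)
    return result, n > 0

print(normalize('a   b  c'))  # ('a b c', True)
print(normalize('a b'))       # ('a b', False)
```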

 

Reference: http://www.cnblogs.com/huxi/archive/2010/07/04/1771073.html
