Python (3) Regular Expression

Last Update:2018-12-05 Source: Internet

Author: User

Regular expression syntax rules are documented all over the web, so they are skipped here.

For beginners, it is recommended to run Python inside VS2010. Its debugger is very useful and lets you inspect everything easily.

The difference between greedy mode and non-greedy mode is often hard to grasp.

Sample code:

block = re.sub(r'(.+?)', r'hello\1_void()', r'*abc.efg*') # non-greedy mode: match as few characters as possible

# Output: block = 'hello*_void()helloa_void()hellob_void()helloc_void()hello._void()helloe_void()hellof_void()hellog_void()hello*_void()'

block = re.sub(r'(.+)', r'hello\1_void()', r'*abc.efg*') # greedy mode: match as many characters as possible

# Output: block = 'hello*abc.efg*_void()'

The two outputs show the difference clearly: in non-greedy mode every single character is a separate match, while in greedy mode the whole string is one match.

Code:

line = re.sub(r'\*(.+?)\*', r'<em>\1</em>', 'adfdf*i am a worker.*dfd')

This matches the characters between two * signs and wraps them in <em> tags, so line becomes 'adfdf<em>i am a worker.</em>dfd'.
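To show why the non-greedy quantifier matters in this pattern, here is a minimal sketch (Python 3 syntax, so print is a function; the sample string with two starred spans is invented for illustration) that runs the same substitution in both modes:

```python
import re

text = 'a *first* b *second* c'  # hypothetical sample input

# Non-greedy: each pair of asterisks is matched on its own.
lazy = re.sub(r'\*(.+?)\*', r'<em>\1</em>', text)

# Greedy: .+ runs to the last asterisk, so both spans merge into one match.
greedy = re.sub(r'\*(.+)\*', r'<em>\1</em>', text)

print(lazy)    # a <em>first</em> b <em>second</em> c
print(greedy)  # a <em>first* b *second</em> c
```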

Looking back at this today, a few examples still turn out to be very helpful. The following code gives a rough overview of how a match object is used:

import re

m = re.match(r'(\w+) (\w+)(?P<sign>.*)', 'hello world!')

print "m.string:", m.string

print "m.re:", m.re

print "m.pos:", m.pos

print "m.endpos:", m.endpos

print "m.lastindex:", m.lastindex

print "m.lastgroup:", m.lastgroup

print "m.group(1,2):", m.group(1, 2)

print "m.groups():", m.groups()

print "m.groupdict():", m.groupdict()

print "m.start(2):", m.start(2)

print "m.end(2):", m.end(2)

print "m.span(2):", m.span(2)

print r"m.expand(r'\2 \1\3'):", m.expand(r'\2 \1\3')

Output:

m.string: hello world!
m.re: <_sre.SRE_Pattern object at 0x024d93d8>
m.pos: 0
m.endpos: 12
m.lastindex: 3
m.lastgroup: sign
m.group(1,2): ('hello', 'world')
m.groups(): ('hello', 'world', '!')
m.groupdict(): {'sign': '!'}
m.start(2): 6
m.end(2): 11
m.span(2): (6, 11)
m.expand(r'\2 \1\3'): world hello!
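Groups can be read by index or by name. The sketch below (Python 3 syntax) reuses the pattern from the example above; the variable names are chosen here for illustration only:

```python
import re

m = re.match(r'(\w+) (\w+)(?P<sign>.*)', 'hello world!')

whole = m.group(0)         # index 0 is the entire match
by_name = m.group('sign')  # a named group can be read by its name...
by_index = m.group(3)      # ...or by its position, here the third group

print(whole)               # hello world!
print(by_name, by_index)   # ! !
```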

To compile a regular expression into a reusable pattern object, use re.compile:

import re

p = re.compile(r'(\w+) (\w+)(?P<sign>.*)', re.DOTALL)

print "p.pattern:", p.pattern

print "p.flags:", p.flags

print "p.groups:", p.groups

print "p.groupindex:", p.groupindex

Output:

p.pattern: (\w+) (\w+)(?P<sign>.*)
p.flags: 16
p.groups: 3
p.groupindex: {'sign': 3}
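The re.DOTALL flag passed to re.compile changes what '.' matches. A small sketch (Python 3 syntax; the two-line input string is made up for illustration) compares the same pattern with and without the flag:

```python
import re

plain = re.compile(r'(\w+) (\w+)(?P<sign>.*)')
dotall = re.compile(r'(\w+) (\w+)(?P<sign>.*)', re.DOTALL)

text = 'hello world!\nbye'  # hypothetical two-line input

sign_plain = plain.match(text).group('sign')    # '.' stops at the newline
sign_dotall = dotall.match(text).group('sign')  # '.' also matches the newline

print(repr(sign_plain))                   # '!'
print(repr(sign_dotall))                  # '!\nbye'
print(bool(dotall.flags & re.DOTALL))     # True
```

Testing a flag with a bitwise AND, as in the last line, is safer than comparing p.flags with a fixed number, because the interpreter may set additional flag bits on its own.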

match(string[, pos[, endpos]]) | re.match(pattern, string[, flags]):
This method tries to match pattern starting at index pos of string. If the whole pattern is matched, a match object is returned; if the pattern fails at some point, or endpos is reached before the pattern has been completed, None is returned.
The default values of pos and endpos are 0 and len(string), respectively. re.match() cannot take these two parameters; its flags parameter specifies the matching mode used when compiling pattern.
Note: this method does not require a full match. If characters remain in string after the pattern ends, the match still succeeds. To require a full match, add the boundary anchor '$' at the end of the expression.
For an example, see the re.match code above.
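The points above (partial matches succeed, '$' forces a full match, pos and endpos exist only on compiled patterns) can be sketched as follows. This is Python 3 syntax, and the input strings are invented for illustration:

```python
import re

pattern = re.compile(r'\d+')

partial = pattern.match('123abc')        # succeeds; the trailing 'abc' is ignored
anchored = re.match(r'\d+$', '123abc')   # None: '$' requires the end of the string
at_start = pattern.match('abc123')       # None: matching starts at index 0
from_pos = pattern.match('abc123', 3)    # pos=3 skips 'abc'
clipped = pattern.match('abc123', 3, 5)  # endpos=5 hides the final '3'

print(partial.group(), anchored, at_start, from_pos.group(), clipped.group())
# 123 None None 123 12
```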

search(string[, pos[, endpos]]) | re.search(pattern, string[, flags]):
This method searches string for a substring that matches. It tries to match pattern starting at index pos of string. If the whole pattern is matched, a match object is returned. If not, pos is increased by 1 and the match is tried again; if pos reaches endpos without a match, None is returned.
The default values of pos and endpos are 0 and len(string), respectively. re.search() cannot take these two parameters; its flags parameter specifies the matching mode used when compiling pattern.

# encoding: UTF-8

import re

# Compile a regular expression into a pattern object

pattern = re.compile(r'world')

# Use search() to look for a matching substring. If there is none, None is returned.

# In this example, match() would not succeed, because 'world' is not at the start of the string.

match = pattern.search('hello world!')

if match:

    print match.group()

Output:

world
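For comparison, the sketch below (Python 3 syntax) runs match() and search() with the same compiled pattern, and also passes a pos argument to search(). The variable names are chosen here for illustration:

```python
import re

pattern = re.compile(r'world')
text = 'hello world!'

not_at_start = pattern.match(text)  # None: 'world' does not begin at index 0
found = pattern.search(text)        # scans forward until a match is found
too_late = pattern.search(text, 7)  # None: scanning starts after the 'w'

print(not_at_start)                 # None
print(found.group(), found.span())  # world (6, 11)
print(too_late)                     # None
```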

split(string[, maxsplit]) | re.split(pattern, string[, maxsplit]):
Splits string at every matching substring and returns the pieces as a list. maxsplit specifies the maximum number of splits. If it is not specified, the string is split at every match.

import re

p = re.compile(r'\d+')

print p.split('one1two2three3four4')

Output:

['one', 'two', 'three', 'four', '']
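A short sketch of maxsplit (Python 3 syntax), using the same input string. The second call also shows that a capturing group in the pattern keeps the separators in the result:

```python
import re

p = re.compile(r'\d+')
s = 'one1two2three3four4'

limited = p.split(s, 2)              # at most two splits; the rest stays intact
with_seps = re.split(r'(\d+)', s)    # a capturing group keeps the separators

print(limited)    # ['one', 'two', 'three3four4']
print(with_seps)  # ['one', '1', 'two', '2', 'three', '3', 'four', '4', '']
```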

findall(string[, pos[, endpos]]) | re.findall(pattern, string[, flags]):
Searches string and returns all matching substrings as a list.

import re

p = re.compile(r'\d+')

print p.findall('one1two2three3four4')

['1', '2', '3', '4']
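When the pattern contains capturing groups, findall returns the groups rather than the whole match. A brief sketch (Python 3 syntax; the grouped pattern is an illustration, not part of the original example):

```python
import re

s = 'one1two2three3four4'

# Two groups in the pattern: each result is a tuple of the group values.
pairs = re.findall(r'([a-z]+)(\d)', s)

# pos is only available on a compiled pattern; start scanning at index 4.
digits = re.compile(r'\d+').findall(s, 4)

print(pairs)   # [('one', '1'), ('two', '2'), ('three', '3'), ('four', '4')]
print(digits)  # ['2', '3', '4']
```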

finditer(string[, pos[, endpos]]) | re.finditer(pattern, string[, flags]):
Searches string and returns an iterator that yields each match result (match object) in order.

import re

p = re.compile(r'\d+')

for m in p.finditer('one1two2three3four4'):

    print m.group(),

1 2 3 4
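Because finditer yields match objects instead of plain strings, the position of every match is available as well. A minimal sketch (Python 3 syntax):

```python
import re

p = re.compile(r'\d+')
s = 'one1two2three3four4'

# Collect the matched text together with its (start, end) span.
positions = [(m.group(), m.span()) for m in p.finditer(s)]

print(positions)
# [('1', (3, 4)), ('2', (7, 8)), ('3', (13, 14)), ('4', (18, 19))]
```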

sub(repl, string[, count]) | re.sub(pattern, repl, string[, count]):
Replaces each matching substring of string with repl and returns the resulting string.
When repl is a string, it can reference groups with \id, \g<id>, or \g<name>. \0 cannot be used to reference the whole match; use \g<0> for that.
When repl is a function, it should accept a single parameter (a match object) and return the replacement string (group references in the returned string are not expanded).
count specifies the maximum number of replacements. If it is not specified, all matches are replaced.

import re

p = re.compile(r'(\w+) (\w+)')

s = 'i say, hello world!'

print p.sub(r'\2 \1', s)

def func(m):

    return m.group(1).title() + ' ' + m.group(2).title()

print p.sub(func, s)

Output:

say i, world hello!
I Say, Hello World!
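The named-group reference, the whole-match reference, and the count parameter can be tried out as in this sketch (Python 3 syntax). The group names first and second are invented for illustration:

```python
import re

p = re.compile(r'(?P<first>\w+) (?P<second>\w+)')
s = 'i say, hello world!'

swapped = p.sub(r'\g<second> \g<first>', s)  # reference groups by name
bracketed = p.sub(r'[\g<0>]', s)             # \g<0> is the whole match
only_first = p.sub(r'\2 \1', s, count=1)     # replace the first match only

print(swapped)     # say i, world hello!
print(bracketed)   # [i say], [hello world]!
print(only_first)  # say i, hello world!
```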

subn(repl, string[, count]) | re.subn(pattern, repl, string[, count]):
Returns the tuple (sub(repl, string[, count]), number of replacements).

import re

p = re.compile(r'(\w+) (\w+)')

s = 'i say, hello world!'

print p.subn(r'\2 \1', s)

def func(m):

    return m.group(1).title() + ' ' + m.group(2).title()

print p.subn(func, s)

Output:

('say i, world hello!', 2)
('I Say, Hello World!', 2)
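One common use of the replacement count is to check whether anything was changed at all. A small sketch (Python 3 syntax; the second input string is made up for illustration):

```python
import re

p = re.compile(r'(\w+) (\w+)')

# Unpack the tuple into the new string and the number of replacements.
result, n = p.subn(r'\2 \1', 'i say, hello world!')

# No match: the string comes back unchanged and the count is 0.
unchanged, n_none = p.subn(r'\2 \1', 'nospaces')

print(result, n)          # say i, world hello! 2
print(unchanged, n_none)  # nospaces 0
```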

Reference: http://www.cnblogs.com/huxi/archive/2010/07/04/1771073.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
