Python re module learning: regular expression functions

Last Update:2015-02-11 Source: Internet

Author: User

This article introduces the regular expression functions most commonly used in Python's re module. Regular expression syntax itself will be summarized in a separate blog post.

re.match

re.match tries to match a pattern at the beginning of the string. The following example matches the first word.

The code is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
text = "JGood is a handsome boy, he is cool, clever, and so on..."
# (\w+) captures the first word; \s requires a whitespace character after it
m = re.match(r"(\w+)\s", text)
if m:
    print(m.group(0), '\n', m.group(1))
else:
    print('not match')

The output is as follows:

$ python a.py
JGood
 JGood

If text = "#JGood is a handsome boy, he is cool, clever, and so on..." instead, the output is as follows:

$ python a.py
not match

Supplementary note:

group and groups are two different methods.

In general, m.group(n) returns the text matched by the nth pair of parentheses,

and m.group() == m.group(0) == the entire matched text, regardless of parentheses.

m.groups() returns the text matched by all parenthesized groups as a tuple. It does not include the entire match:

m.groups() == (m.group(1), m.group(2), ...)
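The relationship between group and groups can be checked with a short sketch. It assumes Python 3; the two-group pattern and the variable names are illustrative and not taken from the original article.

```python
import re

text = "JGood is a handsome boy"
# Two capturing groups: the first word and the second word.
m = re.match(r"(\w+)\s(\w+)", text)

whole = m.group(0)       # entire match, same as m.group()
first = m.group(1)       # text matched by the first pair of parentheses
second = m.group(2)      # text matched by the second pair of parentheses
all_groups = m.groups()  # tuple of the parenthesized groups only

print(whole)       # JGood is
print(first)       # JGood
print(second)      # is
print(all_groups)  # ('JGood', 'is')
```

Note that the tuple from groups() starts at group 1; the whole match is available only through group() or group(0).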

re.match's function prototype is: re.match(pattern, string, flags=0)

The first parameter is the regular expression, here r"(\w+)\s". Writing it as a raw string with the r prefix is recommended, so that Python does not interpret the backslashes itself. If the match succeeds, a match object is returned; otherwise None is returned;

The second parameter is the string to match against;

The third parameter is the flags value, which controls how the regular expression is matched, such as case-insensitive matching, multiline matching, and so on.
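As a rough illustration of the flags parameter, the sketch below uses two standard flags from the re module, re.IGNORECASE and re.MULTILINE. It assumes Python 3, and the two-line sample string is made up for this example.

```python
import re

text = "JGood is a handsome boy,\nhe is cool"

# Without flags, the lowercase pattern does not match "JGood".
no_flag = re.match(r"jgood", text)
# re.IGNORECASE (short form re.I) makes matching case-insensitive.
ignore_case = re.match(r"jgood", text, re.IGNORECASE)

# Without re.MULTILINE, ^ matches only at the very start of the string.
single_line = re.search(r"^he", text)
# With re.MULTILINE (re.M), ^ also matches right after each newline.
multi_line = re.search(r"^he", text, re.MULTILINE)

print(no_flag)               # None
print(ignore_case.group(0))  # JGood
print(single_line)           # None
print(multi_line.group(0))   # he
```

Several flags can be combined with the | operator, for example re.I | re.M.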

re.search

The re.search function scans the entire string for the pattern and returns as soon as it finds the first match. If nothing in the string matches, None is returned.

The code is as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."
# The match must be surrounded by whitespace; (ds) is capturing group 1
m = re.search(r'\shan(ds)ome\s', text)
if m:
    print(m.group(0), m.group(1))
else:
    print('not search')

The output is as follows (group 0 includes the surrounding spaces):

$ python b.py
 handsome  ds

re.search's function prototype is: re.search(pattern, string, flags=0)

Each parameter has the same meaning as in re.match.

The difference between re.match and re.search:

match: attempts to match the regular expression only at the beginning of the string; it returns a match object on success and None otherwise;

search: attempts to match the regular expression at every position in the string; it returns None only if no position matches, and otherwise a match object for the first match.
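The difference can be seen by running the same pattern through both functions. This is a minimal sketch assuming Python 3; the pattern and variable names are chosen for illustration only.

```python
import re

text = "JGood is a handsome boy"

# "handsome" is not at the start of the string, so match() fails...
at_start = re.match(r"handsome", text)
# ...while search() scans forward and finds it.
anywhere = re.search(r"handsome", text)

print(at_start)          # None
print(anywhere.group())  # handsome
print(anywhere.start())  # 11
```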

re.sub

re.sub replaces matches in a string. The following example replaces each run of whitespace in a string with a '-':

The code is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."
# Replace every run of whitespace with a single '-'
print(re.sub(r'\s+', '-', text))

The output is as follows:

$ python a.py
JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on...

re.sub's function prototype is: re.sub(pattern, repl, string, count=0)

The first parameter is the regular expression to match;

The second parameter is the replacement string, in this case '-';

The third parameter is the string to process;

The fourth parameter is the maximum number of replacements. The default is 0, which means that every match is replaced.
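The effect of count can be sketched as follows, assuming Python 3. The shortened sample string is illustrative; count is passed by keyword here, which also reads more clearly than a bare positional number.

```python
import re

text = "JGood is a handsome boy"

# count defaults to 0: every match is replaced.
replace_all = re.sub(r"\s+", "-", text)
# count=2: only the first two matches are replaced.
replace_two = re.sub(r"\s+", "-", text, count=2)

print(replace_all)  # JGood-is-a-handsome-boy
print(replace_two)  # JGood-is-a handsome boy
```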

re.sub also accepts a function as the replacement, which allows more complex processing of each match. For example, re.sub(r'\s', lambda m: '[' + m.group(0) + ']', text, count=0) replaces each space ' ' in the string with '[ ]'.

The code is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."
# The lambda receives each match object and returns its replacement text
print(re.sub(r'\s', lambda m: '[' + m.group(0) + ']', text, count=0))

The output is as follows:

$ python a.py
JGood[ ]is[ ]a[ ]handsome[ ]boy,[ ]he[ ]is[ ]cool,[ ]clever,[ ]and[ ]so[ ]on...
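The replacement function does not have to be a lambda. As a sketch, assuming Python 3, a named function can compute the replacement from the match; the helper name double and the sample sentence are hypothetical.

```python
import re

def double(match):
    # match.group(0) holds the matched digits; the function must return a string.
    return str(int(match.group(0)) * 2)

result = re.sub(r"\d+", double, "3 apples and 12 pears")
print(result)  # 6 apples and 24 pears
```

The function is called once per match, in left-to-right order.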

Addition: the pattern passed to re.sub supports backreferences written as a backslash followed by a number (\N), similar to \1 in sed.

The sample code is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
inputStr = "hello crifan, nihao crifan"
# \1 must match the same text that group 1, (\w+), captured earlier
replacedStr = re.sub(r"hello (\w+), nihao \1", "crifanli", inputStr)
print("replacedStr=", replacedStr)

The output is as follows:

$ python a.py
replacedStr= crifanli
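Backreferences can also be used in the replacement string, where \1 inserts the text captured by group 1. The following sketch, assuming Python 3 and an invented sample sentence, collapses a word that is repeated twice in a row:

```python
import re

text = "it is is a test test sentence"

# In the pattern, \1 requires the captured word to appear again.
# In the replacement, \1 writes the captured word back once.
collapsed = re.sub(r"\b(\w+)\s+\1\b", r"\1", text)

print(collapsed)  # it is a test sentence
```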

re.subn

re.subn works the same way as re.sub, but returns a two-element tuple that contains the new string and the number of substitutions made.

The code examples are as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."
# Prints a tuple: (new string, number of substitutions)
print(re.subn(r'\s+', '-', text))

The output is as follows:

$ python a.py
('JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on...', 11)
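The returned tuple is usually unpacked into two variables. A minimal sketch, assuming Python 3 and illustrative variable names:

```python
import re

text = "JGood is a handsome boy"

new_text, n = re.subn(r"\s+", "-", text)
print(new_text)  # JGood-is-a-handsome-boy
print(n)         # 4

# The count makes it easy to tell that nothing was replaced.
unchanged, zero = re.subn(r"\d+", "#", text)
print(unchanged == text, zero)  # True 0
```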

re.split

re.split splits a string (text) at a specified character, or at whatever matches a regular expression. For example, re.split(r'\s+', text) splits the string into a list of words at each run of whitespace. The return value is a list.

The code is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."
# Split at every run of whitespace
print(re.split(r'\s+', text))

The output is as follows:

$ python a.py
['JGood', 'is', 'a', 'handsome', 'boy,', 'he', 'is', 'cool,', 'clever,', 'and', 'so', 'on...']

Supplementary note:

re.split(pattern, string, maxsplit=0)
maxsplit is the maximum number of splits: maxsplit=1 splits once, and the default of 0 places no limit on the number of splits.

re.split separates the string wherever the regular expression matches. If you enclose the regular expression in capturing parentheses, the matched separators are also returned in the list.

>>> re.split(r'\W+', 'Words, words, words.')
['Words', 'words', 'words', '']
>>> re.split(r'(\W+)', 'Words, words, words.')
['Words', ', ', 'words', ', ', 'words', '.', '']
>>> re.split(r'\W+', 'Words, words, words.', maxsplit=1)
['Words', 'words, words.']
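A common use of re.split is splitting on several different delimiters at once, which str.split cannot do directly. The sketch below assumes Python 3; the delimiter set and sample string are made up for illustration.

```python
import re

record = "a, b;c,d"

# Split at a comma or semicolon, swallowing any whitespace that follows it.
fields = re.split(r"[,;]\s*", record)
# maxsplit limits the number of splits; the remainder stays in the last item.
head_and_rest = re.split(r"[,;]\s*", record, maxsplit=1)

print(fields)         # ['a', 'b', 'c', 'd']
print(head_and_rest)  # ['a', 'b;c,d']
```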

re.findall

re.findall returns every substring of the string (text) that matches the regular expression. For example, re.findall(r'\w*oo\w*', text) gets all the words in the string that contain 'oo'. The return value is a list.

The code example is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."
# Find every word that contains 'oo'
print(re.findall(r'\w*oo\w*', text))

The output is as follows:

$ python a.py
['JGood', 'cool']
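One detail worth knowing: when the pattern contains capturing groups, re.findall returns the groups rather than the whole match, and with more than one group each list item is a tuple. A short sketch, assuming Python 3 and an invented key=value string:

```python
import re

text = "width=80, height=24"

# No groups: each item is the whole match.
whole = re.findall(r"\w+=\d+", text)
# Two groups: each item is a tuple of the captured groups.
pairs = re.findall(r"(\w+)=(\d+)", text)

print(whole)  # ['width=80', 'height=24']
print(pairs)  # [('width', '80'), ('height', '24')]
```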

re.finditer

re.finditer finds all the substrings that the regular expression matches and returns them as an iterator of match objects, in left-to-right order. If there is no match, the iterator yields nothing.

The code examples are as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
# Iterate over every run of digits, left to right
it = re.finditer(r"\d+", "12a32bc43jf3")
for match in it:
    print(match.group())

The output is as follows:

$ python a.py
12
32
43
3
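Because re.finditer yields match objects rather than plain strings, the position of each match is available as well. The following sketch assumes Python 3 and reuses the input string from the example above; the variable names are illustrative.

```python
import re

# Collect the matched text together with its start and end offsets.
matches = [(m.group(), m.start(), m.end())
           for m in re.finditer(r"\d+", "12a32bc43jf3")]
print(matches)  # [('12', 0, 2), ('32', 3, 5), ('43', 7, 9), ('3', 11, 12)]

# With no match, the iterator is simply empty.
nothing = list(re.finditer(r"\d+", "abc"))
print(nothing)  # []
```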

re.compile

re.compile compiles a regular expression into a regular expression object. Compiling a regular expression that is used often into an object can improve efficiency somewhat. Here is an example that uses a regular expression object:

The code examples are as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re

text = "JGood is a handsome boy, he is cool, clever, and so on..."
regex = re.compile(r'\w*oo\w*')
print(regex.findall(text))  # find all words that contain 'oo'
print(regex.sub(lambda m: '[' + m.group(0) + ']', text))  # enclose each word that contains 'oo' in []

The output is as follows:

$ python a.py
['JGood', 'cool']
[JGood] is a handsome boy, he is [cool], clever, and so on...
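A compiled object is most useful when the same pattern is applied to many strings. The sketch below assumes Python 3; the list of lines and the variable names are made up for this example.

```python
import re

# Compile once, then reuse the object for every line.
word_with_oo = re.compile(r"\w*oo\w*")

lines = ["JGood is cool", "nothing here", "a good book"]
hits = [word_with_oo.findall(line) for line in lines]

print(hits)                  # [['JGood', 'cool'], [], ['good', 'book']]
print(word_with_oo.pattern)  # \w*oo\w*
```

The compiled object offers the same operations as the module-level functions (match, search, sub, subn, split, findall, finditer), without the pattern argument.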
