This article introduces the regular expression functions commonly used in Python's re module. Regular expression syntax itself will be covered in a separate post.
re.match
re.match tries to match a pattern at the beginning of the string. The following example matches the first word.
The code is as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
text = "JGood is a handsome boy, he is cool, clever, and so on..."
m = re.match(r"(\w+)\s", text)
if m:
    print(m.group(0), m.group(1), sep='\n')
else:
    print('not match')
The output is as follows:
$ python a.py
JGood
JGood
If text = "#JGood is a handsome boy, he is cool, clever, and so on...", the output is as follows:
$ python a.py
not match
Supplementary note:
group and groups are two different methods.
In general, m.group(N) returns the text matched by the Nth pair of parentheses,
and m.group() == m.group(0) == the entire matched text, regardless of parentheses.
m.groups() returns the text matched by all pairs of parentheses as a tuple:
m.groups() == (m.group(1), m.group(2), ...)
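The difference between group and groups can be sketched as follows. The pattern and the sample string are illustrative assumptions, not part of the original example:

```python
import re

# Illustrative pattern with two capturing groups (an assumption for this sketch).
m = re.match(r"(\w+) (\w+)", "hello world, again")

whole = m.group(0)       # entire matched text
first = m.group(1)       # text of the 1st pair of parentheses
second = m.group(2)      # text of the 2nd pair of parentheses
all_groups = m.groups()  # tuple of all parenthesized groups, without group 0

print(whole)       # hello world
print(all_groups)  # ('hello', 'world')
```

Note that group 0 is not part of the tuple returned by groups().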
The function prototype of re.match is: re.match(pattern, string, flags=0)
The first parameter is the regular expression, here r"(\w+)\s" (writing it as a raw string with the r prefix is recommended, so that Python does not interpret the backslashes itself). If the match succeeds, a match object is returned; otherwise None is returned;
The second parameter is the string to match against;
The third parameter is the flags, which control how the regular expression is matched, such as case-insensitive matching, multiline matching, and so on.
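As a small sketch of the flags parameter, the following uses re.IGNORECASE and re.MULTILINE from the standard re module. The two-line sample string is a hypothetical input chosen for illustration:

```python
import re

sample = "JGood is cool\njgood is clever"  # hypothetical two-line sample

# Without flags, matching is case-sensitive, so this fails at position 0.
no_flag = re.match(r"jgood", sample)
# re.IGNORECASE makes the pattern match regardless of letter case.
ignore_case = re.match(r"jgood", sample, re.IGNORECASE)
# re.MULTILINE lets ^ match at the start of every line, not only the string.
line_starts = re.findall(r"^\w+", sample, re.MULTILINE)

print(no_flag)               # None
print(ignore_case.group(0))  # JGood
print(line_starts)           # ['JGood', 'jgood']
```

Several flags can be combined with the | operator, for example re.IGNORECASE | re.MULTILINE.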
re.search
The re.search function scans the entire string for a match to the pattern and returns only the first match it finds. If nothing in the string matches, None is returned.
The code is as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
text = "JGood is a handsome boy, he is cool, clever, and so on..."
m = re.search(r'\shan(ds)ome\s', text)
if m:
    print(m.group(0), m.group(1))
else:
    print('not search')
The output is as follows:
$ python b.py
 handsome  ds
The function prototype of re.search is: re.search(pattern, string, flags=0)
Each parameter has the same meaning as in re.match.
The difference between re.match and re.search:
match: matches the regular expression only at the beginning of the string; returns a match object on success, otherwise None;
search: tries to match the regular expression at every position in the string; returns None if no position matches, otherwise a match object for the first match.
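The difference can be seen by running both functions on the same input. This is a minimal sketch that reuses the pattern from the re.match example on a shortened version of the string with a leading '#':

```python
import re

text = "#JGood is a handsome boy"  # leading '#' as in the failing match example

# '#' is not a word character, so nothing matches at position 0.
m = re.match(r"(\w+)\s", text)
# search moves past the '#' and finds the first match further along.
s = re.search(r"(\w+)\s", text)

print(m)           # None
print(s.group(1))  # JGood
print(s.start())   # 1
```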
re.sub
re.sub replaces matches in a string. The following example replaces each run of whitespace in a string with '-':
The code is as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
text = "JGood is a handsome boy, he is cool, clever, and so on..."
print(re.sub(r'\s+', '-', text))
The output is as follows:
$ python a.py
JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on...
The function prototype of re.sub is: re.sub(pattern, repl, string, count=0)
The first parameter is the regular expression to match;
The second parameter is the replacement string, in this case '-';
The third parameter is the string to process, in this case text;
The fourth parameter is the maximum number of replacements. The default is 0, which means that every match is replaced.
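The effect of count can be sketched as follows, using a shortened version of the sample string (the shortening is only for readability):

```python
import re

text = "JGood is a handsome boy"

# count defaults to 0, so every match is replaced.
replace_all = re.sub(r"\s+", "-", text)
# With count=2, only the first two matches are replaced.
replace_two = re.sub(r"\s+", "-", text, count=2)

print(replace_all)  # JGood-is-a-handsome-boy
print(replace_two)  # JGood-is-a handsome boy
```

Passing count as a keyword argument keeps the call readable and avoids confusing it with flags.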
re.sub also accepts a function as the replacement, which allows more complex processing of each match. For example, re.sub(r'\s', lambda m: '[' + m.group(0) + ']', text, count=0) replaces each space ' ' in the string with '[ ]'.
The code is as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
text = "JGood is a handsome boy, he is cool, clever, and so on..."
print(re.sub(r'\s', lambda m: '[' + m.group(0) + ']', text, count=0))
The output is as follows:
$ python a.py
JGood[ ]is[ ]a[ ]handsome[ ]boy,[ ]he[ ]is[ ]cool,[ ]clever,[ ]and[ ]so[ ]on...
Addendum: the pattern passed to re.sub supports backreferences written as a backslash plus a number (\N), similar to \1 in sed.
The sample code is as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
inputStr = "hello crifan, nihao crifan"
replacedStr = re.sub(r"hello (\w+), nihao \1", "crifanli", inputStr)
print("replacedStr=", replacedStr)
The output is as follows:
$ python a.py
replacedStr= crifanli
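To make the role of \1 clearer, the sketch below runs the same pattern on an input where the two names differ. The second input string and the replacement r"bye \1" are hypothetical additions for illustration:

```python
import re

same = "hello crifan, nihao crifan"
different = "hello crifan, nihao python"  # hypothetical non-matching input

pattern = r"hello (\w+), nihao \1"  # \1 must repeat whatever group 1 matched

replaced_same = re.sub(pattern, "crifanli", same)
# The names differ, so the pattern does not match and the string is unchanged.
replaced_diff = re.sub(pattern, "crifanli", different)

# Backreferences can also be used in the replacement string.
greeting = re.sub(pattern, r"bye \1", same)

print(replaced_same)  # crifanli
print(replaced_diff)  # hello crifan, nihao python
print(greeting)       # bye crifan
```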
re.subn
re.subn works the same way as re.sub, but returns a two-element tuple containing the new string and the number of substitutions made.
The code examples are as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
text = "JGood is a handsome boy, he is cool, clever, and so on..."
print(re.subn(r'\s+', '-', text))
The output is as follows:
$ python a.py
('JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on...', 11)
re.split
You can use re.split to split a string (text) on a given separator, or on anything that matches a regular expression. For example, re.split(r'\s+', text) splits the string on whitespace into a list of words. The return value is a list.
The code is as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
text = "JGood is a handsome boy, he is cool, clever, and so on..."
print(re.split(r'\s+', text))
The output is as follows:
$ python a.py
['JGood', 'is', 'a', 'handsome', 'boy,', 'he', 'is', 'cool,', 'clever,', 'and', 'so', 'on...']
Supplementary note:
re.split(pattern, string, maxsplit=0)
maxsplit is the maximum number of splits: maxsplit=1 splits once, and the default of 0 means the number of splits is not limited.
re.split splits a string by a regular expression. If the regular expression is enclosed in capturing parentheses, the matched separators are also returned in the list.
>>> re.split(r'\W+', 'Words, Words, Words.')
['Words', 'Words', 'Words', '']
>>> re.split(r'(\W+)', 'Words, Words, Words.')
['Words', ', ', 'Words', ', ', 'Words', '.', '']
>>> re.split(r'\W+', 'Words, Words, Words.', maxsplit=1)
['Words', 'Words, Words.']
re.findall
re.findall returns all substrings of the string (text) that match the regular expression. For example, re.findall(r'\w*oo\w*', text) returns all the words in the string that contain 'oo'. The return value is a list.
The code is as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
text = "JGood is a handsome boy, he is cool, clever, and so on..."
print(re.findall(r'\w*oo\w*', text))
The output is as follows:
$ python a.py
['JGood', 'cool']
re.finditer
re.finditer finds all the substrings that the regular expression matches and returns them as an iterator of match objects. The matches are yielded in order from left to right. If there is no match, the iterator is empty.
The code examples are as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
it = re.finditer(r"\d+", "12a32bc43jf3")
for match in it:
    print(match.group())
The output is as follows:
$ python a.py
12
32
43
3
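Because each item yielded by re.finditer is a match object rather than a plain string, the position of each match is also available. A minimal sketch on the same input (the no-match input "abc" is a hypothetical addition):

```python
import re

# Collect the matched text together with its (start, end) position.
spans = [(m.group(), m.span()) for m in re.finditer(r"\d+", "12a32bc43jf3")]
print(spans)
# [('12', (0, 2)), ('32', (3, 5)), ('43', (7, 9)), ('3', (11, 12))]

# With no match, the iterator simply yields nothing.
empty = list(re.finditer(r"\d+", "abc"))
print(empty)  # []
```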
re.compile
re.compile compiles a regular expression into a regular expression object. Compiling frequently used regular expressions into objects can improve efficiency somewhat. Here is an example of using a regular expression object:
The code examples are as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
text = "JGood is a handsome boy, he is cool, clever, and so on..."
regex = re.compile(r'\w*oo\w*')
print(regex.findall(text))  # find all words containing 'oo'
print(regex.sub(lambda m: '[' + m.group(0) + ']', text))  # wrap every word containing 'oo' in []
The output is as follows:
$ python a.py
['JGood', 'cool']
[JGood] is a handsome boy, he is [cool], clever, and so on...
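A compiled object is most useful when the same pattern is applied many times. The following sketch reuses one compiled pattern across several strings; the list of input lines is a hypothetical example:

```python
import re

word_with_oo = re.compile(r"\w*oo\w*")  # compiled once, reused below

lines = ["JGood is cool", "nothing here", "a good book"]  # hypothetical input
hits = [word_with_oo.findall(line) for line in lines]

print(hits)                  # [['JGood', 'cool'], [], ['good', 'book']]
print(word_with_oo.pattern)  # \w*oo\w*
```

The compiled object exposes the same operations as the module-level functions (match, search, sub, split, findall, finditer), without the pattern argument.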
Python re module learning: regular expression functions