Detailed introduction to the re module of Python Standard Library Learning

Last Update:2017-05-14 Source: Internet

Author: User

This article introduces the re module of the Python standard library.

The re module provides a set of powerful regular expression tools that let you quickly check whether a given string matches a given pattern from its start (the match function) or contains the pattern anywhere (the search function). Regular expressions are strings written in a compact (and cryptic) syntax.

1. Common methods

Method	Description
match(pattern, string, flags=0)	If the start of string matches the regular expression pattern, return the corresponding Match object; otherwise return None.
search(pattern, string, flags=0)	Scan string for the first position where pattern matches and return a Match object; otherwise return None.
sub(pattern, repl, string, count=0, flags=0)	Replace the parts of string that match pattern with repl, making at most count replacements (0 means replace all).
subn(pattern, repl, string, count=0, flags=0)	Like sub, but return a tuple of the new string and the number of replacements made.
split(pattern, string, maxsplit=0, flags=0)	Split string at the substrings matched by pattern.
findall(pattern, string, flags=0)	Return all non-overlapping matches of pattern in string as a list.
compile(pattern, flags=0)	Compile a regular expression pattern into a Pattern object, whose match and search methods can then be used.
purge()	Clear the regular expression cache.
escape(string)	Escape the special characters in string with backslashes. Older Python versions escape every character except ASCII letters and digits.
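
The compile and escape entries in the table are not demonstrated elsewhere in this article, so here is a minimal sketch of both. The sample strings and variable names are made up for illustration.

```python
import re

# Compile once, then reuse the Pattern object's search method.
digits = re.compile(r'\d+')
first_number = digits.search('order 66 shipped in 3 days').group()

# re.escape() makes a literal string safe to embed in a pattern.
literal = '1+1=2'
escaped = re.escape(literal)
found = re.search(escaped, 'we know that 1+1=2 holds') is not None

# Without escaping, '+' acts as a quantifier, so the pattern means
# "one or more '1' characters followed by '1=2'" and does not match here.
unescaped_found = re.search(literal, 'we know that 1+1=2 holds') is not None

print(first_number)
print(found)
print(unescaped_found)
```

Escaping matters whenever the text to search for comes from user input or from data that may contain characters such as '+', '.', or '('.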

2. Special match characters

Syntax	Description
.	Match any character except a newline
^	Match at the start of the string
$	Match at the end of the string
*	Match the preceding character zero or more times
+	Match the preceding character one or more times
?	Match the preceding character zero times or once
{m,n}	Match the preceding character m to n times
\	Escape a special character
[]	Define a character set
|	Alternation: match either the expression on the left or the one on the right
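
As a quick check of the table above, the following sketch pairs each special character with one sample string that it should match. The patterns and sample strings are invented for this illustration.

```python
import re

# Each pattern is paired with a sample string that it is expected to match.
samples = {
    r'a.c': 'abc',        # . matches any single character except a newline
    r'^he': 'hello',      # ^ anchors the match at the start
    r'lo$': 'hello',      # $ anchors the match at the end
    r'ab*c': 'ac',        # * allows zero repetitions of 'b'
    r'ab+c': 'abbc',      # + requires at least one 'b'
    r'colou?r': 'color',  # ? makes the 'u' optional
    r'a{2,3}': 'aaa',     # {m,n} bounds the repetition count
    r'1\+1': '1+1',       # \ turns '+' into a literal character
    r'[abc]x': 'bx',      # [] matches any one character from the set
    r'cat|dog': 'dog',    # | matches either alternative
}

results = {pattern: bool(re.search(pattern, text))
           for pattern, text in samples.items()}

for pattern, matched in results.items():
    print(pattern, matched)

# A counter-example: + needs at least one 'b', so 'ac' does not match.
print(re.search(r'ab+c', 'ac'))
```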

3. Module methods

re.match(pattern, string, flags=0)

Matches from the start of the string. If pattern matches, a Match object is returned (the Match object is described later); otherwise None is returned. flags selects the matching mode (described below) and controls how the regular expression is applied.

>>> import re
>>> a = 'abcdefg'
>>> print(re.match(r'abc', a))
<re.Match object; span=(0, 3), match='abc'>
>>> print(re.match(r'abc', a).group())
abc
>>> print(re.match(r'cde', a))  # match failed: 'cde' is not at the start
None

re.search(pattern, string, flags=0)

Scans the string for the first substring that matches pattern. If one is found, a Match object is returned; otherwise None is returned.

>>> import re
>>> a = 'abcdefg'
>>> print(re.search(r'bc', a))
<re.Match object; span=(1, 3), match='bc'>
>>> print(re.search(r'bc', a).group())
bc
>>> print(re.search(r'123', a))
None
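
The difference between match and search is easiest to see when both are given the same pattern. This is a minimal sketch that reuses the 'abcdefg' string from the examples above; the variable names are arbitrary.

```python
import re

text = 'abcdefg'

at_start = re.match(r'bc', text)     # None: 'bc' is not at the start
anywhere = re.search(r'bc', text)    # found at positions 1..3
anchored = re.search(r'^bc', text)   # None: '^' forces the start, like match

print(at_start)
print(anywhere.span())
print(anchored)
```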

re.sub(pattern, repl, string, count=0, flags=0)

Replaces the parts of string that match pattern with repl, making at most count replacements (any remaining matches are left untouched), and returns the resulting string.

>>> import re
>>> a = 'a1b2c3'
>>> print(re.sub(r'\d+', '0', a))  # replace each number with '0'
a0b0c0
>>> print(re.sub(r'\s+', '0', a))  # replace whitespace with '0' (there is none)
a1b2c3
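
The count parameter is not exercised in the example above. A small sketch of how it limits the number of replacements, using the same 'a1b2c3' string:

```python
import re

a = 'a1b2c3'

only_first = re.sub(r'\d', '0', a, count=1)  # replace the first digit only
first_two = re.sub(r'\d', '0', a, count=2)   # replace the first two digits
all_digits = re.sub(r'\d', '0', a)           # count=0 (default) replaces all

print(only_first)
print(first_two)
print(all_digits)
```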

re.subn(pattern, repl, string, count=0, flags=0)

Works like sub(), except that it returns a tuple containing the new string and the number of replacements made.

>>> import re
>>> a = 'a1b2c3'
>>> print(re.subn(r'\d+', '0', a))  # replace each number with '0'
('a0b0c0', 3)
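
One possible use of the returned count is to find out whether a substitution changed anything. The sketch below, with a made-up input string, collapses runs of whitespace and reports how many runs were replaced.

```python
import re

# Unpack the (new_string, number_of_replacements) tuple.
new_text, n = re.subn(r'\s+', ' ', 'too   many    spaces')

print(new_text)
print(n)
```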

re.split(pattern, string, maxsplit=0, flags=0)

split() splits string at each substring that matches pattern. If pattern contains capturing parentheses, the text matched by the groups is also included in the returned list. maxsplit is the maximum number of splits to perform; 0 means no limit.

>>> import re
>>> a = 'a1b1c'
>>> print(re.split(r'\d', a))
['a', 'b', 'c']
>>> print(re.split(r'(\d)', a))
['a', '1', 'b', '1', 'c']
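
To illustrate maxsplit, here is a short sketch with a made-up input string. After the given number of splits, the rest of the string is returned unsplit as the last element.

```python
import re

text = 'a1b2c3d'

limited = re.split(r'\d', text, maxsplit=2)         # stop after two splits
limited_group = re.split(r'(\d)', text, maxsplit=1)  # separators are kept too

print(limited)
print(limited_group)
```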

re.findall(pattern, string, flags=0)

Returns a list of non-overlapping substrings matching the pattern in the string.

>>> import re
>>> a = 'a1b2c3d4'
>>> print(re.findall(r'\d', a))
['1', '2', '3', '4']
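
"Non-overlapping" means that once a match has been consumed, scanning resumes after it. A minimal sketch with invented sample strings:

```python
import re

# 'aaaa' contains three overlapping occurrences of 'aa' (at 0, 1 and 2),
# but findall only reports the two that do not overlap.
pairs = re.findall(r'aa', 'aaaa')

# With a quantifier, each match grabs as many digits as it can.
numbers = re.findall(r'\d+', 'a1b22c333')

print(pairs)
print(numbers)
```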

4. Match object

When re.match() or re.search() succeeds, it returns a Match object, which holds a lot of information about the match. You can read this information through the attributes and methods that the Match object provides. For example:

>>> import re
>>> text = 'he has 2 books and 1 pen'
>>> ob = re.search(r'(\d+)', text)
>>> print(ob.string)  # the text used for matching
he has 2 books and 1 pen
>>> print(ob.re)  # the Pattern object used for matching
re.compile('(\\d+)')
>>> print(ob.group())  # the string captured by one or more groups (the whole match by default)
2
>>> print(ob.groups())  # the strings captured by all groups, as a tuple
('2',)
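
Beyond the attributes shown above, a Match object also reports where the match was found and gives access to individual groups. The sketch below uses the same sentence with a two-group pattern chosen for this illustration.

```python
import re

text = 'he has 2 books and 1 pen'
m = re.search(r'(\d+) (\w+)', text)

whole = m.group()      # the entire match
count = m.group(1)     # first group: the number
noun = m.group(2)      # second group: the following word
both = m.groups()      # all groups as a tuple
position = m.span()    # (start, end) indexes into text

print(whole, count, noun)
print(both)
print(position, text[m.start():m.end()])
```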

5. Pattern object

The Pattern object is returned by re.compile(). It has many methods with the same names as the module-level functions of re, and they behave similarly. For example:

>>> import re
>>> pa = re.compile(r'(\d+)')
>>> print(pa.split('he has 2 books and 1 pen'))
['he has ', '2', ' books and ', '1', ' pen']
>>> print(pa.findall('he has 2 books and 1 pen'))
['2', '1']
>>> print(pa.sub('much', 'he has 2 books and 1 pen'))
he has much books and much pen
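
A compiled Pattern object is convenient when the same expression is applied to many strings. A minimal sketch, assuming a hypothetical list of input lines:

```python
import re

number = re.compile(r'\d+')

# Hypothetical input; any iterable of strings would work.
lines = ['he has 2 books', 'no digits here', '10 pens and 3 cups']

# Reuse the same Pattern object for every line.
counts = [len(number.findall(line)) for line in lines]

print(counts)
print(number.pattern)  # the original pattern string
```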

6. Matching modes

Several matching modes can be enabled at once by combining them with the bitwise OR operator '|', for example re.I | re.M. Some common flags are listed below.

re.I (re.IGNORECASE): case-insensitive matching

>>> pa = re.compile('abc', re.I)
>>> pa.findall('AbCdEfG')
['AbC']

re.L (re.LOCALE): locale-dependent character set

This flag is meant to support locale-specific character sets. For example, in an English environment \w matches [a-zA-Z0-9_], that is, English letters, digits, and the underscore. In a French environment, some French characters would not match unless the L flag is added. However, this appears to be of no use in a Chinese environment, where Chinese characters still cannot be matched this way. Note that in Python 3 this flag can only be used with bytes patterns.

re.M (re.MULTILINE): multi-line mode, changes the behavior of '^' and '$' so that they also match at the start and end of each line

>>> pa = re.compile(r'^\d+')
>>> pa.findall('123 456\n789 012\n345 678')
['123']
>>> pa_m = re.compile(r'^\d+', re.M)
>>> pa_m.findall('123 456\n789 012\n345 678')
['123', '789', '345']
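
Flags can be combined with '|', as noted at the start of this section. The sketch below applies one pattern to a made-up three-line string with no flags, with each flag alone, and with re.I | re.M together.

```python
import re

text = 'Abc\nabc\nABC'
pattern = r'^abc$'

no_flags = re.findall(pattern, text)               # '^' and '$' see the whole string
ignore_case = re.findall(pattern, text, re.I)      # still not line-aware
multi_line = re.findall(pattern, text, re.M)       # only the lowercase line
combined = re.findall(pattern, text, re.I | re.M)  # every line, any case

print(no_flags)
print(ignore_case)
print(multi_line)
print(combined)
```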

re.S (re.DOTALL): dot-matches-all mode, changes the behavior of '.'

By default, '.' matches any character except the newline \n. With this flag, the dot matches any character, including the newline.
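
A short sketch of the effect, using a made-up two-line string:

```python
import re

text = 'line one\nline two'

# Without re.S, '.' stops at the newline.
default_span = re.search(r'.*', text).group()
default_cross = re.findall(r'one.line', text)

# With re.S, '.' also matches the newline.
dotall_span = re.search(r'.*', text, re.S).group()
dotall_cross = re.findall(r'one.line', text, re.S)

print(repr(default_span), default_cross)
print(repr(dotall_span), dotall_cross)
```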

re.U (re.UNICODE): interprets character classes such as \w according to the Unicode character set

re.X (re.VERBOSE): verbose mode

In this mode, a regular expression can span multiple lines, whitespace is ignored, and comments can be added. The following two regular expressions are equivalent:

>>> a = re.compile(r"""\d +  # the integral part
...                    \.    # the decimal point
...                    \d *  # some fractional digits""", re.X)
>>> b = re.compile(r"\d+\.\d*")

In this mode, to match a literal space you must write '\ ' (a backslash followed by a space).
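
The equivalence claimed above can be checked directly. This sketch compares the verbose and compact forms on a few invented sample strings and also shows the escaped-space rule; the names verbose, compact, spaced, and unescaped are chosen only for this example.

```python
import re

verbose = re.compile(r"""
    \d+    # the integral part
    \.     # the decimal point
    \d*    # some fractional digits
    """, re.X)
compact = re.compile(r"\d+\.\d*")

samples = ['3.14', '42.', 'abc', '7']
same = all(bool(verbose.fullmatch(s)) == bool(compact.fullmatch(s))
           for s in samples)
print(same)

# In verbose mode an unescaped space is ignored, so it must be escaped.
spaced = re.compile(r"hello\ world", re.X)
unescaped = re.compile(r"hello world", re.X)  # behaves like 'helloworld'

print(spaced.search('hello world') is not None)
print(unescaped.search('hello world') is None)
print(unescaped.search('helloworld') is not None)
```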

The above is a detailed introduction to the re module of the Python standard library.

This article is an English version of an article originally written in Chinese on aliyun.com and is provided for information purposes only.
