Introduction to the use of re module functions in Python

Source: Internet
Author: User

Python provides regular expressions through the re module. The module offers functions for finding, replacing, and splitting strings based on regular expressions. This article introduces the functions most commonly used in the re module.

Common functions of the re module

1. match(pattern, string, flags=0)

Tries to match pattern at the beginning of string. It returns a match object if the match succeeds and None otherwise. flags sets the matching options.

>>> import re
>>> s = "python:java:c"
>>> re.match(r"python", s)  # matches successfully
<_sre.SRE_Match object at 0x0000000005c5fcc8>
>>> s = "java:python:c"
>>> re.match(r"python", s)  # match failed, returns None
>>>

2. search(pattern, string, flags=0)

Scans string for pattern and returns a match object for the first location where it matches, or None if there is no match.

>>> import re
>>> s = "python:java:c"
>>> re.search(r"python", s)  # matches successfully
<_sre.SRE_Match object at 0x00000000060d7d98>
>>> s = "java:python:c"
>>> re.search(r"python", s)  # the match also succeeds
<_sre.SRE_Match object at 0x0000000005c5fcc8>
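
The difference between match() and search() can be sketched as a small script. This is only an illustration written for Python 3; the variable names are made up for the example.

```python
import re

text = "java:python:c"

# re.match only tries the pattern at the very start of the string.
at_start = re.match(r"python", text)

# re.search scans forward and returns the first match anywhere.
anywhere = re.search(r"python", text)

print(at_start)          # None
print(anywhere.group())  # python
print(anywhere.span())   # (5, 11)
```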

3. split(pattern, string, maxsplit=0)

Splits string at every match of pattern and returns the pieces as a list. maxsplit is the maximum number of splits to perform; 0 means no limit.

>>> import re
>>> s = 'python:java:c'
>>> re.split(r':', s)  # specify the delimiter
['python', 'java', 'c']
>>> re.split(r':', s, 1)  # specify the maximum number of splits
['python', 'java:c']
>>> s = "python:java:shell|c++|ruby"
>>> re.split(r'[:|]', s)  # specify multiple delimiters
['python', 'java', 'shell', 'c++', 'ruby']
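
As a rough sketch of how maxsplit interacts with several delimiters (Python 3, illustrative variable names), the remainder of the string stays unsplit in the last list item once the limit is reached:

```python
import re

record = "python:java:shell|c++|ruby"

# No limit: split on every ':' or '|'.
all_fields = re.split(r"[:|]", record)

# maxsplit=2: stop after two splits; the rest stays in the last item.
first_two = re.split(r"[:|]", record, maxsplit=2)

print(all_fields)  # ['python', 'java', 'shell', 'c++', 'ruby']
print(first_two)   # ['python', 'java', 'shell|c++|ruby']
```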

4. compile(pattern, flags=0)

Compiles the regular expression pattern and returns a pattern object.

>>> import re
>>> regex = r'python'
>>> s = 'python:java:c'
>>> p = re.compile(regex)
>>> p.match(s)
<_sre.SRE_Match object at 0x00000000060d7d98>

Note: besides match(), the pattern object also has the methods search(), findall(), and finditer().
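
A minimal sketch of reusing one compiled pattern with several of these methods is shown here (Python 3). The names are illustrative, and the second argument to search() is the optional start position that pattern object methods accept.

```python
import re

word = re.compile(r"[a-z]+")  # compile once, reuse below
text = "python:java:c"

first = word.match(text)         # tried only at the start of the string
later = word.search(text, 7)     # start scanning at index 7, after "python:"
everything = word.findall(text)  # all non-overlapping matches

print(first.group())  # python
print(later.group())  # java
print(everything)     # ['python', 'java', 'c']
```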

5. sub(pattern, repl, string, count=0)

Replaces the substrings of string that match a regular expression. pattern is the regular expression, repl is the replacement string, and string is the source string. If count is 0, every match in string is replaced; if count > 0, at most the first count matches are replaced. The result is returned as a new string.

>>> import re
>>> s = 'python:java:c'
>>> re.sub(r'p.*n', 'ruby', s)
'ruby:java:c'
>>> print(s)  # the original string is not changed
python:java:c
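
The effect of count is easier to see on a string with several matches. The following is a small sketch with made-up sample text (Python 3):

```python
import re

text = "a-b-c-d"

replace_all = re.sub(r"-", "+", text)           # count=0: replace every match
replace_two = re.sub(r"-", "+", text, count=2)  # at most two replacements

print(replace_all)  # a+b+c+d
print(replace_two)  # a+b+c-d
print(text)         # a-b-c-d (sub returns a new string)
```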

6. subn(pattern, repl, string, count=0)

Works like sub(), but returns a two-element tuple. The first element is the resulting string, and the second element is the number of substitutions made.

>>> import re
>>> s = 'python:java:c'
>>> re.subn(r'p.*:', 'ruby:', s)  # also returns the number of replacements
('ruby:c', 1)
>>> re.subn(r'p.*?:', 'ruby:', s)  # note the ? in the pattern: the result is different
('ruby:java:c', 1)
>>>

Note: whether the pattern contains the question mark, as in 'p.*?:', makes a difference. Without the ?, the match is greedy and takes as many characters as possible; with the ?, it takes as few as possible.
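
To make the greedy and non-greedy behaviour visible, the sketch below prints exactly what each pattern matches, and also shows subn() reporting more than one replacement (Python 3, illustrative names):

```python
import re

text = "python:java:c"

greedy = re.search(r"p.*:", text).group()  # .* takes as much as it can
lazy = re.search(r"p.*?:", text).group()   # .*? takes as little as it can

# subn reports how many replacements were made.
result, count = re.subn(r"[a-z]+", "x", text)

print(greedy)         # python:java:
print(lazy)           # python:
print(result, count)  # x:x:x 3
```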

7. findall(pattern, string, flags=0)

Finds all non-overlapping matches of pattern in string. It returns a list of the matched strings, or an empty list if nothing matches. When pattern contains one group, the list holds the text of that group; when it contains several groups, the list holds tuples with one item per group.

>>> import re
>>> regex = r"\w+"  # \w matches any word character, including the underscore
>>> s = "python:java:c"
>>> p = re.compile(regex)
>>> p.findall(s)
['python', 'java', 'c']
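
How groups change the shape of the findall() result can be sketched as follows. The sample text is invented for the illustration (Python 3):

```python
import re

text = "python=3 java=17 c=11"

no_group = re.findall(r"\w+=\d+", text)        # whole matches
one_group = re.findall(r"\w+=(\d+)", text)     # text of the single group
two_groups = re.findall(r"(\w+)=(\d+)", text)  # one tuple per match

print(no_group)    # ['python=3', 'java=17', 'c=11']
print(one_group)   # ['3', '17', '11']
print(two_groups)  # [('python', '3'), ('java', '17'), ('c', '11')]
```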

Having covered the main functions of the re module, the rest of this article looks more closely at the flags parameter and the re.compile() function.

1. The flags parameter

The prototypes of the re module functions show that almost all of them take a flags parameter, which sets additional matching options, for example whether to ignore case or whether to match across multiple lines. The common options are as follows:

I or IGNORECASE   ignores case
L or LOCALE       character set localization, used in multi-locale environments
M or MULTILINE    multi-line matching
S or DOTALL       makes . match all characters, including \n
X or VERBOSE      ignores whitespace and line breaks in the regular expression, which makes it easy to add comments
U or UNICODE      \w, \W, \b, \B, \d, \D, \s, and \S all use Unicode

The following case-insensitive example shows the usage:

>>> import re
>>> s = 'python:java:c'
>>> re.match(r'Python', s)  # match failed
>>> re.match(r'Python', s, re.I)  # with re.I, the match succeeds
<_sre.SRE_Match object at 0x00000000060d7d98>
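
The other options can be tried in the same way. The sketch below is an illustration for Python 3 with invented sample text; it exercises re.M, re.S, and re.X, and combines two flags with the | operator.

```python
import re

text = "Python\njava\nC"

# re.M: ^ and $ also match at the start and end of each line.
line_starts = re.findall(r"^\w", text, re.M)

# re.S: . also matches the newline character.
without_dotall = re.search(r"Python.java", text)
with_dotall = re.search(r"Python.java", text, re.S)

# re.X: whitespace and comments inside the pattern are ignored.
# Flags are combined with |.
verbose = re.compile(
    r"""
    ^python   # keyword at the start of the string
    \n        # followed by a line break
    """,
    re.X | re.I,
)

print(line_starts)                         # ['P', 'j', 'C']
print(without_dotall)                      # None
print(with_dotall is not None)             # True
print(repr(verbose.match(text).group()))   # 'Python\n'
```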

2. The re.compile() function

Parsing a regular expression takes time, so calling a function such as findall() many times with the same pattern string can be inefficient. If the same rule is used to match strings repeatedly, precompile it with compile(), which returns a pattern object. The object has a set of methods for finding, replacing, and splitting strings, and reusing it speeds up matching. The attributes and methods of the pattern object are as follows:

pattern                            # the regular expression currently in use
match(string[, pos[, endpos]])     # same as re.match()
search(string[, pos[, endpos]])    # same as re.search()
findall(string[, pos[, endpos]])   # finds all results that match the pattern; returns a list of the matches
finditer(string[, pos[, endpos]])  # returns an iterator over the match objects
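
As a brief sketch of finditer() and the pattern attribute (Python 3, illustrative names), each item produced by the iterator is a match object, so the position of every match is available as well as its text:

```python
import re

word = re.compile(r"\w+")
text = "python:java:c"

# finditer yields match objects one at a time.
found = [(m.group(), m.start(), m.end()) for m in word.finditer(text)]

print(found)         # [('python', 0, 6), ('java', 7, 11), ('c', 12, 13)]
print(word.pattern)  # \w+
```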

In addition, compile() is usually used together with match(), search(), and group() to handle regular expressions that contain groups. Groups are numbered from left to right by their opening parenthesis: the first opening parenthesis marks group 1, and so on. There is also group 0, which holds the match for the entire regular expression. match() and search() return a match object, which provides methods and attributes for working with the result. The main methods of the match object are as follows:

group(index=0)  # the match result of one group; by default, the whole regular expression
groups()        # the match results of all groups, returned as a tuple

For example, match an ID number and extract the year, month, and day of birth from it.

>>> regex = r'[1-9][0-9]{5}(\d{4})(\d{2})(\d{2})[0-9]{3}[0-9X]'
>>> s = '11010019950807532X'
>>> p = re.compile(regex)
>>> m = p.match(s)
>>> m.groups()
('1995', '08', '07')
>>> m.group(1)
'1995'
>>> m.group(2)
'08'
>>> m.group(3)
'07'
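
The ID example can be wrapped in a small helper. This is a sketch under stated assumptions: birth_date and ID_PATTERN are hypothetical names introduced here, and the pattern only checks the shape of the number, not whether the date or the final check character is valid.

```python
import re

# Pattern from the ID example; the three groups capture year, month, and day.
ID_PATTERN = re.compile(r"[1-9][0-9]{5}(\d{4})(\d{2})(\d{2})[0-9]{3}[0-9X]")


def birth_date(id_number):
    """Return (year, month, day) as integers, or None if the ID does not match."""
    m = ID_PATTERN.match(id_number)
    if m is None:
        return None
    year, month, day = m.groups()
    return int(year), int(month), int(day)


# Group 0 holds the text matched by the whole regular expression.
whole = ID_PATTERN.match("11010019950807532X").group(0)

print(birth_date("11010019950807532X"))  # (1995, 8, 7)
print(birth_date("not an id"))           # None
print(whole)                             # 11010019950807532X
```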

Typical examples

1. Searching for any of several keywords

>>> import re
>>> regex = r'python|java|c'
>>> str1 = 'hello java'
>>> str2 = 'python developer'
>>> p = re.compile(regex)
>>> p.findall(str1)
['java']
>>> p.findall(str2)
['python']
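
When the keywords come from a list rather than a hand-written pattern, the alternation can be built in code. The following is a minimal sketch (Python 3); the keyword list is hypothetical, and re.escape() is used so that characters such as + are matched literally.

```python
import re

keywords = ["python", "java", "c++"]  # hypothetical keyword list

# re.escape keeps characters such as + from being read as regex operators.
pattern = re.compile("|".join(re.escape(k) for k in keywords), re.I)

print(pattern.findall("Hello Java"))          # ['Java']
print(pattern.findall("Python and C++ dev"))  # ['Python', 'C++']
print(pattern.findall("no match here"))       # []
```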
