Python re (Regular Expression) Module

Source: Internet
Author: User

re Module

Python's re module matches strings against regular expressions. The functions it provides can be found in the /instils/python/lib/python2.7/re.py file. The following interfaces are the most commonly used:

    def match(pattern, string, flags=0):
        """Try to apply the pattern at the start of the string, returning
        a match object, or None if no match was found."""
        return _compile(pattern, flags).match(string)

re.match tries to match the pattern at the start of the string. The first parameter is the regular expression, the second is the string to match against, and the third is the flags, which default to 0. If the match succeeds, a match object is returned; otherwise None is returned.
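As a minimal sketch of this behaviour (the sample strings and variable names are made up for illustration, and the print() call form is used so the snippet also runs on Python 3):

```python
import re

# The pattern must fit at the very start of the string.
m_start = re.match(r"\d+", "2024 release")
m_none = re.match(r"\d+", "release 2024")   # digits are not at the start

# The third argument carries flags such as re.IGNORECASE.
m_flag = re.match(r"hello", "HELLO world", re.IGNORECASE)

print(m_start.group(0))   # 2024
print(m_none)             # None
print(m_flag.group(0))    # HELLO
```

Because a failed match returns None, the result is normally checked before calling group().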

    def search(pattern, string, flags=0):
        """Scan through string looking for a match to the pattern, returning
        a match object, or None if no match was found."""
        return _compile(pattern, flags).search(string)

re.search scans the string for the pattern and stops at the first match. If nothing is found, it returns None. Its parameters are the same as those of re.match. The difference is that match only matches at the start of the string, while search looks for a match anywhere in the string.
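The difference can be seen side by side in a short sketch (the sample text is hypothetical):

```python
import re

text = "tt=123 tm=(abc)"   # illustrative sample string

at_start = re.match(r"tm=", text)    # "tm=" is not at index 0
anywhere = re.search(r"tm=", text)   # found further into the string
first_only = re.search(r"\d", "a1b2")  # search stops at the first hit

print(at_start)               # None
print(anywhere.start())       # 7
print(first_only.group(0))    # 1
```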

    def findall(pattern, string, flags=0):
        """Return a list of all non-overlapping matches in the string.

        If one or more groups are present in the pattern, return a
        list of groups; this will be a list of tuples if the pattern
        has more than one group.

        Empty matches are included in the result."""
        return _compile(pattern, flags).findall(string)

re.findall returns all non-overlapping matches in the string as a list.
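The shape of the returned list depends on how many groups the pattern has, as the docstring describes. A small sketch with a made-up sample string:

```python
import re

sample = "k1=v1 k2=v2"   # illustrative input

no_group = re.findall(r"\w+=\w+", sample)        # whole matches
one_group = re.findall(r"(\w+)=\w+", sample)     # contents of the single group
two_groups = re.findall(r"(\w+)=(\w+)", sample)  # one tuple per match

print(no_group)     # ['k1=v1', 'k2=v2']
print(one_group)    # ['k1', 'k2']
print(two_groups)   # [('k1', 'v1'), ('k2', 'v2')]
```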

    def compile(pattern, flags=0):
        "Compile a regular expression pattern, returning a pattern object."
        return _compile(pattern, flags)

re.compile compiles a regular expression into a pattern object. Reusing the compiled object avoids recompiling the pattern, which improves matching efficiency when the same expression is used many times.
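A sketch of the typical usage, compiling once and reusing the pattern object in a loop (the names number and lines are illustrative only):

```python
import re

number = re.compile(r"\d+")   # compiled once, reused below

lines = ["a1", "b22", "none"]
found = [number.search(line) for line in lines]
values = [m.group(0) for m in found if m]

print(values)                        # ['1', '22']
print(number.findall("1 22 333"))    # ['1', '22', '333']
```

The pattern object exposes the same operations as the module-level functions (match, search, findall), without the pattern argument.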

The search() and match() functions described above return a match object. The following describes the attributes and methods of the match object.

Match object

Attributes:

    string: The text that was passed to search() or match() for matching.
    re: The pattern object that was used for search() or match().
    pos: The index in the text at which the regular expression started searching. For the match() and search() functions this is 0.
    endpos: The index in the text at which the regular expression stopped searching. For the match() and search() functions this is len(string).
    lastindex: The index of the last captured group. If no group was captured, the value is None.
    lastgroup: The name of the last captured group. If that group has no name or no group was captured, the value is None.
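These attributes can be inspected directly. The following is a sketch with a hypothetical pattern and sample text; it also assumes the optional pos argument of the compiled pattern's search() method to show a non-zero pos:

```python
import re

pattern = re.compile(r"(?P<key>\w+)=(\w+)")
text = "tt=123 tm=abc"   # illustrative sample

m = pattern.search(text)
print(m.string)      # tt=123 tm=abc
print(m.re is pattern)
print(m.pos)         # 0
print(m.endpos)      # 13
print(m.lastindex)   # 2
print(m.lastgroup)   # None, because group 2 has no name

# Pattern methods accept a start position, which shows up in pos.
later = pattern.search(text, 7)
print(later.pos)     # 7

# When the last captured group is named, lastgroup holds that name.
named_last = re.search(r"(\w+)=(?P<val>\w+)", text)
print(named_last.lastgroup)   # val
```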

Methods:

    group([group1, ...]):
    Returns the substring captured by one or more groups. If several arguments are given, the substrings are returned as a tuple. Each argument can be a group number or a group name; number 0 stands for the entire matched substring. With no arguments, group(0) is returned. A group that captured nothing returns None, and a group that captured several times returns the last substring it captured. A moderately complicated example:

    m = re.match(r"(?P<int>\d+)\.(\d*)", '3.14')

    After this match is executed, m.group(0) is '3.14', m.group(1) is '3', and m.group(2) is '14'.
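The remaining cases of group() described above (several arguments, a group name, a group that captured nothing, a group that captured repeatedly) can be sketched as follows; the pattern and sample strings are illustrative only:

```python
import re

m = re.match(r"(?P<first>\w+) (\w+)(,)?", "Isaac Newton")

print(m.group())          # Isaac Newton  (same as group(0))
print(m.group(1, 2))      # ('Isaac', 'Newton')
print(m.group("first"))   # Isaac
print(m.group(3))         # None, the optional comma group captured nothing

# A group inside a repetition keeps only its last capture.
repeated = re.match(r"(\d)+", "123")
print(repeated.group(1))  # 3
```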

    groups([default]):
    Returns a tuple containing the substrings captured by all groups. It is equivalent to calling group(1, 2, ..., last). default is the value substituted for groups that captured nothing; it defaults to None.

    groupdict([default]):
    Returns a dictionary whose keys are the names of the named groups and whose values are the substrings they captured. Groups without a name are not included. default has the same meaning as in groups().

    start([group]):
    Returns the index in the string at which the substring captured by the given group starts (the index of its first character). group defaults to 0.

    end([group]):
    Returns the index in the string at which the substring captured by the given group ends (the index of its last character + 1). group defaults to 0.

    span([group]):
    Returns (start(group), end(group)).

    expand(template):
    Substitutes the captured groups into template and returns the result. The template can refer to groups as \number, \g<number>, or \g<name>.
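The methods above can be tried together in one sketch. The pattern, the group names whole and frac, and the sample strings are all made up for illustration:

```python
import re

pattern = re.compile(r"(?P<whole>\d+)(?:\.(?P<frac>\d+))?")

# The optional "frac" group captures nothing here.
m_int = pattern.match("42")
print(m_int.groups())         # ('42', None)
print(m_int.groups("0"))      # ('42', '0')
print(m_int.groupdict("0"))   # whole -> '42', frac -> '0'
print(m_int.start("frac"))    # -1 for a group that captured nothing

m = pattern.search("pi=3.14")
print(m.span())               # (3, 7)
print(m.start("whole"))       # 3
print(m.end("whole"))         # 4
print(m.span("frac"))         # (5, 7)
print(m.expand(r"\g<whole> point \g<frac>"))   # 3 point 14
```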

Of the attributes and methods above, group() is the most commonly used.

Example

The following code exercises some of the interfaces described above:

    import re

    # Print every name that appears inside literal parentheses.
    for name in re.findall(r"name=\((.*?)\)", "phone=124567 name=(john) phone=2345678 name=(tom)"):
        print "got match name %s" % name

Note that parentheses () delimit a group. Because the text being matched also contains literal parentheses and we want the data inside them, those parentheses must be escaped as \( and \). The *? quantifier is the non-greedy form of *, so each match consumes as little text as possible.
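To see why the non-greedy form matters, the two quantifiers can be compared on the same kind of input (a sketch; the variable names are illustrative):

```python
import re

data = "name=(john) phone=2345678 name=(tom)"

# Greedy: .* runs to the last closing parenthesis in the string.
greedy = re.findall(r"name=\((.*)\)", data)

# Non-greedy: .*? stops at the first closing parenthesis it can.
lazy = re.findall(r"name=\((.*?)\)", data)

print(greedy)   # ['john) phone=2345678 name=(tom']
print(lazy)     # ['john', 'tom']
```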

    k = re.search(r"tm=\((.*?)\)", "tt=123 tm=(abc) vm=test tm=(CBA)")
    if k:
        # Whole match, group 1, search bounds, input string, and pattern text.
        print "match tm is %s, %s, %d, %d, %s, %s" % (k.group(0), k.group(1), k.pos, k.endpos, k.string, k.re.pattern)

Output:

    match tm is tm=(abc), abc, 0, 32, tt=123 tm=(abc) vm=test tm=(CBA), tm=\((.*?)\)

    text = "JGood is abc handsome boy he is cool, clever, and so on..."

    # Give each group a custom name
    m = re.search(r"\s(?P<sign1>\w+)\s(?P<sign2>\w+)\s", text)
    if m:
        print m.group(0), '\t', m.group(1), '\t', m.group(2)
        # One %s per value: six values are printed here.
        print "groups %s, %s, %s, %s, %s, %s" % (m.groups(), m.lastindex, m.lastgroup,
            m.groupdict().keys(), m.groupdict().values(), m.string[m.start(2):m.end(2)])
        print "string span is %s, %s" % m.span(2)  # span returns (start(group), end(group))

The (?P<name>...) syntax gives a group a custom name, which is why m.groupdict() contains the named groups shown in the output.

Output:

     is abc      is      abc
    groups ('is', 'abc'), 2, sign2, ['sign1', 'sign2'], ['is', 'abc'], abc
    string span is 9, 12

Reference links:

http://docs.python.org/release/2.2.3/lib/match-objects.html

http://www.cnblogs.com/huxi/archive/2010/07/04/1771073.html

http://www.cnblogs.com/sevenyuan/archive/2010/12/06/1898075.html
