Regular Expression Study Notes (3)

Source: Internet
Author: User

Python re module: core functions and methods

1. Use the compile() function to compile regular expressions

After importing the re module, compile the regular expression with compile(), for example pattern = re.compile('regular expression', re.S). The returned pattern object can then be used for matching.

compile() also accepts the module's flag attributes as an optional second argument, such as re.S, re.I, re.L, re.M, and re.X. Multiple flags are combined with the bitwise OR operator (|).
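As a minimal sketch of the compile-then-reuse workflow described above, the following compiles one pattern with two flags and applies it to several strings. The pattern and the sample strings are made up for illustration.

```python
import re

# Compile once, then reuse the pattern object for several calls.
# re.I ignores case; re.S lets '.' also match a newline.
# Flags are combined with the bitwise OR operator.
pattern = re.compile(r'hello.world', re.I | re.S)

m1 = pattern.match('Hello world')     # differs from the pattern only in case
m2 = pattern.match('HELLO\nWORLD')    # '.' matches the newline because of re.S
m3 = pattern.match('goodbye world')   # no match at the start of the string

print(m1.group())
print(m2 is not None)
print(m3)
```

Compiling is mainly a convenience when the same pattern is used many times; the module-level functions such as re.match() accept the same flags as an optional argument.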

2. Match objects and the group() and groups() methods

A match object has two main methods: group() and groups(). The object returned by a successful call to match() or search() is a match object. group() returns either the entire match or a specific sub-group, as requested.

groups() returns a tuple containing all the sub-groups (a one-element tuple if there is only one). If the pattern has no sub-groups, group() still returns the entire match, while groups() returns an empty tuple.
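The difference between the two methods is easiest to see side by side. This short sketch, using an illustrative 'abc-123' string, compares a pattern without sub-groups to one with two sub-groups.

```python
import re

# Pattern without sub-groups: group() is the whole match, groups() is empty.
m_plain = re.match(r'\w+', 'abc-123')
whole_plain = m_plain.group()
subs_plain = m_plain.groups()

# Pattern with two sub-groups: group(n) picks one, groups() returns them all.
m_grouped = re.match(r'(\w+)-(\d+)', 'abc-123')
whole_grouped = m_grouped.group()
first = m_grouped.group(1)
subs_grouped = m_grouped.groups()

print(whole_plain, subs_plain)
print(whole_grouped, first, subs_grouped)
```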

3. Use the match() method to match strings

The match() function tries to match the pattern at the start of the string. If the match succeeds, a match object is returned; if it fails, None is returned. The group() method of the match object returns the text that was matched.

>>> re.match('foo', 'food on the table').group()
'foo'
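Because match() returns None on failure, calling group() directly on the result raises AttributeError when nothing matched. One way to guard against that is sketched below; the helper name match_or_default is hypothetical and not part of the re module.

```python
import re

def match_or_default(pattern, text, default=None):
    """Return the text matched at the start of `text`, or `default`.

    Hypothetical helper for illustration; it only wraps re.match().
    """
    m = re.match(pattern, text)
    # m is None when the pattern does not match at the start of the string.
    return m.group() if m is not None else default

hit = match_or_default('foo', 'food on the table')
miss = match_or_default('foo', 'seafood')
miss_with_default = match_or_default('foo', 'seafood', default='')

print(repr(hit), repr(miss), repr(miss_with_default))
```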
4. Use search() to search for a pattern in a string

search() works like match(), except that it looks for the first occurrence of the given pattern at any position in its string argument, not only at the start.

If a match is found, a match object is returned. Otherwise, None is returned.

In other words, search() can find a match in the middle of a string, while match() cannot.

>>> m = re.match('foo', 'seafood')     # match fails; m is None
>>> m = re.search('foo', 'seafood')    # use search() instead
>>> if m is not None: m.group()
...
'foo'

search() succeeds where match() failed: it finds foo inside seafood.
5. Repetition, special characters, and grouping

Take a regular expression that matches an email address as an example: \w+@\w+\.com. This expression can only match simple addresses.

To support an optional host name in front of the domain name, such as www.xxx.com, use the ? operator: \w+@(\w+\.)?\w+\.com. This makes the group (\w+\.) optional.

>>> pattern = r'\w+@(\w+\.)?\w+\.com'
>>> re.match(pattern, 'nobody@xxx.com').group()
'nobody@xxx.com'
>>> re.match(pattern, 'nobody@www.xxx.com').group()
'nobody@www.xxx.com'

The example can be extended further to allow any number of intermediate subdomains. Replace the ? with *: \w+@(\w+\.)*\w+\.com

>>> patt = r'\w+@(\w+\.)*\w+\.com'
>>> re.match(patt, 'nobody@www.xxx.yyy.zzz.com').group()
'nobody@www.xxx.yyy.zzz.com'

Use parentheses to match and save sub-groups for later processing.

>>> m = re.match(r'(\w\w\w)-(\d\d\d)', 'abc-123')
>>> m.group()       # complete match
'abc-123'
>>> m.group(1)      # sub-group 1
'abc'
>>> m.group(2)      # sub-group 2
'123'
>>> m.groups()      # all sub-groups
('abc', '123')

group() is usually used to return the entire match, but it can also return an individual sub-group. Use the groups() method to get a tuple containing all matched sub-groups.

6. Search and replace using sub() and subn()

There are two functions/methods for search and replace: sub() and subn(). They are almost identical: both replace every occurrence of the regular expression in a string with a replacement.

The replacement is usually a string, but it may also be a function that returns the replacement string.

The difference between subn() and sub() is that subn() also returns the total number of replacements made: the new string and the count are returned together as a two-element tuple.

>>> re.sub('X', 'Mr. Smith', 'attn: X\n\nDear X,\n')
'attn: Mr. Smith\n\nDear Mr. Smith,\n'
>>> re.subn('X', 'Mr. Smith', 'attn: X\n\nDear X,\n')
('attn: Mr. Smith\n\nDear Mr. Smith,\n', 2)
>>> print(re.sub('X', 'Mr. Smith', 'attn: X\n\nDear X,\n'))
attn: Mr. Smith

Dear Mr. Smith,

>>> re.sub('[ae]', 'X', 'abcdef')
'XbcdXf'
>>> re.subn('[ae]', 'X', 'abcdef')
('XbcdXf', 2)
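The notes mention that the replacement may be a function, which the session above does not show. Here is a minimal sketch: the function receives each match object and returns the replacement string. The function name double and the sample sentence are assumptions made for illustration.

```python
import re

def double(match):
    """Replacement function: receives a match object, returns a string."""
    return str(int(match.group()) * 2)

text = 'buy 3 apples and 10 pears'

# Each run of digits is passed to double() and replaced by its return value.
result = re.sub(r'\d+', double, text)

# subn() does the same and also reports how many replacements were made.
result_n, count = re.subn(r'\d+', double, text)

print(result)
print(result_n, count)
```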
7. Extended symbols

With the (?iLmsux) family of options, you can specify one or more flags directly inside the regular expression instead of passing them to compile() or other re module functions.

The following examples use re.I/IGNORECASE. The last one combines it with re.M/MULTILINE to match at the start of each line:

>>> re.findall(r'(?i)yes', 'yes? Yes. YES!!')    # (?i) ignores case
['yes', 'Yes', 'YES']
>>> re.findall(r'(?i)th\w+', 'The quickest way is through this tunnel.')
['The', 'through', 'this']
>>> re.findall(r'(?im)(^th[\w ]+)', """
... This line is the first,
... another line,
... that line, it's the best
... """)
['This line is the first', 'that line']
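To make the equivalence concrete, this sketch runs the same case-insensitive search three ways: with an inline (?i) option, with a flag argument, and with a compiled pattern. The variable names are illustrative only.

```python
import re

text = 'yes? Yes. YES!!'

# Inline option at the start of the pattern.
inline = re.findall(r'(?i)yes', text)

# The same flag passed as an argument to the module-level function.
with_flag = re.findall(r'yes', text, re.I)

# The same flag passed to compile(); re.IGNORECASE is the long name of re.I.
compiled = re.compile(r'yes', re.IGNORECASE).findall(text)

print(inline)
print(inline == with_flag == compiled)
```

The inline form is handy when the pattern is supplied as plain text, for example from a configuration file, and there is no place to pass a flag argument.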

 
