[Python 3 Series] Regular expressions

Source: Internet
Author: User

Regular expressions, or regex for short, are a way of describing text patterns. For example, \d is a regular expression that stands for a digit character, that is, any single digit from 0 to 9.


Steps for use

All of Python's regular expression functions are in the re module.


The steps for using regular expressions in Python are as follows:

① Import the regular expression module with import re.

② Create a Regex object with the re.compile() function.

③ Pass the string you want to search to the search() method of the Regex object. It returns a Match object.

④ Call the group() method of the Match object, which returns the text that actually matched.
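The four steps above can be sketched as follows. This is only a minimal illustration; the name phone_regex and the sample string are chosen for demonstration.

```python
import re  # step 1: import the regular expression module

# Step 2: compile the pattern into a Regex object.
phone_regex = re.compile(r'\d\d\d-\d\d\d\d\d\d\d\d')

# Step 3: search a string; the result is a Match object, or None if nothing matches.
mo = phone_regex.search('My number is 021-68000000')

# Step 4: group() returns the text that actually matched.
matched_text = mo.group() if mo else None
print(matched_text)
```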


Character classification

Character class    Meaning

\d    Any digit from 0 to 9

\D    Any character that is not a digit from 0 to 9

\w    Any letter, digit, or underscore (a "word" character)

\W    Any character that is not a letter, digit, or underscore

\s    Any space, tab, or newline character (a "whitespace" character)

\S    Any character that is not a space, tab, or newline
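As a rough illustration of the six classes in the table, the sketch below applies each one to a single sample string with re.findall(). The sample text and variable names are made up for this example.

```python
import re

# A sample containing letters, an underscore, digits, a space, and a tab.
sample = 'Room_42 is\topen'

digits = re.findall(r'\d', sample)        # digit characters
non_digits = re.findall(r'\D', sample)    # everything that is not a digit
word_chars = re.findall(r'\w', sample)    # letters, digits, and the underscore
non_word = re.findall(r'\W', sample)      # here: the space and the tab
spaces = re.findall(r'\s', sample)        # whitespace characters
non_spaces = re.findall(r'\S', sample)    # everything that is not whitespace

print(digits)
print(''.join(word_chars))
```

Each uppercase class is the complement of its lowercase counterpart, so for any string the two result lists together account for every character.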


Regular expression symbols

?    Matches zero or one of the preceding group

*    Matches zero or more of the preceding group

+    Matches one or more of the preceding group

|    Matches one of several alternative expressions

()    Parentheses create a "group"

{n}    Matches exactly n of the preceding group

{n,}    Matches n or more of the preceding group

{,m}    Matches 0 to m of the preceding group

{n,m}    Matches at least n and at most m of the preceding group

{n,m}? or *? or +?    Performs a non-greedy match of the preceding group

^spam    The string must begin with spam

spam$    The string must end with spam

.    Matches any character except a newline

\d, \w, and \s    Match a digit, a word character, and a whitespace character, respectively

\D, \W, and \S    Match any character that is not a digit, a word character, or a whitespace character, respectively

[ABC]    Matches any one of the characters inside the square brackets (A, B, or C)

[^ABC]    Matches any character that is not inside the square brackets
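Several of the symbols listed above can be tried out with a small helper. This is an illustrative sketch: first_match() is a hypothetical convenience function written for this example, not part of the re module, and the sample strings are invented.

```python
import re

def first_match(pattern, text):
    """Return the first matched text, or None when there is no match."""
    mo = re.search(pattern, text)
    return mo.group() if mo else None

optional = first_match(r'colou?r', 'color')       # ? : zero or one 'u'
star = first_match(r'ab*', 'a')                   # * : zero or more 'b'
plus = first_match(r'ab+', 'a')                   # + : needs at least one 'b'
either = first_match(r'cat|dog', 'hot dog')       # | : one of the alternatives
exact = first_match(r'(ha){3}', 'hahahaha')       # {n} : exactly three repeats
ranged = first_match(r'\d{2,4}', '1234567')       # {n,m} : greedy by default
lazy = first_match(r'\d{2,4}?', '1234567')        # {n,m}? : non-greedy
starts = first_match(r'^spam', 'eggs and spam')   # ^ : must be at the start
ends = first_match(r'spam$', 'eggs and spam')     # $ : must be at the end
in_class = re.findall(r'[ABC]', 'A CAB DRIVER')   # [ABC] : any one of A, B, C
not_in_class = re.findall(r'[^ABC]', 'ABCD')      # [^ABC] : anything else

print(optional, star, plus, either, exact, ranged, lazy)
print(starts, ends, in_class, not_in_class)
```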


Regular Expression methods

1. compile()

Pass a string value representing the regular expression to re.compile(); it returns a Regex pattern object.

To ignore whitespace and comments inside the regular expression string, pass re.VERBOSE as the second argument.

To make matching case-insensitive, pass re.IGNORECASE or re.I.

To make the period character also match a newline, pass re.DOTALL.

The re.compile() function takes only a single value as its second argument. To use several of these options at once, combine them with the pipe character (|), for example re.IGNORECASE | re.DOTALL.

>>> import re
>>> phonenum = re.compile(r'\d\d\d-\d\d\d\d\d\d\d\d')
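The options described above can be combined in one call. The following is a small sketch with a made-up pattern and input, showing re.VERBOSE, re.IGNORECASE, and re.DOTALL joined with the pipe character.

```python
import re

# re.VERBOSE lets the pattern span several lines and carry comments;
# re.IGNORECASE and re.DOTALL are combined with it using |.
pattern = re.compile(r'''
    hello   # the literal word, matched case-insensitively
    .*      # any characters, including newlines because of re.DOTALL
    world   # the closing word
''', re.VERBOSE | re.IGNORECASE | re.DOTALL)

text = 'HELLO\nbig\nWORLD'
mo = pattern.search(text)
print(repr(mo.group()))

# Without re.DOTALL the period stops at the first newline, so nothing matches.
no_dotall = re.compile(r'hello.*world', re.IGNORECASE).search(text)
print(no_dotall)
```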


2. group()

The Match object has a group() method that returns the text actually matched in the searched string.

Adding parentheses creates a "group" in the regular expression. The first pair of parentheses in the regular expression string is group 1, and the second pair is group 2. Passing the integer 1 or 2 to the Match object's group() method returns the corresponding part of the matched text. Passing 0 or no argument to group() returns the entire matched text. To get all the groups at once, use the groups() method.

>>> import re
>>> phonenum = re.compile(r'(\d\d\d)-(\d\d\d\d\d\d\d\d)')
>>> mo = phonenum.search('My number is 021-68000000')
>>> print(mo.group(0))
021-68000000
>>> print(mo.group(1))
021
>>> print(mo.group(2))
68000000
>>> print(mo.groups())
('021', '68000000')


3. search()

The search() method of the Regex object searches the string it is passed for the first place where the regular expression matches. If the pattern is not found in the string, search() returns None. If the pattern is found, search() returns a Match object.

>>> import re
>>> phonenum = re.compile(r'\d\d\d-\d\d\d\d\d\d\d\d')
>>> mo = phonenum.search('My number is 021-68000000')
>>> print(mo.group())
021-68000000
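Because search() returns None when nothing matches, calling group() on the result without a check raises an AttributeError. One way to guard against that is sketched below; find_phone() is a hypothetical helper written for this example.

```python
import re

phone_regex = re.compile(r'\d\d\d-\d\d\d\d\d\d\d\d')

def find_phone(text):
    """Return the first phone number found in text, or None if there is none."""
    mo = phone_regex.search(text)
    if mo is None:
        return None
    return mo.group()

found = find_phone('My number is 021-68000000')
missing = find_phone('No number here')
first_only = find_phone('021-68000000 or 010-12345678')

print(found, missing, first_only)
```

The last call shows that search() stops at the first match even when the string contains more than one.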


4. findall()

search() returns a Match object for the first match in the searched string, whereas findall() returns a list containing every match in the searched string.


There are two points to note about the return value of findall():

① If it is called on a regular expression with no groups, such as \d\d\d-\d\d\d-\d\d\d\d, it returns a list of matching strings, such as ['123-456-7890', '000-000-0000'].

② If it is called on a regular expression with groups, such as (\d\d\d)-(\d\d\d)-(\d\d\d\d), it returns a list of tuples of strings, such as [('123', '456', '7890'), ('000', '000', '0000')].

>>> import re
>>> phonenum = re.compile(r'(\d\d\d)')
>>> phonenum.search('68000000')
<_sre.SRE_Match object; span=(0, 3), match='680'>
>>> phonenum.findall('68000000')
['680', '000']
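The two cases can be compared side by side. The sketch below uses the two patterns mentioned in the notes above; the sample text is invented so that it contains both example numbers.

```python
import re

text = 'Cell: 123-456-7890 Work: 000-000-0000'

# Same pattern, written without and with groups.
no_groups = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
with_groups = re.compile(r'(\d\d\d)-(\d\d\d)-(\d\d\d\d)')

strings = no_groups.findall(text)    # a list of whole matches
tuples = with_groups.findall(text)   # a list of tuples, one string per group

print(strings)
print(tuples)
```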


5. sub()

The sub() method of a Regex object takes two arguments. The first is the replacement string that is substituted for each match. The second is the string to search, in which every match of the regular expression is replaced. The sub() method returns the string after the replacements have been made.

>>> import re
>>> phonenum = re.compile(r'021-6800')
>>> phonenum.sub('8800', 'My number is 021-68000000.')
'My number is 88000000.'


Greedy and non-greedy matching

Python's regular expressions are "greedy" by default, which means that in ambiguous situations they match the longest string possible. The "non-greedy" version of the curly braces, written with a question mark after the closing curly brace, matches the shortest string possible.

A question mark can have two meanings in a regular expression: declaring a non-greedy match or marking an optional group. These two meanings are completely unrelated.

>>> import re
>>> phonenum01 = re.compile(r'(\d\d\d){1,3}')
>>> phonenum02 = re.compile(r'(\d\d\d){1,3}?')
>>> mo01 = phonenum01.search('68000000')
>>> mo02 = phonenum02.search('68000000')
>>> mo01.group()
'680000'
>>> mo02.group()
'680'
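To make the two meanings of the question mark concrete, here is a short sketch with patterns invented for this illustration: in the first the question mark follows a group and makes it optional, and in the second it follows a quantifier and makes it non-greedy.

```python
import re

# Meaning 1: ? after a group makes that group optional.
optional_area = re.compile(r'(\d\d\d-)?\d\d\d\d\d\d\d\d')
with_area = optional_area.search('Call 021-68000000').group()
without_area = optional_area.search('Call 68000000').group()

# Meaning 2: ? after a quantifier makes that quantifier non-greedy.
greedy = re.search(r'\d{3,5}', '68000000').group()
non_greedy = re.search(r'\d{3,5}?', '68000000').group()

print(with_area, without_area)
print(greedy, non_greedy)
```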


This article is from the "garbled Age" blog, please be sure to keep this source http://juispan.blog.51cto.com/943137/1949567
