Python3 Regular Expressions (2)

Previous Article: Explanation of Python3 regular expression (1)

https://docs.python.org/3.4/howto/regex.html

The blogger has added some comments and modifications to the original text ^_^

Annotation: The matching engine of the re module is written in C, so matching is fast. Compiling a regular expression (with compile()) improves efficiency further, because the compiled pattern can be reused. Later we will often mention the "pattern", which refers to the pattern object produced by compiling a regular expression.

Compiling regular expressions

Regular expressions are compiled into pattern objects. A pattern object has methods for various operations on strings, such as searching for pattern matches or performing string substitutions.

re.compile() also accepts an optional flags argument, which is used to enable various special features and syntax variations. We will introduce them one by one later.

Now let's look at a simple example:
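A minimal sketch of such an example, assuming the pattern 'ab*' and the variable names p and p_ci (both chosen here for illustration). It compiles a pattern once, and then compiles it again with the re.IGNORECASE flag to show what the flags argument changes:

```python
import re

# Compile a pattern once; the result is a reusable pattern object.
p = re.compile('ab*')
print(p)                      # shows the repr of the compiled pattern object

# The optional flags argument changes how the pattern is interpreted;
# re.IGNORECASE makes matching case-insensitive.
p_ci = re.compile('ab*', re.IGNORECASE)
print(p_ci.match('ABBB'))     # a match object, thanks to the flag
print(p.match('ABBB'))        # None: without the flag, 'A' does not match 'a'
```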

The regular expression is passed to re.compile() as a string. Because regular expressions are not part of the core Python language, there is no special syntax for writing them, so they can only be expressed as strings. Some applications do not need regular expressions at all, so the Python community saw no need to build them into the core of the language. Instead, the re module is simply a C extension module included with Python, just like the socket and zlib modules.

Representing regular expressions as strings keeps the Python language simple, but it also has a drawback, which we discuss below.

Troublesome backslash

As mentioned in the previous article, regular expressions use the backslash character ('\') to give some ordinary characters a special meaning (for example, \d matches any decimal digit), and also to strip special characters of their special meaning (for example, \[ matches a literal left square bracket '['). This conflicts with Python's use of the same character for the same purpose in string literals.

Annotation: the following example should make this clear ~

Suppose you need a regular expression that matches the string '\section' in a LaTeX file. Because the backslash is a special character in regular expressions, you must put another backslash in front of it to strip its special meaning. The regular expression therefore has to be written as '\\section'.

But do not forget that Python also uses the backslash to express special meaning in string literals. So to pass '\\section' intact to re.compile(), each of the two backslashes must be doubled once more, which gives the string literal '\\\\section'.

Characters         Stage
\section           The string to be matched
\\section          The regular expression uses '\\' to match the character '\'
"\\\\section"      Unfortunately, the Python string literal also uses '\\' to represent the character '\'

In short, to match a literal backslash, you have to write four backslashes in an ordinary Python string literal. Heavy use of backslashes in regular expressions therefore causes a backslash storm that makes your strings extremely difficult to read.
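The doubling can be checked directly. The following is a minimal sketch; the subject string "see \section{Intro}" and the variable names are made up for illustration:

```python
import re

# Without raw strings, each backslash of the regular expression must be
# doubled again for the Python string literal: four backslashes in source.
pattern_text = "\\\\section"
print(len(pattern_text))      # 9: two backslashes plus the 7 letters of 'section'
print(pattern_text)           # \\section  (what the regex engine receives)

p = re.compile(pattern_text)
m = p.search("see \\section{Intro}")   # the subject holds ONE backslash
print(m.group())              # \section
```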

The solution is to write regular expressions as Python raw strings (that is, prefix the string literal with r, remember?). In a raw string, backslashes are not treated specially:

Regular string     Raw string
"ab*"              r"ab*"
"\\\\section"      r"\\section"
"\\w+\\s+\\1"      r"\w+\s+\1"

Annotation: we strongly recommend that you write regular expressions as raw strings.
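As a quick check, each pair in the table denotes exactly the same string; only the spelling in the source code differs. A minimal sketch, reusing an illustrative LaTeX-like subject string that is not from the original article:

```python
import re

# Each regular/raw pair is the same string object value.
assert "ab*" == r"ab*"
assert "\\\\section" == r"\\section"
assert "\\w+\\s+\\1" == r"\w+\s+\1"

# The raw-string form needs only the backslash doubling required by the
# regular expression itself.
p = re.compile(r"\\section")
m = p.search(r"see \section{Intro}")
print(m.group())   # \section
print(m.span())    # (4, 12)
```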

Performing matches

Once you have compiled a regular expression, you get a pattern object. What can you do with it? Pattern objects have many methods and attributes. The most important ones are listed below:

Method        Function
match()       Determines whether the regular expression matches at the beginning of the string.
search()      Scans through the string and finds the first location where the regular expression matches.
findall()     Scans through the string, finds all substrings matched by the regular expression, and returns them as a list.
finditer()    Scans through the string, finds all substrings matched by the regular expression, and returns them as an iterator.

If no match is found, match() and search() return None. If the match succeeds, a match object is returned, containing information about the match: where it starts, where it ends, the matched substring, and so on.

Next we will explain it step by step:

Now you can try matching various strings against the regular expression [a-z]+.

For example:
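A minimal sketch, assuming the pattern is compiled into a variable named p (the name is illustrative) and matched against the empty string:

```python
import re

p = re.compile('[a-z]+')

# Try to match the empty string from its start.
result = p.match("")
print(result)   # None
```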

Because + means "one or more repetitions", the empty string cannot match, so match() returns None.

Let's try another string, one that does match:
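A minimal sketch; the subject string 'tempo' is an assumption chosen for illustration, and any string that starts with lowercase letters would behave the same way:

```python
import re

p = re.compile('[a-z]+')

# 'tempo' consists only of lowercase letters, so the pattern matches
# from the start and match() returns a match object instead of None.
m = p.match('tempo')
print(m is not None)   # True
print(m)               # the repr format of match objects varies by Python version
```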

In this example, match() returns a match object, which is stored in the variable m for later use.

Next, let's take a look at the information in the match object. Match objects also have many methods and attributes. The following are the most important:

Method     Function
group()    Returns the string matched by the regular expression.
start()    Returns the starting position of the match.
end()      Returns the ending position of the match.
span()     Returns a tuple (start, end) giving the position of the match.

You can see:
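A minimal sketch of these four methods, again assuming the illustrative subject string 'tempo':

```python
import re

p = re.compile('[a-z]+')
m = p.match('tempo')

print(m.group())            # tempo
print(m.start(), m.end())   # 0 5
print(m.span())             # (0, 5)
```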

Because match() only checks whether the regular expression matches at the start of the string, start() always returns 0.

However, with the search() method the result may be different:
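A minimal sketch contrasting the two methods; the subject string '::: message' is an assumption chosen so that the match does not begin at position 0:

```python
import re

p = re.compile('[a-z]+')
text = '::: message'

# match() only looks at the start of the string, where ':' does not match.
print(p.match(text))   # None

# search() scans forward until it finds the first match.
m = p.search(text)
print(m.group())       # message
print(m.span())        # (4, 11)
```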

In practice, the most common approach is to store the match object in a variable and then check whether it is None.

The format is usually as follows:

p = re.compile(...)
m = p.match('String goes here')
if m:
    print('Match found:', m.group())
else:
    print('No match')

Two methods return all of the matches for a pattern: findall() and finditer().

findall() returns a list of the matched strings:
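A minimal sketch, assuming a digit pattern and a sample sentence chosen for illustration:

```python
import re

p = re.compile(r'\d+')
text = '12 drummers drumming, 11 pipers piping, 10 lords a-leaping'

# findall() collects every non-overlapping match into a list of strings.
numbers = p.findall(text)
print(numbers)   # ['12', '11', '10']
```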

findall() has to build the entire list before it can return, while finditer() returns the match objects as an iterator:
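A minimal sketch using the same illustrative digit pattern and sentence. The spans are collected into a list here only so they can be inspected afterwards:

```python
import re

p = re.compile(r'\d+')
text = '12 drummers drumming, 11 pipers piping, 10 lords a-leaping'

# finditer() yields match objects one at a time instead of building a list.
spans = []
for match in p.finditer(text):
    print(match.group(), match.span())
    spans.append(match.span())

print(spans)   # [(0, 2), (22, 24), (40, 42)]
```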

Annotation: if the list of matches would be large, returning an iterator is much more efficient, because the matches are produced one at a time instead of all being stored at once.

(This article is complete)

Next article: Python3 Regular Expression (3)

If you like this article, please use the "Comments" below to encourage me. ^_^
