A Detailed Look at Greedy and Non-Greedy Matching in Python Regular Expressions

Source: Internet
Author: User
This article introduces greedy and non-greedy matching in Python regular expressions and walks through sample code in detail. It should be a useful reference for anyone working with regular expressions.

In this article I summarize greedy and non-greedy matching in regular expressions, a topic that was touched on only briefly in the earlier introductions to Python regex basics and capture groups.


By default, regular expressions match greedily. "Greedy" means that when several matches of different lengths are possible, the engine picks the longest one. For example, the following regular expression is meant to extract what each person said, but because of greedy matching it captures too much:

>>> import re
>>> sentence = """You said "why?" and I say "I don't know"."""
>>> re.findall(r'"(.*)"', sentence)
['why?" and I say "I don\'t know']

The following examples further illustrate the greedy behavior of quantifiers:

>>> re.findall('hi*', 'hiiiii')
['hiiiii']
>>> re.findall('hi{2,}', 'hiiiii')
['hiiiii']
>>> re.findall('hi{1,3}', 'hiiiii')
['hiii']


When we want a regular expression to match non-greedily, we have to say so explicitly in the syntax:

{2,5}? matches 2 to 5 repetitions, but prefers as few as possible. The same ? suffix works on the other quantifiers: *?, +? and ??.

The question mark may be a little confusing here, because it already has a meaning of its own: the preceding item occurs 0 or 1 times. Just remember that when a question mark directly follows a quantifier (the part of the pattern that specifies a variable number of repetitions), it means non-greedy matching.

Taking the earlier examples again, non-greedy matching gives the following results:
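The two roles of the question mark can be seen side by side in a short sketch. The sample strings below are illustrative and do not come from the original article.

```python
import re

# '?' after a literal: the preceding item is optional (0 or 1 times).
optional = re.findall(r'colou?r', 'color colour')

# '?' after a quantifier: the quantifier becomes non-greedy.
greedy = re.findall(r'\d+', '2024')
lazy = re.findall(r'\d+?', '2024')

# Both roles together: the first '?' means "optional", the second makes
# that optional match lazy, so zero 'u' characters are tried first.
lazy_optional = re.search(r'colou??', 'colour').group(0)

print(optional)       # ['color', 'colour']
print(greedy)         # ['2024']
print(lazy)           # ['2', '0', '2', '4']
print(lazy_optional)  # colo
```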

>>> re.findall('hi*?', 'hiiiii')
['h']
>>> re.findall('hi{2,}?', 'hiiiii')
['hii']
>>> re.findall('hi{1,3}?', 'hiiiii')
['hi']
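Note that "non-greedy" does not mean "always the minimum". A lazy quantifier takes as few repetitions as it can while still letting the rest of the pattern match. A minimal sketch, using an illustrative string with a trailing '!' that is not in the original examples:

```python
import re

text = 'hiiiii!'

# With nothing after it, the lazy quantifier settles for zero repetitions.
alone = re.search(r'hi*?', text).group(0)

# With a required '!' after it, it has to expand until the '!' can match.
followed = re.search(r'hi*?!', text).group(0)

print(alone)     # h
print(followed)  # hiiiii!
```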

Likewise, with a non-greedy match the quotation example produces the intended result:

>>> sentence = """You said "why?" and I say "I don't know"."""
>>> re.findall(r'"(.*?)"', sentence)
['why?', "I don't know"]
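For delimited text like this, a negated character class is a common alternative to a lazy quantifier. This is a different technique from the one the article uses, shown here only for comparison; the sample sentence is an assumption chosen to contain two quoted parts.

```python
import re

# Illustrative sample text with two quoted parts.
sentence = 'You said "why?" and I say "I don\'t know".'

# Lazy quantifier: stop at the first closing quote.
lazy = re.findall(r'"(.*?)"', sentence)

# Negated class: a quoted part can never contain a quote in the first place.
negated = re.findall(r'"([^"]*)"', sentence)

print(lazy)
print(negated)
```

Both calls return the same list here. One difference to keep in mind is that `[^"]` also matches newlines, whereas `.` does not unless `re.DOTALL` is set.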

Lookahead assertions and non-greedy matching

Strictly speaking, this part is not about non-greedy matching. But because lookahead assertions behave in a similar way, they are grouped here to make them easier to remember.

(?=abc) positive lookahead: succeeds only if abc follows at this position, but does not consume any characters

(?!abc) negative lookahead: succeeds only if abc does not follow at this position, and likewise consumes nothing

During matching, a regular expression "consumes" characters: once a character has been consumed by a match, later matches will not look at it again.

What is this good for? Again, an example helps. Suppose we want to find the words that occur more than once in a string:
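Character consumption is easy to observe with `re.findall`, which returns non-overlapping matches. The sketch below uses an illustrative string of four 'a' characters and compares a consuming pattern with a lookahead that consumes nothing.

```python
import re

text = 'aaaa'

# Each match consumes two characters, so the scan resumes after them.
consuming = re.findall(r'aa', text)

# The lookahead consumes nothing; the group inside it still captures.
# Every match is empty, so the scan advances one character at a time.
overlapping = re.findall(r'(?=(aa))', text)

print(consuming)    # ['aa', 'aa']
print(overlapping)  # ['aa', 'aa', 'aa']
```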

>>> sentence = "Oh what a day, what a lovely day!"
>>> re.findall(r'\b(\w+)\b.*\b\1\b', sentence)
['what']

This regular expression clearly does not do the job. Why? Once the first (\w+) has matched "what" and \1 has matched the second "what", everything up to and including the second "what" has been consumed, so the search resumes right after it. As a result, only one repeated word is found.

The solution uses the (?=abc) syntax introduced above, which tests what follows without consuming any characters. The correct pattern is therefore:

>>> re.findall(r'\b(\w+)\b(?=.*\b\1\b)', sentence)
['what', 'a', 'day']
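The lookahead pattern can be wrapped in a small helper. The function name `repeated_words` is hypothetical, and the de-duplication step is an addition: a word that occurs three times would otherwise be reported twice, once for each occurrence that has a later copy.

```python
import re

# Same pattern as above: a whole word that appears again later in the text.
REPEATED = re.compile(r'\b(\w+)\b(?=.*\b\1\b)')

def repeated_words(text):
    """Return words that occur more than once, in order of first appearance."""
    hits = REPEATED.findall(text)
    # dict.fromkeys drops duplicates while keeping the original order.
    return list(dict.fromkeys(hits))

print(repeated_words('Oh what a day, what a lovely day!'))
# ['what', 'a', 'day']
print(repeated_words('a b a c a'))
# ['a']
```

The comparison is case-sensitive, and because `.` does not match a newline by default, repeats are only found within a single line.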

If we need to match a word that contains at least two different letters, we can use the (?!abc) syntax:

>>> re.search(r'([a-z]).*(?!\1)[a-z]', 'aa', re.IGNORECASE)
>>> re.search(r'([a-z]).*(?!\1)[a-z]', 'ab', re.IGNORECASE)
<_sre.SRE_Match object; span=(0, 2), match='ab'>
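The same negative-lookahead pattern can be turned into a yes/no check. The helper name `has_two_distinct_letters` is hypothetical, and using `search` rather than `match` is an assumption made for this sketch.

```python
import re

# Capture a letter, then require some later letter that differs from it.
TWO_DISTINCT = re.compile(r'([a-z]).*(?!\1)[a-z]', re.IGNORECASE)

def has_two_distinct_letters(word):
    """Return True if word contains at least two different letters."""
    return TWO_DISTINCT.search(word) is not None

for word in ['aa', 'ab', 'aA', 'aab']:
    print(word, has_two_distinct_letters(word))
```

With `re.IGNORECASE` the backreference is also compared case-insensitively, so 'aA' counts as a single letter.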

