A detailed explanation of generators and the yield statement in Python


Before starting the course, I asked the students to fill out a questionnaire about their understanding of some Python concepts. Some topics ("if/else control flow" or "defining and using functions") are not a problem for most students. But there are a few topics that most students have had little or no contact with, especially "generators and the yield keyword". I suspect that's true for most novice Python programmers as well.

It turns out that even after a lot of effort, some people still don't understand generators and the yield keyword. I want to fix that. In this article, I'll explain what the yield keyword is, why it's useful, and how to use it.

Note: In recent years, generators have become more and more powerful as features have been added through PEPs. In my next article, I'll introduce the true power of yield by going through coroutines, cooperative multitasking, and asynchronous I/O (especially their use in the "tulip" prototype implementation that GvR has been working on). But before that, we need a solid understanding of generators and yield.

Coroutines and subroutines

When we call an ordinary Python function, execution starts at the first line of the function and continues until a return statement, an exception, or the end of the function (which can be thought of as an implicit return None). Once the function returns control to the caller, that's it. All the work done in the function and the data saved in local variables is lost. When you call the function again, everything is created from scratch.

This is the standard behavior of functions in computer programming (they are often called subroutines). Such a function can only return once, but sometimes it is helpful to create a function that produces a sequence of values. To do this, the function needs to be able to "save its work".

I said "produce a sequence of values" because our hypothetical function does not return in the usual sense. return implies that the function is handing control of execution back to the place where it was called. "yield", however, implies that the transfer of control is temporary and voluntary, and that our function expects to regain control in the future.


In Python, a "function" with this ability is called a generator, and it is very useful. Generators (and the yield statement) were first introduced to let programmers write code that produces a sequence of values more simply. Before that, to implement something like a random number generator, you would need to write a class or a module that both generated values and kept track of the state between calls. With generators, this becomes very simple.
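To make the contrast concrete, here is a minimal sketch of the class-based approach next to a generator that does the same job. The names Countdown and countdown are made up for this illustration and do not come from the article; the yield keyword used in the second version is explained in detail further down.

```python
class Countdown:
    """Class-based iterator: the state lives in an instance attribute."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Generator version: the local variable survives between values."""
    current = start
    while current > 0:
        yield current
        current -= 1


print(list(Countdown(3)))  # [3, 2, 1]
print(list(countdown(3)))  # [3, 2, 1]
```

Both produce the same values; the generator simply lets Python do the bookkeeping that the class has to do by hand.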

To better understand the problems that the generator solves, let's look at an example. In the process of understanding this example, always remember the problem we need to solve: generate a sequence of values.

Note: Outside of Python, all but the simplest generators would be called coroutines. I will use that term later in this article. Keep in mind that, as far as Python is concerned, everything described here as a coroutine is still a generator. The official Python term is generator; coroutine is used here for discussion only and, in the sense used in this article, has no formal definition at the language level.

Example: Interesting prime numbers

Suppose your boss asks you to write a function that takes a list of ints and returns some iterable containing the elements that are prime numbers.

Remember, an iterable is just an object that can return its members one at a time.
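As a quick illustration of "one member at a time", the sketch below steps through a list by hand using the built-in iter() and next() functions. The variable names are arbitrary.

```python
numbers = [3, 5, 7]        # a list is an iterable
iterator = iter(numbers)   # ask the iterable for an iterator

print(next(iterator))  # 3
print(next(iterator))  # 5
print(next(iterator))  # 7
```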

You probably think "that's easy" and quickly write the following code:

def get_primes(input_list):
    result_list = list()
    for element in input_list:
        if is_prime(element):
            result_list.append(element)

    return result_list

# or better yet...

def get_primes(input_list):
    return (element for element in input_list if is_prime(element))

# Below is an implementation of is_prime...

import math

def is_prime(number):
    if number > 1:
        if number == 2:
            return True
        if number % 2 == 0:
            return False
        # only odd divisors up to the square root need to be checked
        for current in range(3, int(math.sqrt(number) + 1), 2):
            if number % current == 0:
                return False
        return True
    return False

Either implementation of get_primes above satisfies the requirements, so we tell the boss it's done. She reports back that our function works and is exactly what she wanted.

Handling Infinite Sequences

Well, not quite. A few days later, the boss comes back with a small problem: she wants to use our get_primes function on a very large list of numbers. In fact, the list is so large that merely creating it would use up all of the system's memory. To get around this, she wants to be able to call get_primes with a start argument and get back all the primes larger than start (maybe she's solving Project Euler problem 10).

Once we look at this new requirement, it's obvious that a simple modification of get_primes won't do. Clearly, we can't return a list of all the primes from start to infinity (although operating on infinite sequences has many useful applications). The chances of solving this problem with a normal function look slim.


Before we give up, let's identify the core obstacle that prevents us from writing a function that satisfies our boss's new requirements. Thinking it over, we reach this conclusion: a function gets only one chance to return results, and so it must return all of them at once. Stating something so obvious seems pointless; "functions just work that way," we think. The real value lies in asking, "but what if they didn't?"

Imagine what we could do if get_primes could simply return the next value instead of returning all the values at once. We would no longer need to create a list. Without a list, there is no memory problem. Since the boss told us she only needs to iterate over the results, she would never know the difference in our implementation.

Unfortunately, this does not seem possible. Even if we had a magical function that let us iterate from n to infinity, we would be stuck after returning the first value:

def get_primes(start):
    for element in magical_infinite_range(start):
        if is_prime(element):
            return element

Suppose get_primes is called like this:

def solve_number_10():
    # She *is* working on Project Euler #10, I knew it!
    total = 2
    for next_prime in get_primes(3):
        if next_prime < 2000000:
            total += next_prime
        else:
            print(total)
            return

Clearly, in get_primes we immediately hit the case where the input is 3 and return on line 4 of the function. Instead of return, we need a way to produce a value and, when asked for the next one, pick up where we left off.

But functions can't do that. When a function returns, it is done for good. We can call the function again, but we have no way of saying, "OK, this time start from line 4, where we left off last time, instead of the usual first line." A function has a single entry point: the first line of its code.
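The single entry point is easy to observe. The sketch below is only an illustration: itertools.count stands in for the hypothetical magical_infinite_range, and first_prime_from and the compact is_prime are helpers assumed for this example rather than code from the article.

```python
import itertools


def is_prime(number):
    """Trial division; good enough for a small demonstration."""
    if number < 2:
        return False
    return all(number % divisor for divisor in range(2, int(number ** 0.5) + 1))


def first_prime_from(start):
    # itertools.count(start) stands in for the hypothetical infinite range
    for element in itertools.count(start):
        if is_prime(element):
            return element  # the function ends here; its locals are discarded


print(first_prime_from(3))  # 3
print(first_prime_from(3))  # 3 again: every call starts from the first line
```

No matter how often we call it, the function never gets past the first prime, because nothing remembers where the previous call stopped.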

Enter the generator

This kind of problem is so common that Python added a new construct to solve it: the generator. A generator "generates" values. Creating generators was made as simple as possible through the concept of a generator function.

A generator function is defined just like a normal function, except that whenever it needs to generate a value, it uses the yield keyword instead of return. If the body of a def contains yield, the function automatically becomes a generator function (even if it also contains a return). Nothing else is needed to create one.
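One way to see that the presence of yield alone changes what a def produces is to ask the standard inspect module. This is a small sketch; not_a_generator and a_generator are made-up names.

```python
import inspect


def not_a_generator():
    return 1


def a_generator():
    print("body started")
    yield 1


print(inspect.isgeneratorfunction(not_a_generator))  # False
print(inspect.isgeneratorfunction(a_generator))      # True

gen = a_generator()              # nothing is printed: the body has not run yet
print(inspect.isgenerator(gen))  # True
print(next(gen))                 # prints "body started", then 1
```

Note that calling a_generator() does not execute any of its body; it only creates the generator object.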

A generator function returns a generator iterator. This may be the last time you see the term "generator iterator", because they are usually just called "generators". Note that a generator is a special kind of iterator. As an iterator, a generator must define a few methods, one of which is __next__(). As with any iterator, we can use the next() function to get the next value.

This point bears repeating: to get the next value from a generator, we use the next() function, just as we would with any iterator.

(next() takes care of calling the generator's __next__() method.) Since a generator is an iterator, it can also be used in a for loop.
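The relationship between next() and __next__() can be checked directly. In this sketch, letters is a made-up generator function used only for illustration.

```python
def letters():
    yield "a"
    yield "b"


gen = letters()

print(iter(gen) is gen)  # True: a generator is its own iterator
print(next(gen))         # a
print(gen.__next__())    # b, the method that next() calls for us
```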

Whenever next() is called on a generator, the generator is responsible for passing a value back to the caller. It does so by using yield together with the value to pass back (for example, yield 7). The easiest way to remember what yield does is to think of it as a special return (plus a little magic) for generator functions.

yield is just return (plus a little magic) for generator functions.

The following is a simple generator function:

>>> def simple_generator_function():
...     yield 1
...     yield 2
...     yield 3

Here are two simple ways to use it:

>>> for value in simple_generator_function():
...     print(value)
1
2
3
>>> our_generator = simple_generator_function()
>>> next(our_generator)
1
>>> next(our_generator)
2
>>> next(our_generator)
3

Magic?

So where is the magic? I'm glad you asked! When a generator function calls yield, the "state" of the generator function is frozen: the values of all variables are saved, and the next line of code to be executed is recorded, until next() is called again. Once it is, the generator function simply resumes where it left off. If next() is never called again, the state saved at the yield is eventually discarded.
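The freezing and resuming can be made visible by recording what happens on each side of a yield. This is a small sketch; counter and the events list are names invented for the illustration.

```python
events = []


def counter():
    count = 0
    while count < 2:
        events.append("about to yield {}".format(count))
        yield count
        # execution resumes here on the following next() call
        events.append("resumed after yielding {}".format(count))
        count += 1


gen = counter()
events.append("calling next")
first = next(gen)
events.append("caller got {}".format(first))
second = next(gen)
events.append("caller got {}".format(second))

for event in events:
    print(event)
```

The printed order shows control bouncing between the caller and the generator, with count keeping its value across the pause.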

Let's rewrite get_primes, this time as a generator function. Note that we no longer need the magical_infinite_range function. Using a simple while loop, we can create our own infinite sequence.

def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1

If a generator function calls return or reaches the end of its definition, a StopIteration exception is raised. This tells whoever called next() that the generator has no more values (this is normal iterator behavior). It is also why the while True: loop appears in get_primes. Without the while, the second call to next() would run the generator function to its end and raise StopIteration. Once a generator is exhausted, calling next() on it results in an error, so each generator can only be consumed once. The following code shows the problem:


>>> our_generator = simple_generator_function()
>>> for value in our_generator:
...     print(value)
1
2
3

>>> # our_generator has been exhausted...
>>> print(next(our_generator))
Traceback (most recent call last):
  File "<ipython-input-13-7e48a609051a>", line 1, in <module>
    next(our_generator)
StopIteration

>>> # however, we can always create a new generator
>>> # by simply calling the generator function again

>>> new_generator = simple_generator_function()
>>> print(next(new_generator))  # works fine
1

So the while loop is there to make sure we never reach the end of the generator function. It lets us generate a value for as long as next() is called on the generator. This is a common idiom for dealing with infinite sequences (and generators of this kind are common too).
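Because such a generator never ends on its own, the caller decides when to stop. One convenient way, shown here as a sketch, is itertools.islice from the standard library; the is_prime below is a slightly simplified variant of the one given earlier so that the block runs on its own.

```python
import itertools
import math


def is_prime(number):
    if number < 2:
        return False
    if number == 2:
        return True
    if number % 2 == 0:
        return False
    for current in range(3, int(math.sqrt(number)) + 1, 2):
        if number % current == 0:
            return False
    return True


def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1


# islice stops after five values; list(get_primes(3)) would never finish
first_five = list(itertools.islice(get_primes(3), 5))
print(first_five)  # [3, 5, 7, 11, 13]
```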

Execution process

Let's go back to the code that calls get_primes: solve_number_10.

def solve_number_10():
    # She *is* working on Project Euler #10, I knew it!
    total = 2
    for next_prime in get_primes(3):
        if next_prime < 2000000:
            total += next_prime
        else:
            print(total)
            return

It helps to walk through how the first few elements are created when the for loop in solve_number_10 calls get_primes. When the for loop requests the first value from get_primes, we enter get_primes just as we would a normal function.

    • We enter the while loop on line 2
    • The if condition holds (3 is prime)
    • We yield the value 3 and hand control back to solve_number_10

Then, back in solve_number_10:

    • The for loop receives the yielded value 3
    • The for loop assigns it to next_prime
    • next_prime is added to total
    • The for loop requests the next value from get_primes

This time, when we enter get_primes, we do not start from the beginning; we resume at line 5, where we left off.

def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1  # <<<<<<<<<<

Crucially, number still has the value it had when we called yield (that is, 3). Remember, yield both passes a value to whoever called next() and saves the "state" of the generator function. Next, number is incremented to 4, we go back to the top of the while loop, and number keeps increasing until we hit the next prime (5). Once again, we yield the value of number to the for loop in solve_number_10. This cycle continues until the for loop stops (at the first prime greater than 2,000,000).
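To check this flow end to end, here is a runnable sketch of the same loop. It rests on a few assumptions made for the illustration: sum_primes_below is a made-up name, the limit is a parameter so that a small case can be verified by hand, the total is returned rather than printed, and is_prime is a compact trial-division stand-in for the version shown earlier.

```python
def is_prime(number):
    if number < 2:
        return False
    return all(number % divisor for divisor in range(2, int(number ** 0.5) + 1))


def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1


def sum_primes_below(limit):
    total = 2  # start with the prime 2, then ask the generator for 3 onwards
    for next_prime in get_primes(3):
        if next_prime < limit:
            total += next_prime
        else:
            return total


# 2 + 3 + 5 + 7 + 11 + 13 + 17 + 19
print(sum_primes_below(20))  # 77
```

Like the original loop, this sketch assumes the limit is larger than 2.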

More power

In PEP 342, support was added for passing values into generators. PEP 342 gave generators the ability, in a single statement, to yield a value (as before), receive a value, or both yield a value and receive a (possibly different) value.
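As a small illustration of yielding and receiving in the same statement, here is a sketch built around a hypothetical running_total generator (not from the article). It uses the generator's send method, which is explained in the text that follows.

```python
def running_total():
    total = 0
    while True:
        # yields the current total AND receives the next amount
        amount = yield total
        total += amount


gen = running_total()
print(next(gen))     # 0, runs the generator up to the first yield
print(gen.send(5))   # 5
print(gen.send(10))  # 15
```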

To show how values are passed to a generator, let's return to our prime number example. This time, instead of simply generating every prime larger than a given number, we'll find the smallest prime larger than each successive power of a number (for example, for 10 we want the smallest prime larger than 10, then 100, then 1000, and so on). We start the same way as with get_primes:


def print_successive_primes(iterations, base=10):
    # like a normal function, a generator function can accept a parameter

    prime_generator = get_primes(base)
    # we'll add something here later
    for power in range(iterations):
        # we'll add something here later

def get_primes(number):
    while True:
        if is_prime(number):
        # what do we write here?

The next line of get_primes needs some explanation. While yield number would yield the value of number, a statement of the form other = yield foo means "yield foo and, when a value is sent to me, set other to that value." You can send a value to a generator using its send method.
 
def get_primes(number):
    while True:
        if is_prime(number):
            number = yield number
        number += 1

This way, we can set number to a different value each time the generator yields. Now we can fill in the missing code in print_successive_primes:

def print_successive_primes(iterations, base=10):
    prime_generator = get_primes(base)
    prime_generator.send(None)
    for power in range(iterations):
        print(prime_generator.send(base ** power))

There are two things to note here. First, we print the result of prime_generator.send, which is possible because send both sends a value to the generator and returns the value the generator yields (mirroring how yield works inside the generator function).

Second, look at the prime_generator.send(None) line. When you use send to "start" a generator (that is, to run the code from the first line of the generator function up to the first yield statement), you must send None. This makes sense: by definition, the generator has not yet reached the first yield statement, so if we sent a real value, there would be nothing to "receive" it. Once the generator is started, we can send values as we do above.
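The "start it with None first" rule can be tried out on a tiny generator. In this sketch, echo is a made-up name; the example also relies on the fact that calling next() on a generator has the same effect as send(None).

```python
def echo():
    received = None
    while True:
        # hands back the last value received, then waits for a new one
        received = yield received


gen = echo()
try:
    gen.send("too early")  # the generator has not reached a yield yet
except TypeError as error:
    print("TypeError:", error)

gen = echo()
print(next(gen))          # None; next(gen) starts the generator like gen.send(None)
print(gen.send("hello"))  # hello
```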

Review

In the second half of this series, we'll discuss some advanced uses of yield and their effects. yield has become one of the most powerful keywords in Python. Now that we have a solid understanding of how yield works, we have the knowledge needed to understand some of its more "obscure" uses.

Believe it or not, we've only scratched the surface of the power of yield. For example, send does work as described above, but it is almost never used when generating a simple sequence as in our example. Below is a piece of code that shows how send is typically used. I won't say much about how and why it works, but it will serve as a good warm-up for the second part.

import random

def get_data():
    """Return 3 random integers between 0 and 9"""
    return random.sample(range(10), 3)

def consume():
    """Display a running average across the lists of integers sent to it"""
    running_sum = 0
    data_items_seen = 0

    while True:
        data = yield
        data_items_seen += len(data)
        running_sum += sum(data)
        print('The running average is {}'.format(running_sum / float(data_items_seen)))

def produce(consumer):
    """Produce a set of values and forward them to the given consumer"""
    while True:
        data = get_data()
        print('Produced {}'.format(data))
        consumer.send(data)
        yield

if __name__ == '__main__':
    consumer = consume()
    consumer.send(None)
    producer = produce(consumer)

    for _ in range(10):
        print('Producing...')
        next(producer)


Please remember ...

I hope you can get some key ideas from the discussion in this article:

    • Generators are used to produce a series of values
    • yield is like the return of a generator function
    • The only other thing yield does is save the "state" of a generator function
    • A generator is just a special type of iterator
    • As with iterators, we can get the next value from a generator by using next()
    • A for loop gets values by implicitly calling next()
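The implicit next() calls made by a for loop can be spelled out by hand. The sketch below is an approximation of what the loop does behind the scenes, using the simple three-value generator function from earlier; collected is a name invented for the illustration.

```python
def simple_generator_function():
    yield 1
    yield 2
    yield 3


# Roughly what "for value in simple_generator_function(): ..." does
iterator = iter(simple_generator_function())
collected = []
while True:
    try:
        value = next(iterator)
    except StopIteration:
        break  # a for loop ends quietly when the generator is exhausted
    collected.append(value)

print(collected)  # [1, 2, 3]
```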

I hope this article was useful. If you had never heard of generators, I hope you now understand what they are, why they are useful, and how to use them. If you were already somewhat familiar with generators, I hope this article has cleared up any remaining confusion.
