Python generators and the yield statement: usage details

Before starting the course, I asked the students to fill out a questionnaire reflecting their understanding of some Python concepts. Some topics ("if/else control flow" or "defining and using functions") are no problem for most students. However, there are a few topics that most students have little or no exposure to, especially generators and the yield keyword. I suspect this is true of most novice Python programmers.

It turns out that some people still do not understand generators and the yield keyword even after I have spent a lot of time on them. I want to fix that. In this article, I will explain what the yield keyword is, why it is useful, and how to use it.

Note: In recent years, generators have become more and more powerful as features have been added through PEPs. In my next article, I will use coroutines, cooperative multitasking, and asynchronous I/O (especially their use in the tulip prototype that GvR is working on) to show the real power of yield. But before that, we need a solid understanding of generators and yield.

**Coroutine and subroutine**

When we call a normal Python function, execution starts at the first line of the function and continues until a return statement, an exception, or the end of the function (which can be considered an implicit return None) is reached. Once the function returns control to its caller, that is the end of it. All the work done by the function and the data stored in its local variables are lost. A new call to the function creates everything from scratch.

This is the standard behavior of functions (subroutines) in computer programming. Such a function returns only once, with a single result. Sometimes, however, it is useful to have a "function" that, instead of simply returning a single value, can produce a series of values. To do so, such a function needs to be able to "save its work".

I said "produce a series of values" because our hypothetical function does not "return" in the normal sense. The implicit meaning of return is that the function is handing control of execution back to the point where it was called. The implicit meaning of "yield" is that the transfer of control is temporary and voluntary, and that our function expects to regain control in the future.

In Python, functions with this capability are called generators, and they are very useful. Generators (and the yield statement) were initially introduced to give programmers a more straightforward way to write code that produces a sequence of values. Previously, creating something like a random number generator required a class or a module that both generated values and kept track of state between calls. With the introduction of generators, this became much simpler.
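As a rough illustration of that difference, the sketch below writes the same counting sequence twice: once as a class that tracks its own state between calls, and once as a generator function. The names CountUpTo and count_up_to are made up for this example; they are not part of the code discussed in this article.

```python
class CountUpTo:
    """Class-based iterator: the state lives in instance attributes."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current


def count_up_to(limit):
    """Generator version: the state lives in ordinary local variables."""
    current = 0
    while current < limit:
        current += 1
        yield current


print(list(CountUpTo(3)))    # [1, 2, 3]
print(list(count_up_to(3)))  # [1, 2, 3]
```

Both produce the same values; the generator version simply lets Python do the bookkeeping.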

To better understand the problem that generators solve, let's look at an example. As you work through it, always keep in mind the problem we need to solve: generating a sequence of values.

Note: Outside of Python, all but the simplest generators would be called coroutines. I will use that term later in this article. Remember that, in Python terms, everything described here as a coroutine is still a generator. The formal Python term is generator; coroutine is used here only for discussion and, in the sense used in this article, is not formally defined at the language level.

**Example: Interesting Prime Number**

Suppose your boss asks you to write a function that takes a list of ints and returns some Iterable containing the elements that are prime numbers.

Remember, an Iterable is just an object capable of returning its members one at a time.

You probably think "this is simple" and quickly write the following code:

    import math

    def get_primes(input_list):
        result_list = list()
        for element in input_list:
            if is_prime(element):
                result_list.append(element)
        return result_list

    # Or better ...
    def get_primes(input_list):
        return (element for element in input_list if is_prime(element))

    # Here is an implementation of is_prime ...
    def is_prime(number):
        if number > 1:
            if number == 2:
                return True
            if number % 2 == 0:
                return False
            # Only odd divisors up to the square root need to be checked
            for current in range(3, int(math.sqrt(number) + 1), 2):
                if number % current == 0:
                    return False
            return True
        return False

Either implementation of get_primes above satisfies the requirements, so we tell the boss we are done. She reports that our function works well and is exactly what she wanted.

**Process infinite sequences**

Well, not quite. A few days later, the boss comes back and tells us she has run into a small problem: she wants to use our get_primes function on a very large list of numbers. In fact, the list is so large that merely creating it would consume all of the system's memory. To work around this, she wants to be able to call get_primes with a start value and get back all the primes larger than start (perhaps she is solving Project Euler problem 10).

Let's look at this new requirement. Clearly, a simple modification of get_primes will not do. We cannot return a list of all the prime numbers from start to infinity (although operating on infinite sequences has many useful applications). The chances of solving this problem with a normal function seem slim.

Before we give up, let's determine the core obstacle preventing us from writing a function that satisfies the boss's new requirements. Thinking it over, we arrive at this: a function has only one chance to return results, so it must return all of its results at once. That seems a pointless conclusion to reach; "functions just work that way," we usually think. But we learn nothing unless we ask, "what if they didn't?"

Imagine what we could do if get_primes could simply return the next value instead of all the values at once. It would not need to create a list at all. No list, no memory problem. Since the boss told us she only iterates over the results, she would not notice the difference in implementation.

Unfortunately, this does not seem possible. Even if we had a magical function that let us iterate from n to infinity, we would get stuck after returning the first value:

    def get_primes(start):
        for element in magical_infinite_range(start):
            if is_prime(element):
                return element

Suppose get_primes is called like this:

    def solve_number_10():
        # She *is* working on Project Euler #10, I knew it!
        total = 2
        for next_prime in get_primes(3):
            if next_prime < 2000000:
                total += next_prime
            else:
                print(total)
                return

Clearly, in get_primes, we immediately hit the case where element is 3 and return from the function. Instead of returning, we need a way to produce a value and, when asked for the next one, pick up where we left off.

Functions cannot do this, though. When a function returns, it is done for good. We can call the function again, but we have no way of saying, "this time, instead of starting at the first line as usual, start from the line where we left off last time." A function has a single entry point: its first line.
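A minimal sketch of this limitation, using a hypothetical helper first_prime_from (not part of the boss's code, and with a simplified inline primality check chosen only to keep the example short):

```python
def first_prime_from(start):
    """Hypothetical helper: return the first prime >= start."""
    candidate = max(start, 2)
    while True:
        # Simplified trial-division check, only for this illustration
        if all(candidate % divisor for divisor in range(2, int(candidate ** 0.5) + 1)):
            return candidate  # all local state is discarded here
        candidate += 1


# Every call restarts from the first line, so we get the same answer again
print(first_prime_from(3))  # 3
print(first_prime_from(3))  # 3

# To move forward, the caller has to remember the position itself
last = first_prime_from(3)
print(first_prime_from(last + 1))  # 5
```

The burden of remembering "where we were" falls entirely on the caller, which is exactly the bookkeeping we would like to avoid.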

**Enter Generator**

This sort of problem is so common that a new construct was added to Python to solve it: the generator. A generator "generates" values. Creating generators is made as straightforward as possible through the concept of generator functions.

A generator function is defined like a normal function, except that whenever it needs to produce a value, it does so with the yield keyword rather than return. If the body of a def contains yield, the function automatically becomes a generator function (even if it also contains a return statement). Nothing else is needed to create one.
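A small sketch of what this means in practice. The function greet and the events list are invented for this illustration; the point is only that calling a generator function does not run its body, it just creates the generator.

```python
import inspect

events = []


def greet():
    events.append("body started")
    yield "hello"
    return  # a return does not stop this from being a generator function


gen = greet()                     # the body has NOT run yet
events_after_call = list(events)  # still empty at this point

print(inspect.isgeneratorfunction(greet))  # True
print(inspect.isgenerator(gen))            # True

value = next(gen)                 # now the body runs up to the first yield
print(events_after_call, value, events)
```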

Generator functions create generator iterators. That may be the last time you see the term "generator iterator", though, since they are almost always called "generators". Note that a generator is a special type of iterator. To be considered an iterator, a generator must define a few methods, one of which is __next__(). As with any iterator, we can use the next() function to get the next value.

This point bears repeating: to get the next value from a generator, we use the same built-in function as for iterators, next() (which takes care of calling the generator's __next__() method). Since a generator is a type of iterator, it can be used in a for loop.
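To make the iterator relationship concrete, here is a short sketch with a throwaway generator function called letters (a name chosen for this example only). It shows that a generator is its own iterator, and that next(), __next__(), and a for loop all draw from the same stream of values.

```python
def letters():
    yield "a"
    yield "b"
    yield "c"


gen = letters()
print(iter(gen) is gen)  # True: a generator is its own iterator
print(next(gen))         # a
print(gen.__next__())    # b (this is what next() calls for us)
for letter in gen:       # the for loop picks up whatever is left
    print(letter)        # c
```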

Whenever a generator is asked for a value, it hands that value back to the caller. It does this with yield (for example, yield 7). The easiest way to remember what yield does is to think of it as a special return (plus a little magic) for generator functions.

**yield is just return (plus a little magic) for generator functions.**

The following is a simple generator function:

    >>> def simple_generator_function():
    ...     yield 1
    ...     yield 2
    ...     yield 3

Here are two simple ways to use it:

    >>> for value in simple_generator_function():
    ...     print(value)
    ...
    1
    2
    3
    >>> our_generator = simple_generator_function()
    >>> next(our_generator)
    1
    >>> next(our_generator)
    2
    >>> next(our_generator)
    3

**Magic?**

So where is the magic? I'm glad you asked! When a generator function reaches a yield, the "state" of the generator function is frozen: the values of all variables are saved and the next line of code to be executed is recorded, until next() is called again. Once it is, the generator function simply resumes where it left off. If next() is never called again, the state recorded at the yield is eventually discarded.
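The frozen state can be seen directly in a small sketch. The generator function running_total below is a made-up example: its local variables total and step survive between calls to next(), which is why each call continues the sum instead of starting over.

```python
def running_total():
    total = 0
    step = 0
    while True:
        step += 1
        total += step
        yield total  # execution pauses here; total and step are preserved


totals = running_total()
print(next(totals))  # 1
print(next(totals))  # 3  (1 + 2)
print(next(totals))  # 6  (1 + 2 + 3)
```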

Let's rewrite get_primes, this time as a generator function. Notice that we no longer need the magical_infinite_range function. Using a simple while loop, we can create our own infinite sequence.

    def get_primes(number):
        while True:
            if is_prime(number):
                yield number
            number += 1
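Because this generator never ends, a caller has to decide how many values to take. One way to do that (an addition for illustration, not something the article's code relies on) is itertools.islice from the standard library. The sketch repeats is_prime and get_primes so that it runs on its own.

```python
import itertools
import math


def is_prime(number):
    if number > 1:
        if number == 2:
            return True
        if number % 2 == 0:
            return False
        for current in range(3, int(math.sqrt(number) + 1), 2):
            if number % current == 0:
                return False
        return True
    return False


def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1


# Take only the first five values from the infinite generator
first_five = list(itertools.islice(get_primes(3), 5))
print(first_five)  # [3, 5, 7, 11, 13]
```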

If a generator function calls return or reaches the end of its definition, a StopIteration exception is raised. This signals to whoever was calling next() that the generator is exhausted (this is normal iterator behavior). It is also the reason the while loop is present in our get_primes function. Without it, the second call to next() would run the generator function to its end and raise StopIteration. Once a generator has been exhausted, calling next() on it results in an error, so you can consume the values of a generator only once. The following will not work:

    >>> our_generator = simple_generator_function()
    >>> for value in our_generator:
    ...     print(value)
    ...
    1
    2
    3
    >>> # Our generator has been exhausted ...
    >>> print(next(our_generator))
    Traceback (most recent call last):
      File "<ipython-input-13-7e48a609051a>", line 1, in <module>
        next(our_generator)
    StopIteration
    >>> # However, we can always create another generator
    >>> # by calling the generator function again
    >>> new_generator = simple_generator_function()
    >>> print(next(new_generator))  # works fine
    1

Thus, the while loop is there to make sure we never reach the end of get_primes. It lets us generate a value for as long as next() is called on the generator. This is a common idiom when dealing with infinite sequences (and with generators in general).
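For contrast, here is a sketch of a finite generator, using an invented countdown function. It shows the other side of the coin: once the loop ends, the generator is exhausted, and the caller can either catch StopIteration or pass a default value to next().

```python
def countdown(start):
    while start > 0:
        yield start
        start -= 1
    return  # falling off the end of the function has the same effect


gen = countdown(2)
print(next(gen))  # 2
print(next(gen))  # 1

try:
    next(gen)
except StopIteration:
    print("exhausted")

# next() also accepts a default that is returned instead of raising
print(next(gen, "no more values"))  # no more values
```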

**Execution process**

Let's go back to the code that calls get_primes: solve_number_10.

    def solve_number_10():
        # She *is* working on Project Euler #10, I knew it!
        total = 2
        for next_prime in get_primes(3):
            if next_prime < 2000000:
                total += next_prime
            else:
                print(total)
                return

It helps to trace how the first few elements are produced when the for loop in solve_number_10 calls get_primes. When the for loop requests the first value from get_primes, we enter get_primes just as we would enter a normal function.

- Enter the while loop
- Evaluate the if condition (3 is a prime number)
- Yield the value 3 and hand control back to solve_number_10

Next, back in solve_number_10:

- The value 3 is passed back to the for loop
- The for loop assigns it to next_prime
- next_prime is added to total
- The for loop requests the next value from get_primes

This time, when we enter get_primes, execution does not start from the beginning. It resumes at line 5, right where we left off.

    def get_primes(number):
        while True:
            if is_prime(number):
                yield number
            number += 1  # <<<<<<<<<<

Most importantly, number still has the value it had when we called yield (that is, 3). Remember, yield both passes a value to whoever called next() and saves the "state" of the generator function. Next, number is incremented to 4, we return to the top of the while loop, and we keep incrementing until we reach the next prime number (5). Once again, we yield the value of number to the for loop in solve_number_10. This cycle continues until the for loop stops (at the first prime greater than 2,000,000).
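To try this flow quickly, here is a self-contained variant with a smaller bound. The function name sum_primes_below and its limit parameter are my own for this sketch (and it assumes limit is greater than 2, since total starts at 2); otherwise it follows the same structure as solve_number_10.

```python
import math


def is_prime(number):
    if number > 1:
        if number == 2:
            return True
        if number % 2 == 0:
            return False
        for current in range(3, int(math.sqrt(number) + 1), 2):
            if number % current == 0:
                return False
        return True
    return False


def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1


def sum_primes_below(limit):
    """Same loop as solve_number_10, but with a configurable bound."""
    total = 2  # 2 is counted up front because the generator starts at 3
    for next_prime in get_primes(3):
        if next_prime < limit:
            total += next_prime
        else:
            # Leaving the loop abandons the (still infinite) generator
            return total


print(sum_primes_below(10))   # 17   (2 + 3 + 5 + 7)
print(sum_primes_below(100))  # 1060
```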

**Sending values to a generator**

PEP 342 added support for passing values into generators. It gave generators the ability, in a single statement, to yield a value (as before), receive a value, or both yield a value and receive a (possibly different) value.
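Before returning to primes, a tiny sketch of the "yield a value and receive a value" case may help. The generator function echo_doubler is invented for this illustration. It uses next() once to advance the generator to its first yield, and after that each send() both delivers a value and collects the next yielded one.

```python
def echo_doubler():
    received = None
    while True:
        # Yield a value out, then wait here for the next value sent in
        received = yield (None if received is None else received * 2)


gen = echo_doubler()
next(gen)            # advance to the first yield (which yields None)
print(gen.send(5))   # 10
print(gen.send(21))  # 42
```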

We will use the earlier prime number function to demonstrate how values are passed to a generator. This time, rather than simply producing every prime greater than some number, we want the smallest prime greater than each successive power of a number (for 10, that means powers such as 10, 100, 1000, and so on). We start in the same way as get_primes:

    def print_successive_primes(iterations, base=10):
        # As with a normal function, the result of calling a generator
        # function can be assigned to a variable
        prime_generator = get_primes(base)
        # Missing code ...
        for power in range(iterations):
            # Missing code ...

    def get_primes(number):
        while True:
            if is_prime(number):
                # ... what goes here?

The missing lines of get_primes need some explanation. The yield keyword hands back the value of number, and a statement of the form other = yield foo means "yield foo and, when a value is sent to me, set other to that value". You can "send" values to a generator with its send method.

    def get_primes(number):
        while True:
            if is_prime(number):
                number = yield number
            number += 1

In this way, we can set number to a different value each time the generator yields. We can now fill in the missing code in print_successive_primes:

    def print_successive_primes(iterations, base=10):
        prime_generator = get_primes(base)
        prime_generator.send(None)
        for power in range(iterations):
            print(prime_generator.send(base ** power))

Two things to note here. First, we print the result of prime_generator.send, which is possible because send both sends a value to the generator and returns the value the generator yields (mirroring how the yield statement works inside the generator).

Second, take a look at the line prime_generator.send(None). When you use send to "start" a generator (that is, to execute the code from the first line of the generator function up to the first yield statement), you must send None. This makes sense: by definition, the generator has not yet reached its first yield statement, so if we sent a real value there would be nothing to "receive" it. Once the generator is started, we can send values as we do above.
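Both points can be checked with a short, self-contained sketch. It repeats is_prime and the send-aware get_primes so that it runs on its own; the variable names unstarted, primed, and results are mine. Sending a real value to a generator that has not been started raises a TypeError, while sending None first works as described.

```python
import math


def is_prime(number):
    if number > 1:
        if number == 2:
            return True
        if number % 2 == 0:
            return False
        for current in range(3, int(math.sqrt(number) + 1), 2):
            if number % current == 0:
                return False
        return True
    return False


def get_primes(number):
    while True:
        if is_prime(number):
            number = yield number
        number += 1


# Sending a non-None value before the first yield is an error
unstarted = get_primes(10)
try:
    unstarted.send(100)
except TypeError as error:
    print("cannot send yet:", error)

# Sending None first runs the generator up to its first yield
primed = get_primes(10)
first = primed.send(None)  # equivalent to next(primed)
results = [primed.send(10 ** power) for power in range(4)]
print(first)    # 11
print(results)  # [2, 11, 101, 1009]
```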

**Summary**

In the second half of this series, we will discuss more advanced uses of yield and their effects. yield has become one of the most powerful keywords in Python. Now that we have a solid understanding of how yield works, we have the knowledge necessary to explore its more advanced applications.

Believe it or not, we have barely scratched the surface of the power of yield. For example, while send does work as described above, it is almost never used for generating simple sequences like those in our example. Below is a small piece of code that demonstrates a more common use of send. I will not explain how it works or why; it will serve as a good warm-up for the second part.

    import random

    def get_data():
        """Return 3 random integers between 0 and 9"""
        return random.sample(range(10), 3)

    def consume():
        """Display a running average across the lists of integers sent to it"""
        running_sum = 0
        data_items_seen = 0

        while True:
            data = yield
            data_items_seen += len(data)
            running_sum += sum(data)
            print('The running average is {}'.format(running_sum / float(data_items_seen)))

    def produce(consumer):
        """Produce a set of values and forward them to the given consumer"""
        while True:
            data = get_data()
            print('Produced {}'.format(data))
            consumer.send(data)
            yield

    if __name__ == '__main__':
        consumer = consume()
        consumer.send(None)
        producer = produce(consumer)

        for _ in range(10):
            print('Producing ...')
            next(producer)

**Please remember...**

I hope you can get some key ideas from the discussion in this article:

- Generators are used to generate a series of values.
- yield is like the return of generator functions.
- The only other thing yield does is save the "state" of a generator function.
- A generator is just a special type of iterator.
- As with iterators, we can get the next value from a generator using next().
- A for loop gets values by calling next() implicitly.

I hope this article has been helpful. If you had never heard of generators, I hope you now understand what they are, why they are useful, and how to use them. If you were already somewhat familiar with generators, I hope this article has cleared up any remaining confusion.