Advanced usage of iterators and generators in Python

Source: Internet
Author: User
This article covers advanced usage of iterators and generators in Python. A generator is a kind of iterator; beyond the basics, we look at generator expressions, chained generators, and other in-depth topics.

Iterators

An iterator is an object that adheres to the iterator protocol. Basically, this means that it has a next method (spelled __next__ in Python 3) which, when called, returns the next item in the sequence. When there is nothing left to return, it raises the StopIteration exception.

An iterator object allows us to loop just once. It holds the state (position) of a single iteration; seen from the other side, each loop over a sequence requires its own iterator object. This means that we can iterate over the same sequence more than once concurrently. Separating the iteration logic from the sequence gives us more ways to iterate.

Calling the __iter__ method on a container to create an iterator object is the most straightforward way to get hold of an iterator. The iter function does that for us, saving a few keystrokes.

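>>> nums = [1, 2, 3]   # note that ... varies: these are different objects
>>> iter(nums)
<list_iterator object at 0x...>
>>> nums.__iter__()
<list_iterator object at 0x...>
>>> nums.__reversed__()
<list_reverseiterator object at 0x...>
>>> it = iter(nums)
>>> next(it)      # next(obj) simply calls obj.__next__() (obj.next() in Python 2)
1
>>> next(it)
2
>>> next(it)
3
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration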

When an iterator is used in a loop, StopIteration is swallowed and simply causes the loop to finish. With explicit invocation, however, we can see that once the iterator is exhausted, accessing it raises an exception.
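As a rough sketch of what a for loop does behind the scenes, the protocol can be driven by hand. The helper name manual_for is made up for this illustration; the real loop is implemented inside the interpreter, so this is only an approximation of its behaviour.

```python
def manual_for(iterable, body):
    """Call body(item) for each item, mimicking `for item in iterable`."""
    it = iter(iterable)          # ask the container for a fresh iterator
    while True:
        try:
            item = next(it)      # fetch the next item
        except StopIteration:    # exhaustion ends the loop quietly
            break
        body(item)


collected = []
manual_for([1, 2, 3], collected.append)
print(collected)  # [1, 2, 3]
```

The caller of manual_for never sees StopIteration, which is the same thing that happens in an ordinary for loop.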

The for...in loop also uses the __iter__ method. This allows us to start iterating over a sequence transparently. But if we already have an iterator, we want to be able to use it in a for loop in the same way. To achieve this, iterators have, in addition to next, a method called __iter__ which returns the iterator itself (self).
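A minimal sketch of a hand-written iterator may make this concrete. The class name Countdown and its attribute are invented for the example; the only parts dictated by the protocol are the two method names.

```python
class Countdown:
    """Iterator that counts down from start to 1 (hypothetical example class)."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself, so it can be used directly in a for loop.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
first = next(countdown)          # consume one item by hand
rest = [n for n in countdown]    # the loop continues where we stopped
print(first, rest)  # 3 [2, 1]
```

Because __iter__ returns self, the loop picks up the partially consumed iterator instead of starting over.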

Support for iteration is pervasive in Python: all sequences and unordered containers in the standard library support it. The concept also extends to other things: for example, file objects support iteration over lines.

>>> f = open('/etc/fstab')
>>> f is f.__iter__()
True

The file object is itself an iterator, and its __iter__ method does not create a separate object: only a single pass of sequential reading is allowed.
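The consequence of a file being its own iterator can be sketched without touching the file system. The example below assumes that io.StringIO is an acceptable stand-in for a real text file; the variable names are illustrative.

```python
import io

fake_file = io.StringIO("first line\nsecond line\n")  # in-memory stand-in for a file

same_object = fake_file is iter(fake_file)    # file objects are their own iterators
first_pass = [line.strip() for line in fake_file]
second_pass = [line.strip() for line in fake_file]   # already exhausted

nums = [1, 2, 3]
list_is_own_iterator = nums is iter(nums)     # a list hands out a separate iterator

print(same_object, first_pass, second_pass, list_is_own_iterator)
```

The second pass comes back empty because there is only one iteration state, whereas a list can be looped over as many times as needed.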

Generator expressions
A second way to create iterator objects is through generator expressions, which are also the basis for list comprehensions. To increase clarity, a generator expression must always be enclosed in parentheses or brackets. If round parentheses are used, a generator iterator is created. If square brackets are used, the process is short-circuited and we get a list.

>>> (i for i in nums)
<generator object <genexpr> at 0x...>
>>> [i for i in nums]
[1, 2, 3]
>>> list(i for i in nums)
[1, 2, 3]
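The practical difference between the two bracket styles is when the work happens. The following sketch records each call in a list to make that visible; traced_double and the calls list are names invented for the illustration.

```python
calls = []

def traced_double(x):
    """Double x and record that the call happened."""
    calls.append(x)
    return 2 * x

nums = [1, 2, 3]

lazy = (traced_double(i) for i in nums)    # nothing is computed yet
calls_after_creation = list(calls)

first = next(lazy)                         # computes exactly one item
calls_after_first = list(calls)

eager = [traced_double(i) for i in nums]   # computes every item immediately
```

Creating the generator expression triggers no calls at all, while the list comprehension runs through the whole input at once.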

In Python 2.7 and 3.x, the list comprehension syntax was extended to dictionary and set comprehensions. A set is created when the expression is enclosed in curly braces. A dict is created when the expression contains key-value pairs of the form key: value:

>>> {i for i in range(3)}       # Python 2.7 output; Python 3 shows {0, 1, 2}
set([0, 1, 2])
>>> {i:i**2 for i in range(3)}
{0: 0, 1: 1, 2: 4}

If you are stuck with an older Python version, the syntax is only a bit worse:

>>> set(i for i in 'abc')
set(['a', 'c', 'b'])
>>> dict((i, ord(i)) for i in 'abc')
{'a': 97, 'c': 99, 'b': 98}

Generator expressions are fairly simple. There is only one gotcha worth mentioning: in Python versions earlier than 3, the index variable (i) of a list comprehension leaks into the enclosing scope.
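Whether the index variable leaks on a given interpreter can be checked with a small probe. This is a sketch only; the function and variable names are made up, and the expected result stated below applies to Python 3.

```python
def index_variable_leaks():
    """Return True if a list comprehension's loop variable is visible afterwards."""
    squares = [leaky_index ** 2 for leaky_index in range(3)]
    try:
        leaky_index          # only resolvable if the comprehension leaked it
    except NameError:
        return False
    return True


leaks = index_variable_leaks()
```

On Python 3 the probe reports no leak, because the comprehension has its own scope.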


Generators
A generator is a function that produces a sequence of results instead of a single value.

A third way to create iterator objects is to call a generator function. A generator is a function containing the keyword yield. It must be noted that the mere presence of this keyword completely changes the nature of the function: the yield statement does not have to be executed, or even be reachable, for the function to be marked as a generator. When a normal function is called, the instructions in its body start to execute. When a generator function is called, execution stops before the first instruction in the body. Instead, the call creates a generator object that adheres to the iterator protocol. As with regular functions, concurrent and recursive invocations are allowed.
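
Both claims, that an unreachable yield is enough and that calling the function runs none of its body, can be checked with a short sketch. The function name never_yields and the log list are invented for the illustration; inspect.isgeneratorfunction is part of the standard library.

```python
import inspect

log = []

def never_yields():
    log.append("body ran")
    return
    yield  # unreachable, yet its presence makes this a generator function

gen = never_yields()                       # creates a generator object only
ran_on_call = bool(log)                    # False: the body has not started
is_generator_function = inspect.isgeneratorfunction(never_yields)
items = list(gen)                          # now the body runs and ends at once
ran_after_iteration = bool(log)            # True
```

The body only starts when the generator object is first asked for a value.
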
When next is called, the function body is executed up to the first yield. Each yield statement that is reached produces a value, which becomes the return value of next. After the yield statement has run, execution of the function is suspended.

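>>> def f():
...   yield 1
...   yield 2
>>> f()
<generator object f at 0x...>
>>> gen = f()
>>> next(gen)
1
>>> next(gen)
2
>>> next(gen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration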

Let's walk through the life of a single invocation of a generator function.

>>> def f():
...   print("-- start --")
...   yield 3
...   print("-- middle --")
...   yield 4
...   print("-- finished --")
>>> gen = f()
>>> next(gen)
-- start --
3
>>> next(gen)
-- middle --
4
>>> next(gen)
-- finished --
Traceback (most recent call last):
 ...
StopIteration

Unlike a normal function, where calling f() would immediately run the first print, gen is assigned without executing any statement in the function body. Only when next(gen) is called are the statements up to the first yield executed. The second next prints -- middle -- and execution halts at the second yield. The third next prints -- finished -- and falls off the end of the function. Since no yield was reached, an exception is raised.

What happens to the function after a yield, once control returns to the caller? The state of each generator is stored in the generator object. From the point of view of the generator function, it looks almost as if it were running in a separate thread, but this is only an illusion: execution is strictly single-threaded, and the interpreter keeps and restores the state between requests for the next value.
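That the state lives in the generator object, not in the function, can be observed directly. The sketch below uses a made-up counter generator together with inspect.getgeneratorstate and inspect.getgeneratorlocals from the standard library (available in Python 3.3 and later).

```python
import inspect

def counter(start):
    """Yield start, start + 1, ... forever."""
    n = start
    while True:
        yield n
        n += 1

a = counter(0)
b = counter(100)

state_before = inspect.getgeneratorstate(a)     # not started yet
values = [next(a), next(a), next(b), next(a)]   # a and b advance independently
state_after = inspect.getgeneratorstate(a)      # paused at a yield
saved_locals = inspect.getgeneratorlocals(a)    # local variables kept in the object
```

Each generator object carries its own copy of n, so interleaving calls to a and b does not mix up their positions.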

Why are generators useful? As noted in the section about iterators, a generator function is just another way to create an iterator object. Everything that can be done with yield statements could also be done with a next method. Nevertheless, using a function and letting the interpreter perform its magic to create an iterator has advantages. A function can be much shorter than the definition of a class with the required next and __iter__ methods. More importantly, it is easier for the author of a generator to reason about state kept in local variables than about state kept in instance attributes, which must be used to pass data between consecutive invocations of next on an iterator object.
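To make the comparison concrete, here is the same sequence written both ways. This is an illustrative sketch; the names FibonacciIterator and fibonacci are not from the original text.

```python
class FibonacciIterator:
    """Class-based iterator: state lives in instance attributes."""

    def __init__(self, limit):
        self.limit = limit
        self.a, self.b = 0, 1

    def __iter__(self):
        return self

    def __next__(self):
        if self.a > self.limit:
            raise StopIteration
        value = self.a
        self.a, self.b = self.b, self.a + self.b
        return value


def fibonacci(limit):
    """Generator version: state lives in plain local variables."""
    a, b = 0, 1
    while a <= limit:
        yield a
        a, b = b, a + b


from_class = list(FibonacciIterator(20))
from_generator = list(fibonacci(20))
```

Both produce the same values, but the generator needs a third of the code and no explicit bookkeeping of where it stopped.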

A broader question is why iterators are useful. When an iterator is used to power a loop, the loop becomes very simple. The code that initialises the state, decides whether the loop is finished, and finds the next value is extracted into a separate place. This highlights the body of the loop, which is the interesting part. In addition, the iterator code can be reused elsewhere.

Bidirectional communication
Each yield statement passes a value to the caller. This is the reason generators were introduced by PEP 255 (implemented in Python 2.2). But communication in the reverse direction is also useful. One obvious way would be some external state, either a global variable or a shared mutable object. Direct communication is possible thanks to PEP 342 (implemented in Python 2.5), which turns the previously boring yield statement into an expression. When the generator resumes execution after a yield, the caller can call a method on the generator object either to pass a value into the generator, which then becomes the result of the yield expression, or to inject an exception into the generator.

The first of the new methods is send(value), which is similar to next(), but passes value into the generator to be used as the value of the yield expression. In fact, g.next() (next(g) in Python 3) and g.send(None) are equivalent.
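A small sketch of send in action: a generator that receives numbers and reports the average so far. The name running_average is made up for the example.

```python
def running_average():
    """Receive numbers through send() and yield the average so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # send(value) makes the yield expression evaluate to value
        total += value
        count += 1
        average = total / count


averager = running_average()
primed = next(averager)          # run to the first yield; same as averager.send(None)
results = [averager.send(10), averager.send(20), averager.send(60)]
```

The generator has to be advanced to its first yield before a real value can be sent, which is why the initial next call is needed.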

The second of the new methods is throw(type, value=None, traceback=None), which is equivalent to executing

raise type, value, traceback

at the point of the yield statement.

Unlike raise (which raises an exception immediately at the current point of execution), throw() first resumes the generator and only then raises the exception. The word throw was chosen because it suggests putting the exception in another location, and because it is associated with exceptions in other languages.

What happens when an exception is raised inside the generator? It can be raised explicitly, it can be raised while executing some statement, or it can be injected at the point of a yield statement by means of the throw() method. In each case, the exception propagates in the standard manner: it can be intercepted by an except or finally clause inside the generator, or else it aborts the execution of the generator function and propagates to the caller.
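Both outcomes, an injected exception handled inside the generator and one that escapes to the caller, are sketched below. The generator name resilient and the value 99 are arbitrary choices for the illustration.

```python
def resilient():
    """Yield increasing integers; recover from ValueError injected at the yield."""
    n = 0
    while True:
        try:
            yield n
        except ValueError:
            n = 99            # react to the injected exception, then keep going
        n += 1


gen = resilient()
first = next(gen)                      # runs to the first yield
after_throw = gen.throw(ValueError)    # handled inside; the next yielded value comes back
try:
    gen.throw(KeyError)                # not handled inside: aborts the generator
    outcome = "handled"
except KeyError:
    outcome = "propagated"
finished = list(gen)                   # an aborted generator is exhausted
```

Note that throw returns the next yielded value when the generator recovers, just as send does.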

For completeness, it is worth mentioning that generator iterators also have a close() method, which can be used to force a generator that would otherwise be able to provide more values to finish immediately. It allows the generator's __del__ method to destroy the objects holding the state of the generator.
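A common reason to care about close() is cleanup code in a finally clause. The following sketch assumes a made-up generator named reader and records the cleanup in a list so it can be observed.

```python
events = []

def reader():
    """Yield items forever; record cleanup when the generator is closed."""
    try:
        n = 0
        while True:
            yield n
            n += 1
    finally:
        events.append("cleaned up")   # runs when close() raises GeneratorExit at the yield


gen = reader()
first = next(gen)
gen.close()                 # stops the generator even though more values were available
after_close = list(gen)     # a closed generator yields nothing more
```

Calling close() on a generator that has already finished is harmless, so it is safe to call defensively.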

Let's define a generator which just prints what is passed in through send and throw.

>>> import itertools
>>> def g():
...   print('--start--')
...   for i in itertools.count():
...     print('--yielding %i--' % i)
...     try:
...       ans = yield i
...     except GeneratorExit:
...       print('--closing--')
...       raise
...     except Exception as e:
...       print('--yield raised %r--' % e)
...     else:
...       print('--yield returned %s--' % ans)
>>> it = g()
>>> next(it)
--start--
--yielding 0--
0
>>> it.send(11)
--yield returned 11--
--yielding 1--
1
>>> it.throw(IndexError)
--yield raised IndexError()--
--yielding 2--
2
>>> it.close()
--closing--

Note: next or __next__?

In Python 2.x, the iterator method that retrieves the next value is called next. It is invoked implicitly through the global function next, which means that it should really be called __next__, just as the global function iter calls __iter__. This inconsistency is corrected in Python 3.x, where it.next becomes it.__next__. For the other generator methods, send and throw, the situation is more complicated, because they are not called implicitly by the interpreter. Nevertheless, there is a proposed syntax extension that would allow continue to take an argument, which would be passed to send of the loop's iterator. If this extension is accepted, gen.send is likely to become gen.__send__. The last of the generator methods, close, is quite obviously named incorrectly, because it is already invoked implicitly.
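Code that has to run on both Python 2 and Python 3 often defines the method under the new name and adds an alias for the old one. The sketch below shows that idiom on a made-up class named Repeat.

```python
class Repeat:
    """Iterator yielding `value` `times` times (hypothetical example class)."""

    def __init__(self, value, times):
        self.value = value
        self.remaining = times

    def __iter__(self):
        return self

    def __next__(self):              # the name Python 3 looks for
        if self.remaining <= 0:
            raise StopIteration
        self.remaining -= 1
        return self.value

    next = __next__                  # alias so Python 2 finds the method too


items = list(Repeat("x", 3))
via_builtin = next(Repeat("y", 1))   # the built-in works on either version
```

Calling the built-in next(obj) instead of a method on the object keeps the calling code independent of which spelling the interpreter expects.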

Chaining generators
Note: This section previews PEP 380, whose yield from syntax was accepted for Python 3.3 and is available from that version on.

Say we are writing a generator and want to yield a number of values produced by a second generator, a subgenerator. If yielding values is the only concern, this can be done without much difficulty using a loop:

subgen = some_other_generator()
for v in subgen:
    yield v

However, if the subgenerator is to interact properly with the caller through send(), throw(), and close(), things become considerably more difficult. The yield statement has to be guarded by a try...except...finally structure similar to the one used in the previous section to "debug" the generator function. Such code is provided in PEP 380; here it suffices to say that new syntax for properly yielding from a subgenerator was introduced in Python 3.3:

yield from some_other_generator()

This behaves like the explicit loop above, repeatedly yielding values from some_other_generator until it is exhausted, but it also forwards send, throw, and close to the subgenerator.
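The forwarding can be seen in a short sketch, which requires Python 3.3 or later. The generator names accumulator and delegator are invented for the example; the use of None as a stop signal is a convention chosen here, not part of the syntax.

```python
def accumulator():
    """Subgenerator: add up values received through send(), return the total."""
    total = 0
    while True:
        value = yield total
        if value is None:
            return total      # becomes the value of the `yield from` expression
        total += value


def delegator(results):
    """Outer generator: forward everything to the subgenerator."""
    total = yield from accumulator()
    results.append(total)


results = []
outer = delegator(results)
next(outer)                       # prime: runs to the first yield inside accumulator
running = [outer.send(5), outer.send(7)]
try:
    outer.send(None)              # tells accumulator to finish
except StopIteration:
    pass
```

Values passed to outer.send arrive inside accumulator, which a plain for loop over the subgenerator would not achieve, and the subgenerator's return value comes back as the result of the yield from expression.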
