A detailed description of Python's iterators, generators, and the itertools module

Python has a lot to offer mathematicians. For example, it supports containers such as tuples, lists, and sets with a notation close to traditional mathematical notation, and its list comprehensions closely resemble set-builder notation.

Other features that appeal to the mathematically minded are Python's iterators and generators and the related itertools module. These tools make it easy to write elegant code for mathematical objects such as infinite sequences, stochastic processes, recurrence relations, and combinatorial structures. This article collects some of my notes on iterators and generators, along with related experience I have accumulated while studying them.

An iterator is an object that lets you iterate over a collection. The collection does not need to be loaded into memory, and for that reason it can contain an almost unlimited number of elements, or even infinitely many. You can find the relevant documentation in the "Iterator Types" section of the official Python documentation.

To be more precise: an object is iterable if it defines an __iter__ method that returns an iterator. An iterator is an object that implements both __iter__ and next (__next__ in Python 3): __iter__ returns an iterator object, and next returns the next element of the iteration. As far as I know, iterators always simply return themselves (self) from __iter__, because they are their own iterators.
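
The distinction between an iterable and an iterator can be checked directly. The following is a minimal sketch, assuming Python 3, where the abstract base classes Iterable and Iterator live in collections.abc; the variable names are mine and only serve the illustration.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]            # a list is iterable, but it is not an iterator
numbers_iter = iter(numbers)   # iter() asks the list for a fresh iterator

list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)
iter_is_iterator = isinstance(numbers_iter, Iterator)

# An iterator returns itself from __iter__, so iter() on it gives it back.
iter_returns_self = iter(numbers_iter) is numbers_iter

print(list_is_iterable, list_is_iterator, iter_is_iterator, iter_returns_self)
# True False True True
```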

In general, you should avoid calling __iter__ and next directly. Use a for loop or a list comprehension instead, so that Python calls these two methods for you. If you do need to call them manually, use the built-in functions iter and next and pass the target iterable or iterator as the argument. For example, if c is an iterable, use iter(c) rather than c.__iter__(), and if a is an iterator, use next(a) rather than a.next() to get the next element. The same goes for len.
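
To make the role of the two built-ins concrete, here is a rough sketch of what a for loop does on your behalf. It assumes Python 3, and manual_for is a hypothetical helper written for this illustration, not something from the standard library.

```python
def manual_for(iterable, visit):
    """Roughly what `for x in iterable: visit(x)` does behind the scenes."""
    iterator = iter(iterable)       # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)   # calls iterator.__next__()
        except StopIteration:
            break                   # the iterator is exhausted
        visit(item)

seen = []
manual_for("abc", seen.append)
print(seen)  # ['a', 'b', 'c']
```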

Speaking of len, it is worth noting that iterators do not need a notion of length, so they usually do not implement __len__. If you need the number of items, you have to count them yourself, for example with sum. An example appears at the end of this article, in the section on the itertools module.
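
Counting with sum might look like the following sketch (Python 3; iterator_length is a name I made up). Keep in mind that counting consumes the iterator, and that it never finishes on an infinite one.

```python
def iterator_length(iterator):
    """Count the items left in a finite iterator by consuming it."""
    return sum(1 for _ in iterator)

it = iter([10, 20, 30])
next(it)                         # consume one item first
remaining = iterator_length(it)
leftover = list(it)              # nothing is left after counting

print(remaining)  # 2
print(leftover)   # []
```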

Some iterables are not iterators themselves but use another object as their iterator. For example, a list is iterable but is not an iterator (it implements __iter__ but not next). The following example shows that a list uses a listiterator object as its iterator. Note also that the list has a well-defined length, while the listiterator does not.

>>> a = [1, 2]
>>> type(a)
<type 'list'>
>>> type(iter(a))
<type 'listiterator'>
>>> it = iter(a)
>>> next(it)
1
>>> next(it)
2
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> len(a)
2
>>> len(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'listiterator' has no len()

The Python interpreter raises a StopIteration exception when you try to continue past the end of an iteration. However, as mentioned above, an iterator can iterate over an infinite collection, and with such an iterator it is the user's responsibility to make sure no infinite loop is created. Consider the following example:

class count_iterator(object):
    n = 0

    def __iter__(self):
        return self

    def next(self):
        y = self.n
        self.n += 1
        return y

Here is a usage example. Note the last line, which tries to convert the iterator to a list. This results in an infinite loop, because the iterator never stops.

>>> counter = count_iterator()
>>> next(counter)
0
>>> next(counter)
1
>>> next(counter)
2
>>> next(counter)
3
>>> list(counter)  # This will result in an infinite loop!
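
One way to take only a bounded prefix of an infinite iterator is itertools.islice. The sketch below is a Python 3 port of the counter (so the method is __next__, and I keep the count in an instance attribute rather than a class attribute); CountIterator is my own name for it.

```python
from itertools import islice

class CountIterator:
    """Python 3 port of count_iterator: counts up from 0 forever."""

    def __init__(self):
        self.n = 0

    def __iter__(self):
        return self

    def __next__(self):
        y = self.n
        self.n += 1
        return y

counter = CountIterator()
first_five = list(islice(counter, 5))   # stops after five items
after = next(counter)                   # the iterator carries on from there

print(first_five)  # [0, 1, 2, 3, 4]
print(after)       # 5
```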

Finally, one correction to the description above: an object that has no __iter__ method but defines __getitem__ is still iterable. In this case, the built-in function iter returns an iterator for the object that uses __getitem__ to walk through the elements, calling it with the indices 0, 1, 2, and so on. Iteration stops when StopIteration or IndexError is raised. Here is an example:

class SimpleList(object):
    def __init__(self, *items):
        self.items = items

    def __getitem__(self, i):
        return self.items[i]

Usage here:

>>> a = SimpleList(1, 2, 3)
>>> it = iter(a)
>>> next(it)
1
>>> next(it)
2
>>> next(it)
3
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Now let's look at a more interesting example: using an iterator to generate Hofstadter's Q sequence from given initial conditions. Hofstadter first mentioned this nested recurrence in his book Gödel, Escher, Bach: An Eternal Golden Braid, and whether the sequence is well defined for all n has been an open question ever since. The following code uses an iterator to generate the sequence, which is defined by the recurrence:

Q(n) = Q(n - Q(n-1)) + Q(n - Q(n-2))

Given initial conditions, the iterator generates the rest of the sequence; for example, qsequence([1, 1]) generates Hofstadter's Q sequence itself. We use the StopIteration exception to signal that the sequence cannot be continued, because producing the next element requires a valid index. For some initial conditions, generation stops immediately.

class qsequence(object):
    def __init__(self, s):
        self.s = s[:]  # copy, so the caller's list is not modified

    def next(self):
        try:
            q = self.s[-self.s[-1]] + self.s[-self.s[-2]]
            self.s.append(q)
            return q
        except IndexError:
            raise StopIteration()

    def __iter__(self):
        return self

    def current_state(self):
        return self.s

Usage here:

>>> q = qsequence([1, 1])
>>> next(q)
2
>>> next(q)
3
>>> [next(q) for __ in xrange(10)]
[3, 4, 5, 5, 6, 6, 6, 8, 8, 8]
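
To see both behaviours side by side, here is a Python 3 sketch of the same iterator (QSequence is my renamed port, with __next__ in place of next). The initial conditions [1, 3] are my own choice for illustration: the very first step needs the third element from the end of a two-element list, so iteration stops at once.

```python
from itertools import islice

class QSequence:
    """Python 3 port of the Hofstadter-style iterator."""

    def __init__(self, s):
        self.s = list(s)

    def __iter__(self):
        return self

    def __next__(self):
        try:
            q = self.s[-self.s[-1]] + self.s[-self.s[-2]]
        except IndexError:
            # No valid index for the next term: end the iteration.
            raise StopIteration from None
        self.s.append(q)
        return q

first_terms = list(islice(QSequence([1, 1]), 8))
stops_at_once = list(QSequence([1, 3]))

print(first_terms)    # [2, 3, 3, 4, 5, 5, 6, 6]
print(stops_at_once)  # []
```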

A generator is an iterator that is defined with simpler, function-style syntax. More specifically, a generator is a function whose body uses the yield expression. Instead of returning a value with return, the generator hands back a result with yield whenever one is needed. Python automatically remembers the generator's context, that is, the current point in the control flow and the values of its local variables. Each time the generator's next method is called, execution resumes and the next yield supplies the next value of the iteration. The __iter__ method is implemented automatically, which means a generator can be used anywhere an iterator can. The following example does the same thing as the iterator example above, but the code is more compact and more readable.

def count_generator():
    n = 0
    while True:
        yield n
        n += 1

Take a look at the usage:

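>>> counter = count_generator()
>>> counter
<generator object count_generator at 0x...>
>>> next(counter)
0
>>> next(counter)
1
>>> iter(counter)
<generator object count_generator at 0x...>
>>> iter(counter) is counter
True
>>> type(counter)
<type 'generator'>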

Now let's implement Hofstadter's Q sequence with a generator. The implementation is simple, but we cannot provide a function like the earlier current_state. As far as I know, the variables inside a generator cannot be accessed directly from outside (structures such as gi_frame.f_locals make it possible, but they are CPython implementation details rather than a standard part of the language, so relying on them is not recommended). If you need access to internal variables, one option is to yield all the required state along with each result; I leave this as an exercise.

def hofstadter_generator(s):
    a = s[:]
    while True:
        try:
            q = a[-a[-1]] + a[-a[-2]]
            a.append(q)
            yield q
        except IndexError:
            return

Note the bare return statement that ends the generator's iteration without returning any data. Internally, this raises a StopIteration exception.
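
One possible answer to the exercise of exposing internal state is to yield a snapshot of the state together with each value. This is only a sketch of that idea in Python 3; hofstadter_with_state is a hypothetical name, and yielding a tuple copy is my own design choice so that callers cannot modify the generator's list.

```python
from itertools import islice

def hofstadter_with_state(s):
    """Yield (q, state) pairs, where state is a snapshot of the terms so far."""
    a = list(s)
    while True:
        try:
            q = a[-a[-1]] + a[-a[-2]]
        except IndexError:
            return
        a.append(q)
        yield q, tuple(a)   # tuple snapshot, so callers cannot mutate `a`

pairs = list(islice(hofstadter_with_state([1, 1]), 3))
print(pairs)
# [(2, (1, 1, 2)), (3, (1, 1, 2, 3)), (3, (1, 1, 2, 3, 3))]
```

Copying the state on every step costs time and memory proportional to the sequence length, so for long runs the class-based iterator is probably the better fit.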

The next example comes from a Groupon interview question. Here we use two generators. The first implements a Bernoulli process, which is an infinite sequence of random Boolean values where True has probability p and False has probability q = 1 - p. The second implements a von Neumann extractor, which takes a Bernoulli process with 0 < p < 1 as input and returns another Bernoulli process with p = 0.5.

import random

def bernoulli_process(p):
    if p > 1.0 or p < 0.0:
        raise ValueError("p should be between 0.0 and 1.0.")
    while True:
        yield random.random() < p

def von_neumann_extractor(process):
    while True:
        # Draw two values; keep the first only when they differ, since
        # (True, False) and (False, True) are equally likely.
        x, y = process.next(), process.next()
        if x != y:
            yield x
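
The extractor works because the pairs (True, False) and (False, True) both occur with probability p(1 - p), so keeping the first value of a mismatched pair gives True and False equal weight. The sketch below checks this empirically in Python 3. Passing a seeded random.Random instance as rng is my addition to make the run repeatable; it is not part of the code above.

```python
import random
from itertools import islice

def bernoulli_process(p, rng):
    if not 0.0 <= p <= 1.0:
        raise ValueError("p should be between 0.0 and 1.0.")
    while True:
        yield rng.random() < p

def von_neumann_extractor(process):
    while True:
        x, y = next(process), next(process)
        if x != y:          # discard (True, True) and (False, False)
            yield x

rng = random.Random(12345)                 # seed chosen arbitrarily
biased = bernoulli_process(0.2, rng)       # heavily biased input
sample = list(islice(von_neumann_extractor(biased), 10000))
fraction_true = sum(sample) / len(sample)  # should come out close to 0.5

print(round(fraction_true, 2))
```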

Finally, generators are a very convenient tool for implementing discrete dynamical systems. The following example shows how the famous tent map can be implemented with a generator. (As an aside, notice how numerical inaccuracy creeps in and then grows exponentially, a key feature of dynamical systems such as the tent map.)

>>> def tent_map(mu, x0):
...     x = x0
...     while True:
...         yield x
...         x = mu * min(x, 1.0 - x)
...
>>> t = tent_map(2.0, 0.1)
>>> for __ in xrange(30):
...     print t.next()
...
[earlier output omitted]
0.400000000023
0.800000000047
0.399999999907
0.799999999814
0.400000000373
0.800000000745
0.39999999851
0.79999999702
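
To separate the true dynamics from rounding error, the same generator can be run with exact rational arithmetic. This is a sketch in Python 3 using fractions.Fraction; the choice of 40 steps is arbitrary. Starting from exactly 1/10, the orbit settles into the cycle 2/5, 4/5, while the floating-point orbit drifts away from it because 0.1 has no exact binary representation and the map doubles the initial error at every step.

```python
from fractions import Fraction
from itertools import islice

def tent_map(mu, x0):
    x = x0
    while True:
        yield x
        x = mu * min(x, 1 - x)

exact = list(islice(tent_map(2, Fraction(1, 10)), 40))
floats = list(islice(tent_map(2.0, 0.1), 40))

print(exact[-2:])                # [Fraction(2, 5), Fraction(4, 5)]
drift = abs(floats[-1] - 0.8)    # how far the float orbit has wandered
print(drift > 1e-9)              # True
```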

Another similar example is the Collatz sequence.

def collatz(n):
    yield n
    while n != 1:
        # Integer division keeps n an int: halve even n, map odd n to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        yield n

Note that in this example we again do not raise StopIteration manually, because it is raised automatically when control reaches the end of the function.

Please look at the usage:

>>> # If the Collatz conjecture is true then list(collatz(n)) for any n will
... # always terminate (though your machine might run out of memory first!)
>>> list(collatz(7))
[7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
>>> list(collatz(13))
[13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
>>> list(collatz(19))
[19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]

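
Because the generator produces one term at a time, statistics about a Collatz sequence can be computed without ever building the list. A small sketch in Python 3 follows; collatz_steps is a helper name I made up.

```python
def collatz(n):
    yield n
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        yield n

def collatz_steps(n):
    """Number of steps needed to reach 1, counted without building a list."""
    return sum(1 for _ in collatz(n)) - 1

print(collatz_steps(7))   # 16
print(max(collatz(7)))    # 52
```
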
Recursive Generators

Generators can be recursive, just like other functions. Let's look at a simple home-made version of itertools.permutations, which generates all permutations of a given list of items (in practice, use itertools.permutations, which is faster). The basic idea is simple: each element of the list is moved to the first position by swapping it with the first element, and the rest of the list is then permuted recursively.

def permutations(items):
    if len(items) == 0:
        yield []
    else:
        pi = items[:]
        for i in xrange(len(pi)):
            pi[0], pi[i] = pi[i], pi[0]
            for p in permutations(pi[1:]):
                yield [pi[0]] + p

>>> for p in permutations([1, 2, 3]):
...     print p
...
[1, 2, 3]
[1, 3, 2]
[2, 1, 3]
[2, 3, 1]
[3, 1, 2]
[3, 2, 1]

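
A quick way to gain confidence in a home-made enumerator is to compare it with the library version. The sketch below is a Python 3 port of the recursive generator (range instead of xrange) checked against itertools.permutations on a four-element list; it assumes the items are distinct.

```python
import itertools

def permutations(items):
    if len(items) == 0:
        yield []
    else:
        pi = list(items)
        for i in range(len(pi)):
            # Bring the i-th candidate to the front, then permute the rest.
            pi[0], pi[i] = pi[i], pi[0]
            for p in permutations(pi[1:]):
                yield [pi[0]] + p

ours = sorted(tuple(p) for p in permutations([1, 2, 3, 4]))
reference = sorted(itertools.permutations([1, 2, 3, 4]))

print(ours == reference)  # True
print(len(ours))          # 24
```
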
Generator Expressions

A generator expression lets you define a generator with a simple, single-line expression. It is very similar to a list comprehension. For example, the following code defines a generator that iterates over all perfect squares. Note that the result of a generator expression is an object of generator type, which implements both next and __iter__.

>>> g = (x ** 2 for x in itertools.count(1))
>>> g
<generator object <genexpr> at 0x1029a5fa0>
>>> next(g)
1
>>> next(g)
4
>>> iter(g)
<generator object <genexpr> at 0x1029a5fa0>
>>> iter(g) is g
True
>>> [g.next() for __ in xrange(10)]
[9, 16, 25, 36, 49, 64, 81, 100, 121, 144]

You can also implement a Bernoulli process with a generator expression, in this example with p = 0.4. When a generator expression needs another iterator as its loop counter, itertools.count is a good choice if the expression should produce an infinite sequence. Otherwise, xrange is a good choice.

>>> g = (random.random() < 0.4 for __ in itertools.count())
>>> [g.next() for __ in xrange(10)]
[False, False, False, True, True, False, True, False, False, True]

As mentioned earlier, a generator expression can be used anywhere an iterator is expected as an argument. For example, the following code computes the sum of the first ten perfect squares (0 through 81):

>>> sum(x ** 2 for x in xrange(10))
285
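
Two properties of generator expressions are worth seeing in action: they are evaluated lazily, and they can be consumed only once. A short Python 3 sketch (the variable names are mine):

```python
import itertools

# any() stops at the first True, so this terminates even though
# itertools.count(1) is infinite.
found = any(x * x > 1000 for x in itertools.count(1))

# A generator expression is used up as it is iterated: a second pass is empty.
squares = (x * x for x in range(10))
first_total = sum(squares)
second_total = sum(squares)

print(found)         # True
print(first_total)   # 285
print(second_total)  # 0
```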

More examples of generator expressions are given in the next section.
Itertools Module

The itertools module provides a collection of iterators that make it easy to work with permutations, combinations, Cartesian products, and other combinatorial structures.

Before moving on, note that none of the code given here is optimized; it serves only as illustration. In practice, you should avoid enumerating permutations yourself unless you have a good reason to, because the number of permutations grows extremely quickly (n! for n items).

Let's start with some interesting use cases. The first example shows how to write a common pattern: looping through all indices of a three-dimensional array, and looping through all indices with 0 ≤ i < j < k < n.

from itertools import combinations, product

n = 4
d = 3

def visit(*indices):
    print indices

# Loop through all possible indices of a 3-D array
for i in xrange(n):
    for j in xrange(n):
        for k in xrange(n):
            visit(i, j, k)

# Equivalent using itertools.product
for indices in product(*([xrange(n)] * d)):
    visit(*indices)

# Now loop through all indices 0 <= i < j < k < n
for i in xrange(n):
    for j in xrange(i + 1, n):
        for k in xrange(j + 1, n):
            visit(i, j, k)

# And equivalent using itertools.combinations
for indices in combinations(xrange(n), d):
    visit(*indices)

There are two advantages to using the enumerators provided by itertools: the code fits in a single line, and it extends easily to higher dimensions. I have not compared the performance of the for loops with that of the itertools versions; it probably depends heavily on n. If you are interested, measure it yourself.
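
The equivalence between the nested loops and the itertools one-liners can be checked mechanically. The sketch below uses Python 3 and collects the index tuples into lists instead of calling visit. It also uses product's repeat argument, which is an alternative spelling of the product(*([xrange(n)] * d)) idiom.

```python
from itertools import combinations, product

n, d = 4, 3

nested = [(i, j, k) for i in range(n) for j in range(n) for k in range(n)]
via_product = list(product(range(n), repeat=d))

increasing = [(i, j, k)
              for i in range(n)
              for j in range(i + 1, n)
              for k in range(j + 1, n)]
via_combinations = list(combinations(range(n), d))

print(nested == via_product)                     # True
print(increasing == via_combinations)            # True
print(len(via_product), len(via_combinations))   # 64 4
```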

The second example tackles an interesting math problem: using generator expressions, itertools.combinations, and itertools.permutations to compute the number of inversions of a permutation, and then summing the inversion numbers over all permutations of a list. As shown in OEIS A001809, the total equals n! n(n-1)/4. In practice it is far more efficient to use this formula directly than to run the code below, but I wrote this example to practice using the itertools enumerators.

import itertools
import math

def inversion_number(A):
    """Return the number of inversions in list A."""
    return sum(1 for x, y in itertools.combinations(xrange(len(A)), 2) if A[x] > A[y])

def total_inversions(n):
    """Return total number of inversions in permutations of n."""
    return sum(inversion_number(A) for A in itertools.permutations(xrange(n)))

Usage:

>>> [total_inversions(n) for n in xrange(10)]
[0, 0, 1, 9, 72, 600, 5400, 52920, 564480, 6531840]
>>> [math.factorial(n) * n * (n - 1) / 4 for n in xrange(10)]
[0, 0, 1, 9, 72, 600, 5400, 52920, 564480, 6531840]
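
For readers wondering where the formula comes from, here is a short counting argument, written in my own notation (S_n for the set of permutations of n elements, inv for the inversion number). Fix a pair of positions i < j. Swapping the values at those two positions matches every permutation in which the pair is inverted with one in which it is not, so each of the n(n-1)/2 pairs is inverted in exactly half of the n! permutations:

```latex
\sum_{\sigma \in S_n} \operatorname{inv}(\sigma)
  = \binom{n}{2} \cdot \frac{n!}{2}
  = \frac{n(n-1)}{2} \cdot \frac{n!}{2}
  = \frac{n!\, n(n-1)}{4}
```

For n = 3 this gives 6 · 3 · 2 / 4 = 9, which agrees with the brute-force count.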

In the third example, we compute the rencontres numbers by brute-force counting. The rencontres number for n and k is the number of permutations of n elements with exactly k fixed points. First, we write a function that uses a generator expression inside a sum to count the fixed points of a permutation. Then we use itertools.permutations and another generator expression inside a sum to count the permutations of n elements that have exactly k fixed points. Of course, this implementation is inefficient and is not recommended for real applications. Again, it is only meant as an example of how generator expressions and the itertools functions are used.

def count_fixed_points(p):
    """Return the number of fixed points of p as a permutation."""
    return sum(1 for x in p if p[x] == x)

def count_partial_derangements(n, k):
    """Return the number of permutations of n with k fixed points."""
    return sum(1 for p in itertools.permutations(xrange(n)) if count_fixed_points(p) == k)


# Usage:
>>> [count_partial_derangements(6, i) for i in xrange(7)]
[265, 264, 135, 40, 15, 0, 1]
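
As a cross-check on the brute-force count, the rencontres numbers can also be computed directly: choose which k elements stay fixed, then derange the remaining n - k. The sketch below does this in Python 3. The helper names derangements, binomial, and rencontres are mine, and the recurrence !m = (m - 1)(!(m - 1) + !(m - 2)) for derangement numbers is a standard identity rather than something taken from this article.

```python
from itertools import permutations
from math import factorial

def derangements(m):
    """Number of permutations of m items with no fixed point."""
    a, b = 1, 0                      # !0 = 1, !1 = 0
    for i in range(2, m + 1):
        a, b = b, (i - 1) * (a + b)
    return a if m == 0 else b

def binomial(n, k):
    return factorial(n) // (factorial(k) * factorial(n - k))

def rencontres(n, k):
    """Choose the k fixed points, then derange the remaining n - k items."""
    return binomial(n, k) * derangements(n - k)

def count_partial_derangements(n, k):
    """Brute-force count, ported to Python 3 for comparison."""
    return sum(1 for p in permutations(range(n))
               if sum(1 for x in p if p[x] == x) == k)

closed_form = [rencontres(6, k) for k in range(7)]
brute_force = [count_partial_derangements(6, k) for k in range(7)]

print(closed_form)                 # [265, 264, 135, 40, 15, 0, 1]
print(closed_form == brute_force)  # True
```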