Python Iterators and Generators

Source: Internet
Author: User
Tags: iterable
Example

As usual, start with some code:

    def add(s, x):
        return s + x

    def gen():
        for i in range(4):
            yield i

    base = gen()
    for n in [1, 10]:
        base = (add(i, n) for i in base)

    print list(base)

The output of this code is a real brain-teaser: the result is [20, 21, 22, 23], not [10, 11, 12, 13] as I first expected. I was stuck on it for half a day without understanding it, until Teacher Qi gave me a small hint and it suddenly made sense. So here is a brief summary of generators in Python.

Iterators (iterator)

To talk about generators, we first have to talk about iterators.
Distinguishing iterable, iterator, and iteration
When it comes to iterators, three concepts need to be told apart: iterable, iterator, and iteration. They look alike but are not the same thing. Here is a brief distinction.

Iteration: a general concept meaning to go through items one after another, for example looping over an array.
Iterable: a Python term for an object that can be iterated over, and iterated over repeatedly. It is a broad category; an object that satisfies any one of the following is iterable:
    It can be used in a for loop: for i in iterable
    It can be accessed by index, that is, it defines the __getitem__ method, such as list and str
    It defines the __iter__ method, which may return any iterator
    It can be passed to iter(obj), which returns an iterator
Iterator: an iterator object, also a Python term. It can be iterated over only once, and it must satisfy the following iterator protocol:
    It defines the __iter__ method, which must return the object itself
    It defines the next method (__next__ in Python 3.x), which returns the next value and raises StopIteration when there is no more data
    It maintains its current state
First, str and list are iterable but are not iterators:

    In [3]: s = 'hi'

    In [4]: s.__getitem__
    Out[4]: <method-wrapper '__getitem__' of str object at 0x...>

    In [5]: s.next  # there is no next method
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    ----> 1 s.next

    AttributeError: 'str' object has no attribute 'next'

    In [6]: l = [1, 2]  # likewise

    In [7]: l.__iter__
    Out[7]: <method-wrapper '__iter__' of list object at 0x...>

    In [8]: l.next
    ---------------------------------------------------------------------------
    AttributeError                            Traceback (most recent call last)
    ----> 1 l.next

    AttributeError: 'list' object has no attribute 'next'

    In [9]: iter(s) is s  # iter() does not return the object itself
    Out[9]: False

    In [10]: iter(l) is l  # likewise
    Out[10]: False

An iterator is different, as shown below. In addition, an iterable supports multiple iterations, whereas an iterator that has been exhausted by repeated next calls raises an exception on any further call: it can be iterated over only once.

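    In [13]: si = iter(s)

    In [14]: si
    Out[14]: <iterator object at 0x...>

    In [15]: si.__iter__  # has __iter__
    Out[15]: <method-wrapper '__iter__' of iterator object at 0x...>

    In [16]: si.next  # has next
    Out[16]: <method-wrapper 'next' of iterator object at 0x...>

    In [17]: si.__iter__() is si  # __iter__ returns itself
    Out[17]: True
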
These few examples make the differences between the concepts clear.
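The same checks can be repeated in Python 3, where the protocol method is __next__ and the built-in next() calls it. The following is only a minimal sketch: it assumes Python 3 and uses the standard collections.abc module, neither of which appears in the session above.

```python
from collections.abc import Iterable, Iterator

s = "hi"
si = iter(s)

# A str is iterable but is not itself an iterator.
s_is_iterable = isinstance(s, Iterable)      # True
s_is_iterator = isinstance(s, Iterator)      # False

# The object returned by iter() is an iterator, and its __iter__ returns itself.
si_is_iterator = isinstance(si, Iterator)    # True
si_returns_self = iter(si) is si             # True

# An iterator can be consumed only once; the iterable can be iterated again.
first_pass = list(si)       # ['h', 'i']
second_pass = list(si)      # []
fresh_pass = list(iter(s))  # ['h', 'i']
```

The isinstance checks are a convenience; the protocol itself only requires the methods listed earlier.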

Custom iterators and separating the iterator from the data

With that, the iterator object is mostly covered. Next is the general idea of how to make instances of a custom class iterators, which simply means defining the __iter__ and next methods:

    In [1]: %paste
    class DataIter(object):

        def __init__(self, *args):
            self.data = list(args)
            self.ind = 0

        def __iter__(self):  # return itself
            return self

        def next(self):  # return the next item
            if self.ind == len(self.data):
                raise StopIteration
            else:
                data = self.data[self.ind]
                self.ind += 1
                return data
    ## -- End pasted text --

    In [9]: d = DataIter(1, 2)

    In [10]: for x in d:  # start iterating
       ....:     print x
       ....:
    1
    2

    In [11]: d.next()  # can only be iterated once; using it again raises an exception
    ---------------------------------------------------------------------------
    StopIteration                             Traceback (most recent call last)
    ----> 1 d.next()

    in next(self)
          def next(self):
              if self.ind == len(self.data):
    --->          raise StopIteration
              else:
                  data = self.data[self.ind]

    StopIteration:

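For readers on Python 3, the class above needs one change: the method is named __next__, and callers use the built-in next(). What follows is a sketch of such a port, not the original author's code; the variable names at the bottom are only for illustration.

```python
class DataIter:
    """Iterator over the given arguments (Python 3 port of the class above)."""

    def __init__(self, *args):
        self.data = list(args)
        self.ind = 0

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Raise StopIteration once every item has been handed out.
        if self.ind == len(self.data):
            raise StopIteration
        value = self.data[self.ind]
        self.ind += 1
        return value


d = DataIter(1, 2)
first_pass = list(d)    # [1, 2]
second_pass = list(d)   # [] because the iterator is already exhausted

try:
    next(d)
    exhausted = False
except StopIteration:
    exhausted = True
```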
As the next function shows, data can only be fetched moving forward, one item at a time, and cannot be fetched again. Can this be solved?

We know that an iterator can be iterated over only once, but an iterable object has no such limitation. So we can separate the iterator from the data and define an iterable and an iterator separately, as follows:

    class Data(object):  # only an iterable, not an iterator

        def __init__(self, *args):
            self.data = list(args)

        def __iter__(self):  # does not return itself
            return DataIterator(self)


    class DataIterator(object):  # an iterator

        def __init__(self, data):
            self.data = data.data
            self.ind = 0

        def __iter__(self):
            return self

        def next(self):
            if self.ind == len(self.data):
                raise StopIteration
            else:
                data = self.data[self.ind]
                self.ind += 1
                return data


    if __name__ == '__main__':
        d = Data(1, 2, 3)
        for x in d:
            print x,
        for x in d:
            print x,

The output is:

    1 2 3 1 2 3

As you can see, the data can be reused, because a new DataIterator is returned each time. This kind of implementation is common. For example, xrange separates the data from the iteration in this way, and it is also very memory-efficient, as follows:

    In [7]: import sys

    In [8]: sys.getsizeof(range(1000000))
    Out[8]: 8000072

    In [9]: sys.getsizeof(xrange(1000000))
    Out[9]: 40
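In Python 3 there is no xrange; the built-in range plays that role. The sketch below assumes Python 3 and compares a range with a fully built list. Exact byte counts depend on the interpreter build, so only the comparison is shown, not the numbers.

```python
import sys

lazy = range(1000000)          # computes values on demand
eager = list(range(1000000))   # stores every element

# Exact sizes depend on the interpreter build, so only the comparison is checked.
lazy_is_smaller = sys.getsizeof(lazy) < sys.getsizeof(eager)

# range is an iterable, not an iterator: each iter() call gives a fresh iterator.
small = range(3)
first_pass = list(small)    # [0, 1, 2]
second_pass = list(small)   # [0, 1, 2]
separate_iterators = iter(small) is not iter(small)
```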

One more small tip: the reason a for loop can iterate over an iterator object is that for calls next on our behalf and handles the StopIteration it receives.
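What for does behind the scenes can be written out by hand. This is a rough sketch in Python 3; manual_for is a hypothetical helper name used only for illustration, and the real for statement is implemented inside the interpreter rather than like this.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, calling next() by hand."""
    collected = []
    iterator = iter(iterable)      # for first asks the iterable for an iterator
    while True:
        try:
            item = next(iterator)  # then calls next() repeatedly
        except StopIteration:      # and stops quietly when this is raised
            break
        collected.append(item)     # stands in for the loop body
    return collected


result = manual_for([1, 2, 3])   # [1, 2, 3]
```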

That is roughly all for iterators. Next comes a special, more elegant kind of iterator: the generator.

Generators (generator)

The first thing to be clear about is that a generator is also an iterator, because it follows the iterator protocol.

Two ways to create a generator

Functions that contain yield

A generator function differs from a normal function in only one small way: return is replaced by yield. Here yield is syntactic sugar that implements the iterator protocol internally while keeping the function's state suspended. As follows:

    def gen():
        print 'begin: generator'
        i = 0
        while True:
            print 'before return ', i
            yield i
            i += 1
            print 'after return ', i

    a = gen()

    In [10]: a  # just returns an object
    Out[10]: <generator object gen at 0x...>

    In [11]: a.next()  # execution starts
    begin: generator
    before return  0
    Out[11]: 0

    In [12]: a.next()
    after return  1
    before return  1
    Out[12]: 1

Do not panic at the sight of while True: the loop runs only one step at a time.
The results show several things:

Calling gen() does not actually execute the function body; it simply returns a generator object.
The function body runs only on the first a.next(). It runs up to yield, returns a value, and is then suspended with its current namespace preserved. It then waits for the next call and resumes from the line after yield.
A generator function is also executed when the elements of the generator are retrieved, for example with list(generator). In other words, it runs only when the data is needed.

    In [15]: def func():
       ....:     print 'begin'
       ....:     for i in range(4):
       ....:         yield i
       ....:

    In [16]: a = func()

    In [17]: list(a)  # retrieving the data starts execution
    begin
    Out[17]: [0, 1, 2, 3]
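The suspend-and-resume behavior described above can also be observed without print, by recording events in a list. This is a minimal sketch in Python 3, so next(a) replaces a.next(); the log parameter is an addition for illustration and is not part of the original function.

```python
def gen(log):
    log.append("begin")
    i = 0
    while True:
        log.append("before yield %d" % i)
        yield i                      # execution is suspended here
        i += 1
        log.append("after yield %d" % i)


log = []
a = gen(log)
log_after_call = list(log)        # [] : calling gen() runs none of the body

first = next(a)                   # 0
log_after_first = list(log)       # ['begin', 'before yield 0']

second = next(a)                  # 1
log_after_second = list(log)      # adds 'after yield 1', 'before yield 1'
```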

yield also has other, more advanced uses, which I will learn gradually later.

Generator expressions

List comprehensions are very handy. For example, to get the odd numbers below 10:

    [i for i in range(10) if i % 2]

Generator expressions were introduced in Python 2.4. The form is very similar: [] is simply replaced by ().

    In [18]: a = (i for i in range(4))

    In [19]: a
    Out[19]: <generator object <genexpr> at 0x7f40c2cfe410>

    In [20]: a.next()
    Out[20]: 0


You can see that a generator expression creates a generator, and that it is lazily evaluated: values are computed only when they are retrieved.
The earlier article, "Python default parameter problems and an application", ended with an example, modified here to use a generator:

    def multipliers():
        return (lambda x: i * x for i in range(4))  # changed to a generator

    print [m(2) for m in multipliers()]

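Here the for inside the generator expression advances only when the outer loop asks for the next m, starting from i = 0, and m(2) is called right away while i still has that value. So i * x uses the expected i each time, and the problem from that article does not occur.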
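The difference is easier to see side by side. The sketch below is written for Python 3 and assumes that the version in the earlier article returned a list of lambdas; that article is not reproduced here, so multipliers_list is a guess at its shape, and both function names are hypothetical.

```python
def multipliers_list():
    # All four lambdas are created before any is called, so each sees i == 3.
    return [lambda x: i * x for i in range(4)]


def multipliers_gen():
    # Each lambda is created only when the consumer asks for the next item.
    return (lambda x: i * x for i in range(4))


from_list = [m(2) for m in multipliers_list()]   # [6, 6, 6, 6]
from_gen = [m(2) for m in multipliers_gen()]     # [0, 2, 4, 6]

# Consuming the generator first and calling later brings the problem back.
stored = list(multipliers_gen())
called_later = [m(2) for m in stored]            # [6, 6, 6, 6]
```

The last two lines show that the generator version works only because each lambda is called before the loop variable moves on; the lambdas still bind the variable i, not its value.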

Lazy evaluation is very useful, and the above is one application. As 2gua put it:

Lazy evaluation is like a faucet: you open it when you need water and close it when you have enough. The data flow pauses at that point, and when you open the faucet again the data keeps coming from where it stopped, without restarting the loop from the beginning.
In essence this is the same as an iterator: the data is not all fetched at once, but taken only when needed.

Back to the example

Having read this far, the example at the beginning should be somewhat clearer. The core statement is:

    for n in [1, 10]:
        base = (add(i, n) for i in base)

Retrieval begins only when list(base) is executed, and only then do the generators start to run. The key point is that the loop runs twice, which means two generator expressions are created. This must be firmly grasped.

By the time the generators start to run, n = 10 rather than 1. As mentioned in the article above, add(i, n) binds the variable n, not the value it had at that moment.

So the first generator expression effectively evaluates as base = (10 + 0, 10 + 1, 10 + 2, 10 + 3). This is the result of the first loop (a schematic representation; it yields 10, 11, 12, 13). The second then evaluates as base = (10 + 10, 11 + 10, 12 + 10, 13 + 10), and the final result is [20, 21, 22, 23].

You can step through the execution by hand on pythontutor.
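To make the explanation concrete, here is the opening example again next to one possible fix. It is a sketch in Python 3, taking the loop list to be [1, 10] as the walkthrough implies; add_all is a hypothetical helper, not something from the original code, and it is only one of several ways to capture the value of n at loop time.

```python
def add(s, x):
    return s + x


def gen():
    for i in range(4):
        yield i


# Late binding: both generator expressions look up n when they finally run.
base = gen()
for n in [1, 10]:
    base = (add(i, n) for i in base)
late = list(base)            # [20, 21, 22, 23]


def add_all(source, n):
    # n is a parameter here, so each generator keeps its own value.
    return (add(i, n) for i in source)


base = gen()
for n in [1, 10]:
    base = add_all(base, n)
early = list(base)           # [11, 12, 13, 14]
```

Passing n through a function call gives each generator expression its own n, so the first adds 1 and the second adds 10.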

Summary

This article mainly covered the following points:

1. The concepts of iterable, iterator, and iteration
2. The iterator protocol
   Custom iterable objects are separated from their iterators so that the data can be reused
3. Generators: special iterators that implement the iterator protocol internally

For this topic, getting these few concepts straight is the key; once they are understood, the rest follows naturally. It also deepens the understanding of things learned earlier.
For example, the built-in list implements the iterable and the iterator separately: it is itself an iterable object but not an iterator, similar to xrange, though not identical.
The more I learn, the more I see the importance of reading the source. If anything here is written incorrectly, please correct me.

