Iterators and generators are both distinctive concepts in Python. An iterator can be thought of as a special object: each time you ask it for its next element, it returns that element. In terms of implementation, an iterable object must define the __iter__() method, and an iterator must define both the __iter__() method and the next() method.
Example
As usual, let's start by running the following code:
def add(s, x):
    return s + x

def gen():
    for i in range(4):
        yield i

base = gen()
for n in [1, 10]:
    base = (add(i, n) for i in base)

print list(base)
The output may be surprising: the result is [20, 21, 22, 23], not the [10, 11, 12, 13] I had expected. I was stuck on this for a long time and could not understand it. Later, after some pointers from Teacher Qi, it suddenly clicked; I had been really silly. Okay, let's take a look at generators in Python.
Iterator
To describe the generator, you must first describe the iterator.
Distinguishing iterable, iterator, and iteration
When it comes to iterators, there are several related concepts: iterable, iterator, and iteration. They look similar, but they are not the same. The distinction is as follows.
Iteration: a general concept meaning going through items one after another, such as looping over an array.
Iterable: an object that can be iterated over. This is a Python term covering a wide range of objects, and an iterable can be iterated repeatedly. An object is iterable if any of the following holds:
It can be used in a for loop: for i in iterable
It can be indexed, that is, it defines the __getitem__ method, as list and str do;
It defines the __iter__ method, which may return any iterator.
Calling iter(obj) on it returns an iterator.
Iterator: an iterator object. This is also a Python term. An iterator can be iterated only once and must satisfy the following iterator protocol:
It defines the __iter__ method, which must return the iterator itself.
It defines the next method (__next__ in Python 3.x), which returns the next value and raises StopIteration when no data is left.
It maintains its current state.
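The two definitions above can be checked directly. The following is a minimal sketch written for Python 3 (an assumption, since the sessions in this article use Python 2; in Python 3 the built-in next() replaces the .next() method). The Squares class is a hypothetical example that defines only __getitem__, to show that such an object is still iterable.

```python
class Squares:
    """Hypothetical iterable that defines only __getitem__ (no __iter__)."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        # iter() falls back to calling this with 0, 1, 2, ... until IndexError.
        if index >= self.n:
            raise IndexError(index)
        return index * index


squares = Squares(4)
it = iter(squares)                  # works even though __iter__ is not defined

is_own_iterator = iter(it) is it    # an iterator returns itself from __iter__
first_pass = list(it)               # consumes the iterator
second_pass = list(it)              # the iterator is already exhausted
fresh_pass = list(squares)          # the iterable can be iterated again

print(is_own_iterator, first_pass, second_pass, fresh_pass)
```

The iterator is used up after one pass, while the iterable hands out a new iterator every time it is iterated.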
First, str and list are iterables but not iterators:
In [3]: s = 'Hi'

In [4]: s.__getitem__  # defined, so str can be indexed

In [5]: s.next  # no next method
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
----> 1 s.next

AttributeError: 'str' object has no attribute 'next'

In [6]: l = [1, 2]  # likewise

In [7]: l.__iter__  # defined

In [8]: l.next
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
----> 1 l.next

AttributeError: 'list' object has no attribute 'next'

In [9]: iter(s) is s  # iter() does not return the object itself
Out[9]: False

In [10]: iter(l) is l  # likewise
Out[10]: False
An iterator behaves differently, as shown in the following session. Note also that an iterable supports multiple iterations, whereas an iterator can be iterated only once: after enough next calls it raises an exception.
In [13]: si = iter(s)

In [14]: si  # an iterator object

In [15]: si.__iter__  # has __iter__

In [16]: si.next  # has next

In [20]: si.__iter__() is si  # __iter__ returns itself
Out[20]: True
These examples illustrate the differences between the concepts.
Custom iterators and separating the data
At this point the shape of an iterator object is basically clear. The following shows how to make instances of a custom class iterators, namely by defining the __iter__ and next methods:
In [1]: %paste
class DataIter(object):

    def __init__(self, *args):
        self.data = list(args)
        self.ind = 0

    def __iter__(self):  # return itself
        return self

    def next(self):  # return the next item
        if self.ind == len(self.data):
            raise StopIteration
        else:
            data = self.data[self.ind]
            self.ind += 1
            return data
## -- End pasted text --

In [9]: d = DataIter(1, 2)

In [10]: for x in d:  # start iterating
   ....:     print x
   ....:
1
2

In [13]: d.next()  # can only be iterated once; using it again raises an exception
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
----> 1 d.next()

in next(self)
     10     def next(self):
     11         if self.ind == len(self.data):
---> 12             raise StopIteration
     13         else:
     14             data = self.data[self.ind]
As the next function shows, data can only be retrieved in the forward direction, one item at a time, and it cannot be retrieved again. Can this be solved?
We know that an iterator can be iterated only once, but an iterable has no such restriction. So we can separate the iterator from the data and define an iterable and an iterator as two classes:
class Data(object):  # only an iterable, not an iterator

    def __init__(self, *args):
        self.data = list(args)

    def __iter__(self):  # does not return itself
        return DataIterator(self)


class DataIterator(object):  # the iterator

    def __init__(self, data):
        self.data = data.data
        self.ind = 0

    def __iter__(self):
        return self

    def next(self):
        if self.ind == len(self.data):
            raise StopIteration
        else:
            data = self.data[self.ind]
            self.ind += 1
            return data


if __name__ == '__main__':
    d = Data(1, 2, 3)
    for x in d:
        print x,
    for x in d:
        print x,
The output is:
1 2 3 1 2 3
The data can now be reused, because every iteration gets a new DataIterator while the underlying data stays the same. This implementation pattern is very common. For example, xrange is implemented with this separation of data and iteration, and it also saves a lot of memory, as shown below:
In [8]: sys.getsizeof(range(1000000))
Out[8]: 8000072

In [9]: sys.getsizeof(xrange(1000000))
Out[9]: 40
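For readers on Python 3, where range plays the role that xrange plays in Python 2, the same two properties can be sketched as follows. The exact byte counts depend on the platform and interpreter build, so the sketch only compares sizes rather than stating them.

```python
import sys

lazy = range(1000000)      # Python 3 range, the counterpart of Python 2 xrange
eager = list(lazy)         # materializes every element in memory

lazy_is_smaller = sys.getsizeof(lazy) < sys.getsizeof(eager)

# range is an iterable, not an iterator: it can be traversed repeatedly,
# and every iter() call hands out a new, independent iterator.
reusable = sum(lazy) == sum(lazy)
separate_iterators = iter(lazy) is not iter(lazy)

print(lazy_is_smaller, reusable, separate_iterators)
```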
One more small tip: a for loop can consume an iterator object because for calls next for us and handles the StopIteration exception.
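What for does behind the scenes can be written out by hand. This is a rough sketch in Python 3 syntax, not the interpreter's actual implementation; manual_for is a hypothetical helper introduced only for illustration.

```python
def manual_for(iterable, body):
    """Roughly what `for x in iterable: body(x)` does (hypothetical helper)."""
    iterator = iter(iterable)        # 1. ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)    # 2. fetch the next element
        except StopIteration:        # 3. exhaustion ends the loop quietly
            break
        body(item)


collected = []
manual_for([1, 2, 3], collected.append)
print(collected)
```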
That covers iterators. Next comes a special and more elegant kind of iterator: the generator.
Generator
The first thing to note is that the generator is also an iterator because it complies with the iterator protocol.
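This claim can be checked against the protocol described earlier. The sketch below uses Python 3 syntax (an assumption; the article's sessions are Python 2), and countdown is a hypothetical generator function used only for the check.

```python
def countdown(n):
    """Hypothetical generator function used only for this check."""
    while n > 0:
        yield n
        n -= 1


g = countdown(2)
returns_itself = iter(g) is g       # __iter__ returns the generator itself
values = [next(g), next(g)]         # next() produces successive values

try:
    next(g)
    exhausted = False
except StopIteration:               # raised once the function body finishes
    exhausted = True

print(returns_itself, values, exhausted)
```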
Two creation methods
Functions that contain yield
A generator function differs from an ordinary function in only one way: it uses yield instead of return. yield is syntactic sugar that implements the iterator protocol internally and also lets the function be suspended. For example:
def gen():
    print 'begin: generator'
    i = 0
    while True:
        print 'before return ', i
        yield i
        i += 1
        print 'after return ', i

a = gen()

In [10]: a  # calling gen() only returns a generator object

In [11]: a.next()  # execution starts here
begin: generator
before return  0
Out[11]: 0

In [12]: a.next()
after return  1
before return  1
Out[12]: 1
First, note that the while True loop is nothing to worry about: it only advances one step at a time.
The results show that:
Calling gen() does not execute the function body; it only returns a generator object.
The function body starts running only when a.next() is called for the first time. When yield returns a value, the function is suspended and its current namespace is preserved. On the next call, execution resumes from the line after yield.
The generator function also runs in another situation: when the generator's elements are retrieved, for example by list(generator). Put simply, it runs only when data is needed.
In [15]: def func():
   ....:     print 'begin'
   ....:     for i in range(4):
   ....:         yield i
   ....:

In [16]: a = func()

In [17]: list(a)  # retrieving the data starts execution
begin
Out[17]: [0, 1, 2, 3]
yield also has more advanced uses, which I will study later.
Generator expression
List comprehensions are very convenient. For example, the odd numbers below 10 can be computed as follows:
[i for i in range(10) if i % 2]
Generator expressions were introduced in Python 2.4 and look very similar: just replace [] with ().
In [18]: a = (i for i in range(4))

In [19]: a
Out[19]: <generator object <genexpr> at 0x7f40c2cfe410>

In [20]: a.next()
Out[20]: 0
As this shows, a generator expression creates a generator, and a generator is evaluated lazily: a value is computed only when it is retrieved.
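Lazy evaluation can be made visible by recording when each value is actually computed. This is a small sketch in Python 3 syntax; traced and log are hypothetical names introduced for the illustration.

```python
log = []


def traced(i):
    """Hypothetical helper that records when a value is actually computed."""
    log.append(i)
    return i * i


lazy = (traced(i) for i in range(3))
log_after_creation = list(log)      # nothing has been computed yet

first = next(lazy)
log_after_first = list(log)         # exactly one element has been computed

rest = list(lazy)
log_at_end = list(log)              # the remaining elements were computed on demand

print(log_after_creation, log_after_first, log_at_end)
```

Creating the generator expression computes nothing; each call for a value runs just enough of the loop to produce it.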
An earlier article about Python default parameters included an application of this. Its last example is as follows:
def multipliers():
    return (lambda x: i * x for i in range(4))  # changed to a generator expression

print [m(2) for m in multipliers()]
This means that the for loop in the generator expression advances only as each m(2) is executed, starting from 0, and i * x is then evaluated with the current i, so the problem described in that article does not occur.
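The difference between the list version and the generator version can be compared side by side. The sketch below uses Python 3 syntax, and the names multipliers_list and multipliers_gen are hypothetical; the list variant is assumed to match the problem that article discussed.

```python
def multipliers_list():
    return [lambda x: i * x for i in range(4)]    # all lambdas are built up front


def multipliers_gen():
    return (lambda x: i * x for i in range(4))    # lambdas are built one at a time


eager = [m(2) for m in multipliers_list()]        # i is already 3 for every lambda
lazy = [m(2) for m in multipliers_gen()]          # each lambda is called while its i is current
stored = [m(2) for m in list(multipliers_gen())]  # exhaust the generator first, then call

print(eager, lazy, stored)
```

The generator version only gives distinct results because each lambda is called before the loop advances. Collecting the lambdas first, as in the last line, brings the late-binding behavior back.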
This lazy evaluation is very useful. The above is one application. As 2gua puts it:
Lazy evaluation can be pictured as a faucet. Turn it on when you need water and turn it off when you are done; the data stream pauses at that point. Turn it on again when needed and the data keeps flowing from where it stopped, with no need to start from scratch.
In essence this is the same as an iterator: the data is not all taken at once, only when it is needed.
Back to the example
By now the initial example should be mostly clear. The core statement is:
for n in [1, 10]:
    base = (add(i, n) for i in base)
When list(base) is executed, retrieval begins and the generators start running. The key is that the loop runs twice, so there are two generator expressions. Keep this firmly in mind.
By the time the generators actually run, n is 10, not 1. As mentioned in the article above, add(i, n) is bound to the variable n, not to the value n had when the expression was created.
Conceptually, the first generator expression evaluates to base = (10 + 0, 10 + 1, 10 + 2, 10 + 3). This is the result of the first layer (shown schematically; the values are (10, 11, 12, 13)). The second then gives base = (10 + 10, 11 + 10, 12 + 10, 13 + 10), so the final result is [20, 21, 22, 23].
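The late binding of n can be contrasted with a variant that fixes n when each generator is created. The following is a sketch in Python 3 syntax; add_to_all is a hypothetical helper that is not part of the original example.

```python
def add(s, x):
    return s + x


def gen():
    for i in range(4):
        yield i


# Late binding: both generator expressions look up n only when they finally run.
base = gen()
for n in [1, 10]:
    base = (add(i, n) for i in base)
late = list(base)


def add_to_all(source, n):
    """Hypothetical helper: n is a local here, so each call keeps its own value."""
    return (add(i, n) for i in source)


# Early binding: each layer remembers the n it was created with.
base = gen()
for n in [1, 10]:
    base = add_to_all(base, n)
early = list(base)

print(late, early)
```

In the first loop both layers see n = 10 and produce [20, 21, 22, 23]. In the second, the layers add 1 and then 10, giving [11, 12, 13, 14].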
You can step through the execution yourself on pythontutor.
Summary
This article mainly covered the following points:
1. The concepts of iterable, iterator, and iteration
2. The iterator protocol
3. Separating a custom iterable object from its iterator so that the data can be reused
4. Generators: a special kind of iterator that implements the iterator protocol internally
What matters most is understanding these concepts clearly; once you do, the rest falls into place. Writing this also deepened what I already knew.
For example, the ordinary list is implemented by separating the iterator from the iterable: it is an iterable object but not an iterator. It is similar to xrange, but not the same.
Reading source code is proving more and more important. If anything here is wrong, please correct me.
Reference
http://www.shutupandship.com/2012/01/understanding-python-iterables-and.html
http://www.learningpython.com/2009/02/23/iterators-iterables-and-generators-oh-my/
http://stackoverflow.com/questions/9884132/what-exactly-are-pythons-iterator-iterable-and-iteration-protocols
http://python.jobbole.com/81881/