When learning Python's data structures, several concepts tend to blur together: containers (container), iterables (iterable), iterators (iterator), generators (generator), and list/set/dict comprehensions (list, set, dict comprehension). Beginners are easily confused by them, so this article tries to sort out these concepts and the relationships between them.
Container (Container)
A container is a data structure that organizes multiple elements together. The elements of a container can be fetched one by one through iteration, and the keywords in and not in can be used to determine whether an element is contained in the container. Such data structures usually store all of their elements in memory (there are exceptions where not all elements are held in memory, such as iterators and generator objects). In Python, common container objects are:
- list, deque, ...
- set, frozenset, ...
- dict, defaultdict, OrderedDict, Counter, ...
- tuple, namedtuple, ...
- str
Containers are easy to understand because you can think of one as a box, a house, or a cupboard that can hold anything. Technically, an object can be considered a container when it can be asked whether it contains a given element. Lists, sets, and tuples, for example, are all container objects:
>>> assert 1 in [1, 2, 3]      # lists
>>> assert 4 not in [1, 2, 3]
>>> assert 1 in {1, 2, 3}      # sets
>>> assert 4 not in {1, 2, 3}
>>> assert 1 in (1, 2, 3)      # tuples
>>> assert 4 not in (1, 2, 3)
Asking whether an element is in a dict checks the dict's keys:
>>> d = {1: 'foo', 2: 'bar', 3: 'qux'}
>>> assert 1 in d
>>> assert 'foo' not in d  # 'foo' is a value, not a key, of the dict
Ask whether a substring is in a string:
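A minimal sketch of that check, assuming an arbitrary sample string (the value 'foobar' is chosen only for illustration):

```python
s = 'foobar'  # sample string, chosen only for illustration

# For strings, `in` tests for substrings, not just single characters.
has_b = 'b' in s
has_x = 'x' in s
has_foo = 'foo' in s
```

Here has_b and has_foo are True and has_x is False.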
Although the vast majority of containers provide some way to get at each of their elements, that ability does not come from being a container; it comes from being iterable. Not all containers are iterable. A Bloom filter, for example, can be used to test whether an element is contained in it, but you cannot retrieve the individual values from it, because a Bloom filter does not store the elements at all. Instead, it maps each element to positions in an array through hash functions.
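To make the Bloom filter case concrete, here is a toy sketch. The class name TinyBloomFilter, its parameters, and the choice of salted SHA-256 hashes are all assumptions made for illustration; this is not a production-quality filter. It supports in but offers no way to iterate:

```python
import hashlib


class TinyBloomFilter:
    """Hypothetical toy Bloom filter: supports `in`, but not iteration."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size  # only bits are stored, never the items

    def _positions(self, item):
        # Derive several array positions from salted SHA-256 digests.
        return [
            int(hashlib.sha256(f"{salt}:{item}".encode()).hexdigest(), 16) % self.size
            for salt in range(self.num_hashes)
        ]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item):
        # May report false positives, but never false negatives.
        return all(self.bits[pos] for pos in self._positions(item))


bf = TinyBloomFilter()
bf.add("foo")
found = "foo" in bf  # membership test works

try:
    iter(bf)  # no __iter__ (and no __getitem__), so this fails
    iterable = True
except TypeError:
    iterable = False
```

Because the class defines __contains__ but neither __iter__ nor __getitem__, it is a container without being an iterable.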
Iterable objects (iterable)
As mentioned above, many containers are iterables, but many other objects are iterable too, such as open files and open sockets. Any object that can return an iterator can be called an iterable. That may sound a bit confusing, so let's look at an example:
>>> x = [1, 2, 3]
>>> y = iter(x)
>>> z = iter(x)
>>> next(y)
1
>>> next(y)
2
>>> next(z)
1
>>> type(x)
<class 'list'>
>>> type(y)
<class 'list_iterator'>
Here x is an iterable. "Iterable" and "container" are informal terms that do not refer to a specific data type: a list is an iterable, a dict is an iterable, and a set is an iterable too. y and z are two independent iterators. Each iterator holds internal state that records the current position of the iteration, so that the next call can fetch the correct element. Iterators have concrete iterator types, for example list_iterator and set_iterator. An iterable implements the __iter__ method, which returns an iterator object.
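The same behaviour can be sketched with a user-defined iterable. The Playlist class below is hypothetical and exists only for illustration; its __iter__ simply delegates to the iterator of an internal list:

```python
class Playlist:
    """Hypothetical iterable: hands out a fresh iterator on every iter() call."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __iter__(self):
        # Delegate to the list's own iterator; each call starts from the beginning.
        return iter(self._songs)


playlist = Playlist(["a", "b", "c"])
first = iter(playlist)
second = iter(playlist)
next(first)  # advances only `first`; `second` is unaffected
```

After this, next(first) yields "b" while next(second) still yields "a", because the two iterators keep separate state.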
When you run this code:
x = [1, 2, 3]
for elem in x:
    ...
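What actually happens is that the for statement first obtains an iterator from x and then repeatedly asks it for the next element. If you disassemble the code, you can see that the interpreter executes the GET_ITER instruction, which is equivalent to calling iter(x), and the FOR_ITER instruction, which does the equivalent of calling next() repeatedly to fetch each element from the iterator. The next() calls themselves do not show up in the bytecode, because this step is optimized inside the interpreter. The listing below comes from an older Python 3 release, so the exact instructions may differ on your version.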
>>> import dis
>>> x = [1, 2, 3]
>>> dis.dis('for _ in x: pass')
  1           0 SETUP_LOOP              14 (to 17)
              3 LOAD_NAME                0 (x)
              6 GET_ITER
        >>    7 FOR_ITER                 6 (to 16)
             10 STORE_NAME               1 (_)
             13 JUMP_ABSOLUTE            7
        >>   16 POP_BLOCK
        >>   17 LOAD_CONST               0 (None)
             20 RETURN_VALUE
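Written out by hand, the for loop behaves roughly like the following sketch. The names iterator and seen are only for illustration, and the real interpreter does this work in C rather than with an explicit try/except:

```python
x = [1, 2, 3]
seen = []

iterator = iter(x)  # roughly what GET_ITER does
while True:
    try:
        elem = next(iterator)  # roughly what FOR_ITER does on each pass
    except StopIteration:
        break  # the iterator is exhausted, so the loop ends
    seen.append(elem)  # stands in for the loop body
```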
Iterators (iterator)
So what is an iterator? It is a stateful object that returns the next value in the container when you call next() on it. Any object that implements the __iter__ and __next__() methods (next() in Python 2) is an iterator: __iter__ returns the iterator itself, and __next__ returns the next value in the container, raising a StopIteration exception when there are no more elements. How these methods are implemented does not matter.
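Both halves of this protocol can be observed on a built-in list iterator. A short sketch, with variable names chosen for illustration:

```python
it = iter([10, 20])

same_object = iter(it) is it  # __iter__ on an iterator returns the iterator itself
first = next(it)
second = next(it)

try:
    next(it)  # no elements left
    exhausted = False
except StopIteration:
    exhausted = True
```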
So an iterator works like a value factory: each time you ask it for the next value, it returns one. There are many examples of iterators; the functions in itertools, for instance, all return iterator objects.
Generating an infinite sequence:
>>> from itertools import count
>>> counter = count(start=13)
>>> next(counter)
13
>>> next(counter)
14
Generating an infinite sequence from a finite sequence:
>>> from itertools import cycle
>>> colors = cycle(['red', 'white', 'blue'])
>>> next(colors)
'red'
>>> next(colors)
'white'
>>> next(colors)
'blue'
>>> next(colors)
'red'
Generating a finite sequence from an infinite sequence:
>>> from itertools import islice
>>> colors = cycle(['red', 'white', 'blue'])  # infinite
>>> limited = islice(colors, 0, 4)            # finite
>>> for x in limited:
...     print(x)
red
white
blue
red
To get a more intuitive feel for what happens inside an iterator, let's write a custom iterator, using the Fibonacci sequence as an example:
class Fib:
    def __init__(self):
        self.prev = 0
        self.curr = 1

    def __iter__(self):
        return self

    def __next__(self):
        value = self.curr
        self.curr += self.prev
        self.prev = value
        return value

>>> f = Fib()
>>> list(islice(f, 0, 10))
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
Fib is both an iterable (because it implements the __iter__ method) and an iterator (because it implements the __next__ method). The instance variables prev and curr maintain the state inside the iterator. Each call to the next() method does two things:
- Modifies the state for the next call to next()
- Produces the return value for the current call
An iterator is like a lazy factory: it produces a value only when someone asks for one, and sits idle until the next call.
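Fib never runs out of values. As a complementary sketch, here is a finite iterator that signals its end by raising StopIteration. The Countdown class is hypothetical and written only for illustration:

```python
class Countdown:
    """Hypothetical finite iterator: counts from `start` down to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # tells for loops and list() to stop
        value = self.current
        self.current -= 1  # update state for the next call
        return value


values = list(Countdown(3))
```

list() keeps calling next() until StopIteration is raised, so values ends up as [3, 2, 1].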
Generator (Generator)
Generators are one of the most attractive features of the Python language. A generator is a special kind of iterator, but a more elegant one. You no longer need to write a class with __iter__() and __next__() methods as above; all it takes is the yield keyword. Every generator is an iterator (but not vice versa), so any generator also produces its values lazily. A Fibonacci sequence implemented with a generator looks like this:
def fib():
    prev, curr = 0, 1
    while True:
        yield curr
        prev, curr = curr, curr + prev

>>> f = fib()
>>> list(islice(f, 0, 10))
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
fib is an ordinary Python function. What makes it special is that its body uses the yield keyword rather than return, and that the return value of the function is a generator object. When f = fib() is executed, a generator object is returned, but none of the code in the function body runs yet. The code is only executed when next() is called, explicitly or implicitly.
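This deferred execution can be made visible with a small sketch. The numbers function and the log list are hypothetical names used only for illustration:

```python
log = []


def numbers():
    log.append("started")  # runs only on the first next() call
    yield 1
    log.append("resumed")  # runs only on the second next() call
    yield 2


gen = numbers()
log_after_call = list(log)  # snapshot: calling numbers() ran no body code

first = next(gen)
log_after_first = list(log)  # body ran up to the first yield

second = next(gen)
log_after_second = list(log)  # body resumed and ran up to the second yield
```

The log stays empty after numbers() is called and grows by one entry with each next() call.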
Generators are a very powerful programming construct in Python. They let you write streaming code with fewer intermediate variables, they can save memory and CPU compared with other container objects, and they achieve similar functionality with less code. It is time to refactor your code wherever you see something like this:
def something():
    result = []
    for ... in ...:
        result.append(x)
    return result
This can be replaced with a generator function:
def iter_something():
    for ... in ...:
        yield x
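A concrete instance of this refactoring, as a sketch. The functions squares_list and iter_squares are hypothetical examples, not part of any library:

```python
def squares_list(numbers):
    # Eager version: builds the whole result list before returning.
    result = []
    for n in numbers:
        result.append(n * n)
    return result


def iter_squares(numbers):
    # Lazy version: yields one value at a time, no intermediate list.
    for n in numbers:
        yield n * n


data = [1, 2, 3, 4]
eager = squares_list(data)
lazy = list(iter_squares(data))  # wrap in list() only when a list is needed
```

Both versions produce the same values; the generator version computes each one only when the caller asks for it.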
Generator Expressions (Generator expression)
A generator expression is the generator counterpart of a list comprehension: it looks like a list comprehension, but it returns a generator object instead of a list object.
>>> a = (x*x for x in range(10))
>>> a
<generator object <genexpr> at 0x401f08>
>>> sum(a)
285
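A side-by-side sketch of the two forms, with variable names chosen for illustration. It also shows a practical consequence of a generator being an iterator: it can be consumed only once.

```python
squares_list = [x * x for x in range(10)]  # list comprehension: all values stored
squares_gen = (x * x for x in range(10))   # generator expression: values on demand

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)
second_pass = sum(squares_gen)  # the generator is already exhausted here
```

Both totals are 285, while second_pass is 0 because the generator has no values left; the list can be summed again as often as needed.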
Summary
- A container is a collection of elements; str, list, set, and dict objects are all containers. Most containers can be iterated over (for example in a for statement), so they are also iterables. Other objects, such as open files and sockets, are iterables too.
- An iterable implements the __iter__ method, which returns an iterator object.
- An iterator holds internal state that records the value to return on the next iteration, and implements the __next__ and __iter__ methods. An iterator does not load all the elements into memory at once; it produces each result only when it is needed.
- A generator is a special kind of iterator whose values are produced with yield rather than return.
Original link: https://foofish.net/iterators-vs-generators.html
English Link: https://nvie.com/posts/iterators-vs-generators/
Reference Link: https://docs.python.org/2/library/stdtypes.html#iterator-types