Python-based iterators and generators

Source: Internet
Author: User
Tags: function definition, generator, iterable

Suppose I have a list l = ['a', 'b', 'c', 'd', 'e'] and want to read its contents. What are my options?

First, I can index into it, for example l[0]. I can also use a for loop to take the values, right?

Have you noticed the subtle difference between taking values by index and taking them with a for loop?

With an index, you can take a value from any position, provided you know where it is.

With a for loop, you take every value in turn and never need to care about positions. The values simply arrive in order, and you cannot jump straight to some other position.
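
As a rough illustration of the two access styles described above (the variable names are chosen only for this sketch):

```python
l = ['a', 'b', 'c', 'd', 'e']

# Indexing: jump straight to any position, as long as you know it.
third = l[2]

# for loop: values arrive strictly in order, and no positions are needed.
collected = []
for item in l:
    collected.append(item)

print(third)      # c
print(collected)  # ['a', 'b', 'c', 'd', 'e']
```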

But have you ever wondered why a for loop can take values at all?

How does the for loop work internally?

For loops in Python

To understand what happens inside a for loop, let's start from the code.

First, we loop over a list.

    for i in [1, 2, 3, 4]:
        print(i)

The code above works fine. Now let's change it and try to loop over the number 1234.

    for i in 1234:
        print(i)

Result:

    Traceback (most recent call last):
      File "test.py", line 4, in <module>
        for i in 1234:
    TypeError: 'int' object is not iterable

Look, an error! What does it say? "TypeError: 'int' object is not iterable": the int type is not an iterable. So what is an iterable?

If the word is new to you, a dictionary will give you a literal translation, but that alone may not make the idea clear. No matter: we will analyze it step by step.

Iterables and the iterable protocol

What does iterable mean

We now have a new clue: a concept called "iterable".

Start from the error. The number 1234 cannot be used in a for loop because it is not iterable. Conversely, anything that is iterable should work in a for loop.

We know that strings, lists, tuples, dictionaries, and sets can all be used in for loops, which suggests that they are all iterable.

How can we prove it?

    # On current Python versions the abstract base classes live in collections.abc
    from collections.abc import Iterable

    l = [1, 2, 3, 4]
    t = (1, 2, 3, 4)
    d = {1: 2, 3: 4}
    s = {1, 2, 3, 4}

    print(isinstance(l, Iterable))
    print(isinstance(t, Iterable))
    print(isinstance(d, Iterable))
    print(isinstance(s, Iterable))

Putting this together with what we saw in the for loop, and reading the word literally: taking the items of a data collection out one after another is called iteration.

The iterable protocol

So far we have reasoned backwards from the result: whatever works in a for loop is iterable. But how does for know which objects are iterable?

If we write a data type ourselves and want a for loop to take its items out one by one, the type has to meet the requirement that for imposes. Such a requirement is called a "protocol".

The requirement for being iterable is called the iterable protocol. Its definition is very simple: the object implements the __iter__ method.
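
A minimal sketch of a home-made type that satisfies this requirement. The class name Basket and its contents are hypothetical, and delegating to the iterator of an internal list is just one simple way to implement __iter__:

```python
from collections.abc import Iterable


class Basket:
    """Hypothetical container; the only special method it adds is __iter__."""

    def __init__(self, *items):
        self._items = list(items)

    def __iter__(self):
        # Delegate to the iterator of the underlying list.
        return iter(self._items)


basket = Basket('apple', 'pear')
fruits = [fruit for fruit in basket]

print(isinstance(basket, Iterable))  # True
print(fruits)                        # ['apple', 'pear']
```

Because Basket implements __iter__, the for loop accepts it even though it has no index access at all.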

Now let's verify that:

    print(dir([]))
    print(dir((2, 3)))
    print(dir({1: 2}))
    print(dir({2, 3}))

Result:

    ['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

    ['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'count', 'index']

    ['__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'clear', 'copy', 'fromkeys', 'get', 'items', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values']

    ['__and__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__iand__', '__init__', '__ior__', '__isub__', '__iter__', '__ixor__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__or__', '__rand__', '__reduce__', '__reduce_ex__', '__repr__', '__ror__', '__rsub__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__xor__', 'add', 'clear', 'copy', 'difference', 'difference_update', 'discard', 'intersection', 'intersection_update', 'isdisjoint', 'issubset', 'issuperset', 'pop', 'remove', 'symmetric_difference', 'symmetric_difference_update', 'union', 'update']

To summarize what we know so far: anything a for loop can traverse is iterable, and to be iterable an object must have an __iter__ method.

Next question: what does the __iter__ method do?

    print([1, 2].__iter__())

Result:

    <list_iterator object at 0x1024784a8>

Calling the __iter__ method of the list [1, 2] gives us a list_iterator, and with it a new term: iterator.

"Iterator" is a technical term in computing.

Iterator protocol

Having settled what "iterable" means, we face a new question: what is an "iterator"?

We may not know what an iterator is yet, but we already have one in hand: the list iterator.

Let's see which methods the list iterator has that the list itself lacks. That should lift the veil on iterators.

dir([1, 2].__iter__()) lists all the methods implemented by the list iterator, and dir([]) lists all the methods implemented by the list. Both come back as lists, so to compare them more clearly we convert them to sets and take the difference.

    # print(dir([1, 2].__iter__()))
    # print(dir([]))
    print(set(dir([1, 2].__iter__())) - set(dir([])))

Result:

    {'__length_hint__', '__next__', '__setstate__'}

The list iterator has three extra methods. What does each of them do?

    iter_l = [1, 2, 3, 4, 5, 6].__iter__()
    # get the number of elements remaining in the iterator
    print(iter_l.__length_hint__())
    # set the index from which iteration will continue
    print('*', iter_l.__setstate__(4))
    # take values one at a time
    print('**', iter_l.__next__())
    print('***', iter_l.__next__())

Which of these three methods is the magic one that hands us values one at a time?

That's right! It's __next__.

Inside a for loop, the __next__ method is called to fetch the values one by one.

Now let's use the iterator's __next__ method to write a traversal that does not depend on for.

    l = [1, 2, 3, 4]
    l_iter = l.__iter__()
    item = l_iter.__next__()
    print(item)
    item = l_iter.__next__()
    print(item)
    item = l_iter.__next__()
    print(item)
    item = l_iter.__next__()
    print(item)
    item = l_iter.__next__()
    print(item)

This code raises an error. If we keep calling __next__ after the iterator has run out of elements, it raises a StopIteration exception, which tells us there are no more elements to take.

We can use exception handling to deal with this exception.

    l = [1, 2, 3, 4]
    l_iter = l.__iter__()
    while True:
        try:
            item = l_iter.__next__()
            print(item)
        except StopIteration:
            break

So now a while loop does what the for loop did. And where do the values come from? From l_iter, and l_iter is an iterator.

Iterators follow the iterator protocol: they must have both an __iter__ method and a __next__ method.
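
The protocol can also be implemented by hand. The following is a minimal sketch; the class CountUpTo is a hypothetical example, not something taken from the standard library:

```python
class CountUpTo:
    """Hypothetical iterator that produces 1, 2, ..., limit."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Signal the end of the data with StopIteration.
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current


counter = CountUpTo(3)
values = list(counter)
leftover = list(counter)

print(values)    # [1, 2, 3]
print(leftover)  # []
```

The second list() call returns nothing because the iterator keeps its position and is already exhausted.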

Paying off debts: the next and iter methods

With that, we have settled two of the methods behind iterators and generators. Finally, let's look at range(). It is certainly an iterable, but is it an iterator? Let's test it.

    from collections.abc import Iterator

    # does the object returned by range() have __next__?
    print('__next__' in dir(range(10)))
    # does the object returned by range() have __iter__?
    print('__iter__' in dir(range(10)))
    # confirm that the object returned by range() is not an iterator
    print(isinstance(range(100000000), Iterator))
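
One practical consequence of range() being an iterable rather than an iterator can be sketched as follows (variable names are illustrative): the range object can be traversed again and again, while an iterator obtained from it is used up after one pass.

```python
from collections.abc import Iterable, Iterator

r = range(3)
is_iterable = isinstance(r, Iterable)   # True
is_iterator = isinstance(r, Iterator)   # False

# The range object itself can be traversed repeatedly.
first_pass = list(r)
second_pass = list(r)

# An iterator obtained from it remembers its position and runs dry.
it = iter(r)
iterator_first = list(it)
iterator_second = list(it)

print(first_pass, second_pass)          # [0, 1, 2] [0, 1, 2]
print(iterator_first, iterator_second)  # [0, 1, 2] []
```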

Why have a for loop at all

After all this traversing of lists, clever you has spotted something, and you shout: stop teasing me! With index access I can traverse a list like this:

    l = [1, 2, 3]
    index = 0
    while index < len(l):
        print(l[index])
        index += 1
    # who needs for loops, iterables, or iterators?

True, the sequence types (strings, lists, tuples) have indexes, and accessing them this way works perfectly. But what about non-sequence types such as dictionaries, sets, and file objects? So, young man, the for loop is built on the iterator protocol, which provides one uniform way to traverse all objects: before traversing, it calls the object's __iter__ method to obtain an iterator, then iterates using the iterator protocol. That is why every such object can be traversed by a for loop, and the effect you see is always the same. This is the all-purpose for loop. Be enlightened, young man.
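
The uniform traversal described above can be sketched with one helper that works on a dictionary, a set, and a file-like object alike. The function name traverse is hypothetical, and io.StringIO stands in for a real file so that the sketch needs no file on disk:

```python
import io


def traverse(obj):
    """Collect every item of obj roughly the way a for loop would."""
    iterator = iter(obj)              # same effect as obj.__iter__()
    items = []
    while True:
        try:
            items.append(next(iterator))
        except StopIteration:
            break
    return items


print(traverse({'a': 1, 'b': 2}))        # ['a', 'b']  (a dict yields its keys)
print(sorted(traverse({3, 1, 2})))       # [1, 2, 3]
print(traverse(io.StringIO('x\ny\n')))   # ['x\n', 'y\n']
```

None of these three objects is traversed by index here; the helper relies only on iter() and next().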

A first look at generators

The iterators we know come in two kinds: those returned directly by calling a method, and those obtained from an iterable by calling its __iter__ method. The advantage of iterators is that they save memory.

If we need to save memory in other situations too, we have to write the iterator ourselves. An iterator we write ourselves this way is called a generator.

Python provides two kinds of generators:

1. Generator functions: defined like ordinary functions, but they return results with the yield statement instead of return. yield returns one result at a time and suspends the function's state in between, so that execution resumes next time from where it left off.

2. Generator expressions: similar to list comprehensions, but a generator expression returns an object that produces results on demand instead of building the whole result list at once.

Generators:

Essence: iterators (they come with the __iter__ and __next__ methods, so we do not need to implement them)

Features: lazy evaluation, defined by the developer
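
That a generator really is an iterator can be checked directly. In this small sketch the generator function two_values is a made-up example:

```python
from collections.abc import Iterator


def two_values():
    yield 'first'
    yield 'second'


gen = two_values()
has_protocol = isinstance(gen, Iterator)  # __iter__ and __next__ come for free
returns_self = iter(gen) is gen           # like any iterator, it returns itself
first = next(gen)

print(has_protocol, returns_self, first)  # True True first
```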

Generator functions

A function that contains the yield keyword is a generator function. yield can also return a value from a function, but it differs from return: executing return ends the function, whereas calling a generator function does not return a concrete value at all; it returns an iterable object (a generator). Each time you take a value from this object, the function's execution advances and produces a new value, until the function finishes.

    import time

    def genrator_fun1():
        a = 1
        print('now the variable a is defined')
        yield a
        b = 2
        print('now the variable b is defined too')
        yield b

    g1 = genrator_fun1()
    print('g1 : ', g1)     # printing g1 shows that it is a generator
    print('-' * 20)        # a fancy divider
    print(next(g1))
    time.sleep(1)          # sleep one second to watch the execution
    print(next(g1))

What are the benefits of generators? They avoid producing too much data in memory all at once.

Suppose I ask a factory to make school uniforms for students, 2 million garments in all. The factory should first accept the order and then start production; I can collect the garments one at a time, or ask the factory for a batch whenever a group of students needs them.
What it should not do is produce all 2 million garments before handing over any; by the time it finished, the students would have graduated...

    # a first look at generators, part two
    def produce():
        """Produce clothes"""
        for i in range(2000000):
            yield 'produced garment no. %s' % i

    product_g = produce()
    print(product_g.__next__())  # ask for one garment
    print(product_g.__next__())  # ask for another
    print(product_g.__next__())  # and another
    num = 0
    for i in product_g:          # ask for a batch, e.g. 5 garments
        print(i)
        num += 1
        if num == 5:
            break

    # So far we have taken 8 garments from the factory, although the production
    # function (the produce generator function) can produce 2 million in total.
    # Many garments remain; we can keep taking them, or leave them until we want them.
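
Taking a batch can also be written without a manual counter. The sketch below swaps in itertools.islice from the standard library for the counting loop; this is an alternative to the approach in the example above, not the original author's method, and the batch sizes are arbitrary:

```python
from itertools import islice


def produce():
    """Produce clothes lazily, one garment per request."""
    for i in range(2000000):
        yield 'produced garment no. %s' % i


factory = produce()
first_batch = list(islice(factory, 5))    # garments 0 to 4
second_batch = list(islice(factory, 3))   # continues with garments 5 to 7

print(first_batch[0])     # produced garment no. 0
print(second_batch[-1])   # produced garment no. 7
```

The generator keeps its position between the two calls, so the second batch starts where the first one stopped.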

More applications

Following a file as it grows:

    import time

    def tail(filename):
        f = open(filename)
        f.seek(0, 2)  # start from the end of the file
        while True:
            line = f.readline()  # read newly appended lines of text
            if not line:
                time.sleep(0.1)
                continue
            yield line

    tail_g = tail('tmp')
    for line in tail_g:
        print(line)

Computing a running average with send():

    def averager():
        total = 0.0
        count = 0
        average = None
        while True:
            term = yield average
            total += term
            count += 1
            average = total / count

    g_avg = averager()
    next(g_avg)
    print(g_avg.send(10))  # example values
    print(g_avg.send(30))
    print(g_avg.send(5))

Priming the generator automatically with a decorator:

    def init(func):  # call next() once to activate the decorated generator function
        def inner(*args, **kwargs):
            g = func(*args, **kwargs)
            next(g)
            return g
        return inner

    @init
    def averager():
        total = 0.0
        count = 0
        average = None
        while True:
            term = yield average
            total += term
            count += 1
            average = total / count

    g_avg = averager()
    # next(g_avg) is already executed inside the decorator
    print(g_avg.send(10))  # example values
    print(g_avg.send(30))
    print(g_avg.send(5))
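
The examples above rely on send(), so here is a small sketch of what happens at each step. The generator running_total is a hypothetical example, simpler than the averager:

```python
def running_total():
    total = 0
    while True:
        # yield hands `total` to the caller and pauses here; the argument of
        # the next send() call becomes the value of this yield expression.
        amount = yield total
        total += amount


acc = running_total()
primed = next(acc)        # run up to the first yield
after_10 = acc.send(10)
after_5 = acc.send(5)
acc.close()               # stop the generator explicitly

# A generator that has not reached its first yield cannot receive a value.
fresh = running_total()
try:
    fresh.send(1)
    send_failed = False
except TypeError:
    send_failed = True

print(primed, after_10, after_5, send_failed)  # 0 10 15 True
```

This is why the generator has to be advanced with next() once, by hand or through a decorator, before the first send().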

Yield from

    def gen1():
        for c in 'AB':
            yield c
        for i in range(3):
            yield i

    print(list(gen1()))

    def gen2():
        yield from 'AB'
        yield from range(3)

    print(list(gen2()))
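
yield from can delegate to any iterable, including another call of the same generator function. As an illustration (the function flatten is a hypothetical example), this makes it easy to walk nested lists:

```python
def flatten(nested):
    """Yield the items of arbitrarily nested lists, depth first."""
    for item in nested:
        if isinstance(item, list):
            # Delegate to a sub-generator for the inner list.
            yield from flatten(item)
        else:
            yield item


flat = list(flatten([1, [2, [3, 4]], 5]))
print(flat)  # [1, 2, 3, 4, 5]
```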

List comprehensions and generator expressions

    # Thanks to Brother Feng joining, Oldboy was soon on its way to going public.
    # After thinking it over, Alex decided to lay a few eggs to repay him.
    egg_list = ['egg%s' % i for i in range(10)]  # list comprehension

    # Brother Feng looked at the basket of eggs Alex had laid, held his nose, and said:
    # brother, just give me a hen and I'll have it lay eggs at home.
    laomuji = ('egg%s' % i for i in range(10))  # generator expression
    print(laomuji)
    print(next(laomuji))  # next() simply calls __next__
    print(laomuji.__next__())
    print(next(laomuji))

Summary:

1. Replacing the [] of a list comprehension with () gives a generator expression.

2. List comprehensions and generator expressions are both convenient ways to program, but generator expressions use less memory.

3. Python uses the iterator protocol not only to make the for loop more general; most built-in functions also access objects through it. For example, the built-in sum function uses the iterator protocol, and generators implement that protocol, so we can compute the sum of a series of values directly:

    sum(x ** 2 for x in range(4))

instead of needlessly building a list first:

    sum([x ** 2 for x in range(4)])

Summary of this chapter

Iterable objects:

Have the __iter__ method

Features: can be traversed by a for loop; some, such as range(), compute their values lazily

Examples: range(), str, list, tuple, dict, set

Iterators:

Have both the __iter__ method and the __next__ method

Examples: iter(range()), iter(str), iter(list), iter(tuple), iter(dict), iter(set), reversed(list_o), map(func, list_o), filter(func, list_o), file_o

Generators:

Essence: iterators, so they have the __iter__ and __next__ methods

Features: lazy evaluation, defined by the developer

Advantages of using generators:

1. Deferred computation: one result is returned at a time. Not all results are produced at once, which is very useful when processing large amounts of data.

    # list comprehension
    sum([i for i in range(100000000)])  # uses a lot of memory; the machine may easily freeze
    # generator expression
    sum(i for i in range(100000000))    # uses almost no memory

2. Better code readability
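
The memory claim in the first advantage can be checked on a smaller scale. In this sketch sys.getsizeof reports only the size of the container object itself (not the integers it refers to), and n is deliberately much smaller than in the example above so that it runs quickly:

```python
import sys

n = 100000
as_list = [i for i in range(n)]   # all values exist at once
as_gen = (i for i in range(n))    # values are produced on demand

list_bytes = sys.getsizeof(as_list)
gen_bytes = sys.getsizeof(as_gen)
total_from_gen = sum(as_gen)

print(gen_bytes < list_bytes)   # True
print(total_from_gen)           # 4999950000
```

The exact byte counts depend on the Python version and platform, so only the comparison is shown.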

Generator-related interview questions

Generators are used in many places in programming, and using them well can help us solve many complex problems.

Generators are also a favorite topic in interviews. Besides asking you to implement some feature, people have come up with plenty of tricky questions.
Let's take a look at a few.

    def demo():
        for i in range(4):
            yield i

    g = demo()
    g1 = (i for i in g)
    g2 = (i for i in g1)
    print(list(g1))
    print(list(g2))
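
One way to reason about the first question: g1 and g2 form a chain that ultimately reads from the same generator g, and a generator can be consumed only once. The sketch below repeats the question's code and stores the two results in variables (first and second are names added for this illustration):

```python
def demo():
    for i in range(4):
        yield i


g = demo()
g1 = (i for i in g)
g2 = (i for i in g1)

first = list(g1)    # drains g1, and through it g
second = list(g2)   # g2 reads from g1, which is already exhausted

print(first)    # [0, 1, 2, 3]
print(second)   # []
```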

    def add(n, i):
        return n + i

    def test():
        for i in range(4):
            yield i

    g = test()
    for n in [1, 10]:
        g = (add(n, i) for i in g)
    print(list(g))
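
For the second question, the key point is that a generator expression does no work until it is consumed, and it looks up n only at that moment. By then the loop has finished and n is 10 for both layers. The sketch below shows this, followed by a variant that fixes n per layer through a helper function; add_to_all is a hypothetical name introduced for this illustration:

```python
def add(n, i):
    return n + i


def test():
    for i in range(4):
        yield i


# Version from the question: both layers see n == 10 when list() runs.
g = test()
for n in [1, 10]:
    g = (add(n, i) for i in g)
late = list(g)


def add_to_all(n, source):
    # n is a parameter here, so each layer keeps its own value.
    return (add(n, i) for i in source)


g = test()
for n in [1, 10]:
    g = add_to_all(n, g)
early = list(g)

print(late)    # [20, 21, 22, 23]
print(early)   # [11, 12, 13, 14]
```
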

    import os

    def init(func):
        def wrapper(*args, **kwargs):
            g = func(*args, **kwargs)
            next(g)
            return g
        return wrapper

    @init
    def list_files(target):
        while 1:
            dir_to_search = yield
            for top_dir, dir, files in os.walk(dir_to_search):
                for file in files:
                    target.send(os.path.join(top_dir, file))

    @init
    def opener(target):
        while 1:
            file = yield
            fn = open(file)
            target.send((file, fn))

    @init
    def cat(target):
        while 1:
            file, fn = yield
            for line in fn:
                target.send((file, line))

    @init
    def grep(pattern, target):
        while 1:
            file, line = yield
            if pattern in line:
                target.send(file)

    @init
    def printer():
        while 1:
            file = yield
            if file:
                print(file)

    g = list_files(opener(cat(grep('python', printer()))))
    g.send('/test1')
    # application: grep -rl 'python' /dir
