Examples of iterators, generators, and list comprehensions in Python

Source: Internet
Author: User
Iterators: a first look

As mentioned in the previous chapter, a for loop works on any object that can be iterated over. In fact, this is true of all the iteration tools in Python that scan objects from left to right, including for loops, list comprehensions, in membership tests, and the map built-in function.

The concept of an "iterable object" is relatively new in Python. It is essentially a generalization of the notion of a sequence: an object is considered iterable if it is either a physically stored sequence or an object that produces one result at a time in the context of an iteration tool.

>> file iterators
A file, which is a built-in data type, is also iterable. It has a method named __next__ that returns the next line of the file each time it is called. When the end of the file is reached, __next__ raises the built-in StopIteration exception instead of returning an empty string.

This interface is what Python calls the iteration protocol: an object with a __next__ method advances to the next result and raises StopIteration at the end of the series of results. Any such object is considered iterable and can be traversed with a for loop or another iteration tool, because all iteration tools work internally by calling __next__ on each iteration and catching the StopIteration exception to determine when to exit.
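
The protocol described above can be tried out by hand. The following is only a small sketch: it uses io.StringIO as an in-memory stand-in for a real file (an assumption made so the example needs no file on disk), and the variable names are illustrative.

```python
import io

# In-memory stand-in for a real text file (an assumption for this sketch).
f = io.StringIO("first line\nsecond line\n")

same_object = iter(f) is f   # a file object is its own iterator
first = f.__next__()         # fetch the next line via the method
second = next(f)             # the next() built-in calls __next__ for us

try:
    next(f)
    reached_end = False
except StopIteration:
    reached_end = True       # the end of the data is signalled by StopIteration
```

A real file opened with open behaves the same way with respect to __next__ and StopIteration.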

for line in open('script.py'):
    print(line.upper(), end='')

The code above is an example of file iteration. It is generally the best way to read a text file line by line, with three main advantages: it is the simplest to write, it runs quickly, and it is the best in terms of memory usage.

The alternative notation is:

for line in open('script.py').readlines():
    print(line.upper(), end='')

This version reads the entire file into memory at once; if the file is very large, it may consume too much memory.

>> manual iteration: iter and next
To support manual iteration code (with less typing), Python 3.0 also provides a built-in function, next, that automatically calls an object's __next__ method. Given an iterator object X, the call next(X) is the same as X.__next__(), but noticeably simpler.

There is one more detail of the iteration protocol worth noting. When a for loop begins, it first passes the iterable object to the iter built-in function to obtain an iterator; the object that iter returns has the required __next__ method. This initial step is not needed for files, because a file object is its own iterator, but it is needed for some other built-in data types.

Lists, as well as many other built-in objects, are not their own iterators, because they support multiple open iterations. For such objects, we must call iter to start iterating:

L = [1, 2, 3]
iter(L) is L      # False: a list is not its own iterator
L.__next__()      # raises AttributeError: lists have no __next__ method
I = iter(L)
I.__next__()      # 1
I.__next__()      # 2

Although Python's iteration tools call these functions (iter and __next__) automatically, we can also use them to apply the iteration protocol manually.
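
As an illustration of what a for loop does behind the scenes, here is a rough sketch of the same steps written by hand. The function name manual_for is hypothetical and exists only for this example.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next()."""
    results = []
    iterator = iter(iterable)        # step 1: obtain an iterator
    while True:
        try:
            item = next(iterator)    # step 2: ask for the next result
        except StopIteration:        # step 3: leave when results run out
            break
        results.append(item)
    return results

collected = manual_for([1, 2, 3])
```

The same function works unchanged on strings, files, and any other iterable, which is the point of the protocol.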

List comprehensions: a first look

>> list comprehension basics

L = [1, 2, 3, 4, 5]
L = [x + 10 for x in L]

List comprehensions are written in square brackets because they are ultimately a way to construct a new list. They begin with an arbitrary expression that we make up, which uses a loop variable that we make up (x + 10). That is followed by what looks like the header of a for loop, which names the loop variable and an iterable object (for x in L).

To run the expression, Python iterates over L inside the interpreter, assigning x to each item in turn, and collects the results of running the expression on the left for each item. The result list we get back is exactly what the list comprehension says: a new list containing x + 10 for every x in L.

Strictly speaking, list comprehensions are never required, because anything they do can also be done with a for loop. However, list comprehensions often run faster than manual for loop statements, because their iterations are performed at C language speed inside the interpreter rather than in manual Python code. Especially for large data sets, this is a major performance advantage of list comprehensions.

Consider a list comprehension whenever you want to perform an operation on each item in a sequence.
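
To make the equivalence with a for loop concrete, the sketch below builds the same result both ways. The names res and res2 are illustrative only.

```python
L = [1, 2, 3, 4, 5]

# Statement-based version: build the result list by hand.
res = []
for x in L:
    res.append(x + 10)

# Comprehension version: the same result in a single expression.
res2 = [x + 10 for x in L]
```

Both names end up bound to equal lists; the comprehension simply says the same thing more compactly.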

>> extended list comprehension syntax
List comprehensions also have more advanced uses. As one particularly useful extension, the for loop nested in the expression can have an associated if clause that filters out result items for which the test is not true.

lines = [line.rstrip() for line in open('script.py') if line[0] == 'P']

This if clause checks each line read from the file to see whether its first character is P; if not, the line is omitted from the result list.
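
The filtering step can be tried without a file on disk. This sketch assumes an in-memory io.StringIO object in place of the opened file, and the sample text is made up for the example.

```python
import io

# In-memory stand-in for the opened file (an assumption for this sketch).
source = io.StringIO("Python\nperl\nPascal  \nruby\n")

# Keep only lines whose first character is 'P', stripping trailing whitespace.
lines = [line.rstrip() for line in source if line[0] == 'P']
```

Indexing line[0] is safe here because every line read from a file-like object contains at least its newline character.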

List comprehensions can become even more complex if we want them to: their full syntax allows any number of for clauses, each of which has an optional associated if clause.

New iterables in Python 3.0

A fundamental change in Python 3.0 is that it emphasizes iteration more strongly than Python 2.x. In addition to the iteration support of built-in types such as files and dictionaries, the dictionary methods keys, values, and items return iterable objects in Python 3.0. The benefit of returning an iterable object instead of a result list is that it saves memory.

>> multiple iterators vs single iterators
Multiple iterators: the object supports several iterators that each keep their own position in its results.
Single iterator: the object supports only one active iteration, and once its results have been traversed it is exhausted.
Multiple iterators are usually supported by returning a new object from each iter call; a single iterator typically means that the object returns itself.
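
The difference can be observed with two built-ins. As a small sketch (variable names are illustrative): a range object hands out a new iterator on every iter call, while a map object returns itself.

```python
# range supports multiple iterators: each iter() call gives a fresh one.
R = range(3)
it1, it2 = iter(R), iter(R)
r_first = next(it1)     # 0
r_second = next(it1)    # 1
r_other = next(it2)     # 0: it2 keeps its own position

# map is a single iterator: iter() returns the map object itself.
M = map(abs, (-1, 0, 1))
m1, m2 = iter(M), iter(M)
shared = m1 is m2       # True: both names refer to the same object
m_first = next(m1)      # 1
m_other = next(m2)      # 0: the position is shared, so this is the second item
```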

>> Dictionary View iterators
In Python 3.0, the keys, values, and items methods of a dictionary return iterable view objects that produce one result item at a time, instead of building the full result list in memory at once. View items keep the same physical order as the dictionary and reflect later changes made to the underlying dictionary.

As with any iterable, we can always force a real list to be built by passing a Python 3.0 dictionary view to the list built-in function. However, this is usually not necessary.

In addition, Python 3.0 dictionaries still have their own iterator, which returns successive keys. Therefore, there is usually no need to call keys directly in this context:

for key in D:
    print(key, end=' ')
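
A short sketch of how a view behaves, assuming a Python version in which dictionaries keep insertion order (3.7 or later). The dictionary contents are made up for the example.

```python
D = dict(a=1, b=2, c=3)
K = D.keys()

is_iterator = hasattr(K, '__next__')   # False: a view is iterable, not an iterator
first_key = next(iter(K))              # iter() is needed before calling next()

keys_before = list(K)                  # force a real list from the view
del D['b']                             # change the underlying dictionary
keys_after = list(K)                   # the same view reflects the change
```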

>> list comprehensions and map
A list comprehension applies an arbitrary expression to each value in a sequence, collects the results in a new list, and returns it. Syntactically, list comprehensions are enclosed in square brackets (as a reminder that they construct a list). In their simple form, we write an expression inside the brackets, and Python collects the results of applying that expression on each iteration of the loop. For example, to collect the ASCII codes of all the characters in a string, we can write:

# for loop approach
res = []
for x in 'spam':
    res.append(ord(x))

# map function approach
res = list(map(ord, 'spam'))

# list comprehension approach
res = [ord(x) for x in 'spam']

>> adding tests and nested loops
List comprehensions are more general than shown so far. For example, we can write an if clause after the for to add selection logic.

# list comprehension
[x ** 2 for x in range(10) if x % 2 == 0]

# map and filter
list(map((lambda x: x ** 2), filter((lambda x: x % 2 == 0), range(10))))

Both statements above collect the squares of the even numbers from 0 to 9. Clearly, for the same job, the list comprehension is much simpler to write.

List comprehensions can be more general still. You can write any number of nested for loops in a list comprehension, and each may have an optional associated if test. The general structure is as follows:

expression for target1 in iterable1 [if condition1]
           for target2 in iterable2 [if condition2] ...
           for targetN in iterableN [if conditionN]

When for clauses are nested within a list comprehension, they work like equivalent nested for loop statements. For example, the following code:

res = [x + y for x in [0, 1, 2] for y in [100, 200, 300]]

has the same effect as this longer version:

res = []
for x in [0, 1, 2]:
    for y in [100, 200, 300]:
        res.append(x + y)
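
The general structure with an if test attached to each for clause can be sketched as follows. The ranges and conditions are arbitrary choices for illustration.

```python
# Comprehension with two for clauses, each with its own if test.
pairs = [(x, y) for x in range(5) if x % 2 == 0
                for y in range(5) if y % 2 == 1]

# Equivalent statement-based code: each if nests under its own for.
pairs_loop = []
for x in range(5):
    if x % 2 == 0:
        for y in range(5):
            if y % 2 == 1:
                pairs_loop.append((x, y))
```

Reading the comprehension left to right gives the nesting order of the loops, from outermost to innermost.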

>> list comprehensions and matrices
A basic way to code a matrix in Python is with a nested list structure. For example, the following code defines two 3x3 matrices:

M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

N = [[2, 2, 2],
     [3, 3, 3],
     [4, 4, 4]]

List comprehensions are a powerful tool for processing such structures, because they automatically scan rows and columns for us.

Extract all the elements of the second column:

[row[1] for row in M]             # [2, 5, 8]
[M[row][1] for row in (0, 1, 2)]  # [2, 5, 8]

Extract the elements on the diagonal:

[M[i][i] for i in range(len(M))]  # [1, 5, 9]

We can also combine multiple matrices. The following code creates a flat list that contains the pairwise products of the two matrices' elements:

[M[row][col] * N[row][col] for row in range(3) for col in range(3)]  # [2, 4, 6, 12, 15, 18, 28, 32, 36]

The following code is a little more complicated; it builds a nested list with the same values as above:

[[M[row][col] * N[row][col] for col in range(3)] for row in range(3)]  # [[2, 4, 6], [12, 15, 18], [28, 32, 36]]

This last version is harder to understand. It is equivalent to the following statement-based code:

res = []
for row in range(3):
    tmp = []
    for col in range(3):
        tmp.append(M[row][col] * N[row][col])
    res.append(tmp)

>> understanding list comprehensions
Based on tests run under current Python versions, map calls are roughly twice as fast as equivalent for loops, and list comprehensions are usually slightly faster than map calls. The speed gap comes from the underlying implementation: map and list comprehensions run at C language speed inside the interpreter, which is much faster than stepping through Python for loop code within the PVM (Python virtual machine).
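
Relative speeds depend on the Python version, the machine, and the operation being applied, so it is worth measuring rather than assuming. The following is one possible way to do so with the standard timeit module; the function names, data size, and repetition count are arbitrary choices for this sketch, and no particular result is implied.

```python
import timeit

data = list(range(1000))

def with_loop():
    res = []
    for x in data:
        res.append(abs(x))
    return res

def with_map():
    return list(map(abs, data))

def with_comprehension():
    return [abs(x) for x in data]

# Time 200 calls of each variant and report total seconds per variant.
timings = {
    func.__name__: timeit.timeit(func, number=200)
    for func in (with_loop, with_map, with_comprehension)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f}s")
```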

Iterators revisited: generators

Today's Python offers more support for delayed evaluation: it provides tools that produce results only when needed, rather than all at once. In particular, two language constructs delay the creation of results whenever possible.

Generator functions: coded as normal def statements, but they use yield statements to return one result at a time, suspending and resuming their state between each.
Generator expressions: similar to the list comprehensions of the previous section, but they return an object that produces results on demand instead of building a result list.
Because neither construct builds a list all at once, they save memory and allow computation time to be spread across result requests.

>> generator functions: yield vs return
The functions we have written so far are normal functions that accept input arguments and immediately send back a single result. It is also possible, however, to write functions that send back a value and are later resumed from where they left off. Such functions are called generator functions because they generate a sequence of values over time.

In most respects, generator functions are like normal functions, and they are in fact coded with normal def statements. When created, however, they automatically implement the iteration protocol so that they can appear in iteration contexts.

State suspension

Unlike normal functions that return a value and exit, generator functions automatically suspend their execution at the point where a value is generated and resume it later. Because of that, they are often a useful alternative both to computing an entire series of values up front and to manually saving and restoring state in classes. Because the state that generator functions retain when they are suspended includes their entire local scope, their local variables keep their information and make it available when the function is resumed.

The chief difference between generator functions and normal functions is that a generator yields a value rather than returning one. The yield statement suspends the function and sends a value back to the caller, but retains enough state to let the function resume from where it left off. When resumed, the function continues execution immediately after the last yield. From the function's perspective, this allows its code to produce a series of values over time, rather than computing them all at once and sending them back in something like a list.
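
State retention is easiest to see with a generator whose local variables matter between results. The Fibonacci generator below is a hypothetical example chosen for this sketch, not code from the source.

```python
def fibonacci():
    """Yield Fibonacci numbers indefinitely, one per request."""
    a, b = 0, 1
    while True:
        yield a              # suspend here; a and b are kept until resumed
        a, b = b, a + b      # execution continues here on the next request

gen = fibonacci()
first_five = [next(gen) for _ in range(5)]   # [0, 1, 1, 2, 3]
next_three = [next(gen) for _ in range(3)]   # continues where it left off
```

No list of all values is ever built, and the function never has to save a and b anywhere explicitly; the suspended local scope does that.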

Iteration protocol integration

An iterator object defines a __next__ method that either returns the next item in the iteration or raises the special StopIteration exception to end the iteration. An object's iterator is fetched with the iter built-in function.

Python's for loops and other iteration contexts use this iteration protocol to step through a sequence or value generator if the protocol is supported; if not, iteration falls back on repeatedly indexing the sequence instead.

To support this protocol, functions that contain a yield statement are compiled specially as generators. When called, they return a generator object that supports the iteration interface with an automatically created method named __next__ to resume execution. Generator functions may also contain a return statement which, like falling off the end of the def block, simply terminates the generation of values.
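
These points can be checked directly on a generator object. The helper up_to_first_negative is hypothetical and exists only to show a return statement ending the generation early.

```python
def up_to_first_negative(values):
    """Yield items until a negative one is seen (hypothetical helper)."""
    for v in values:
        if v < 0:
            return           # a bare return ends the generation of values
        yield v

g = up_to_first_negative([3, 1, -4, 1, 5])
is_own_iterator = iter(g) is g       # a generator object is its own iterator
has_next = hasattr(g, '__next__')    # __next__ was created automatically
items = list(g)                      # [3, 1]

try:
    next(g)
    finished = False
except StopIteration:
    finished = True                  # an exhausted generator raises StopIteration
```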

Generator functions in action

def gensquares(n):
    for i in range(n):
        yield i ** 2

This function yields a value, and so returns to its caller, each time through the loop. When it is resumed, its prior state is restored and control picks up again immediately after the yield statement. This allows the function to avoid doing all the work up front, which is especially important when the result list is large or when each result takes a long time to compute. Generators spread the time required to produce a series of values across loop iterations.
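
A brief usage sketch for gensquares, first driven by an iteration tool and then stepped manually. The function is repeated here so the block is self-contained.

```python
def gensquares(n):
    for i in range(n):
        yield i ** 2

# Driven automatically by an iteration tool.
squares = list(gensquares(5))    # [0, 1, 4, 9, 16]

# Driven manually, one value per request.
g = gensquares(3)
stepped = [next(g), next(g), next(g)]

try:
    next(g)
    exhausted = False
except StopIteration:
    exhausted = True             # no values are left after the third request
```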

Extended generator function protocol: send and next

In Python 2.5, a send method was added to the generator function protocol. Like the __next__ method, send advances to the next item in the series of results, but it also gives the caller a way to communicate with the generator and so affect its operation.

def gen():
    for i in range(10):
        x = yield i
        print(x)

G = gen()
next(G)       # returns 0
G.send(77)    # prints 77, returns 1
G.send(88)    # prints 88, returns 2
next(G)       # prints None, returns 3

The code above is harder to follow, and the translated edition of the book explains it poorly. After looking up some material online and adding my own understanding, I believe it runs as follows. Calling gen() creates a generator object, which is assigned to G. The first next(G) call starts the generator and runs it up to the first yield, which produces the first value, 0, so the call returns 0; the function is then suspended at the yield with its state saved, waiting to be resumed. Next, G.send(77) resumes the generator: 77 becomes the value of the suspended yield expression, is assigned to x, and is printed. The function continues until it reaches yield a second time, so the send call returns 1 and the function is suspended again. G.send(88) works the same way: 88 becomes the value of the yield expression, is assigned to x and printed, and the function runs on to the third yield, so the call returns 2. Finally, next(G) is called again. A plain next call effectively sends None, so x is None: the generator prints None and the call returns 3.

Note that next(G) and G.send(None) are equivalent. The send method is what lets us communicate with a generator.
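
As a further illustration of communicating with a generator, here is a sketch of a running-average generator that receives numbers through send. The function running_average is hypothetical and not part of the source.

```python
def running_average():
    """Receive numbers via send() and yield the average so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average    # hand back the average, wait for a number
        total += value
        count += 1
        average = total / count

avg = running_average()
primed = next(avg)     # advance to the first yield; nothing to report yet
a1 = avg.send(10)      # 10.0
a2 = avg.send(20)      # 15.0
a3 = avg.send(30)      # 20.0
```

The initial next call is needed because a value can only be sent to a generator that is already suspended at a yield.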

>> generator expressions: iterators meet list comprehensions
In recent versions of Python, the notions of iterators and list comprehensions are combined in a newer language feature, the generator expression. Syntactically, generator expressions look just like normal list comprehensions, but they are enclosed in parentheses instead of square brackets.

[x ** 2 for x in range(4)]  # List comprehension: builds a list
(x ** 2 for x in range(4))  # Generator expression: makes an iterable

Operationally, however, generator expressions are quite different: instead of building the result in memory, they return a generator object that supports the iteration protocol.

Generator expressions are best viewed as a memory-space optimization: unlike a list comprehension in square brackets, they do not require the entire result list to be constructed at once. They may also run slightly slower in practice, so they are probably best used only for operations with very large result sets.
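
The memory difference can be observed with sys.getsizeof, which reports the size of the container object itself. This is only a rough sketch; exact byte counts vary by Python version and platform, so none are shown.

```python
import sys

n = 100_000

as_list = [x ** 2 for x in range(n)]    # builds all n results in memory
as_gen = (x ** 2 for x in range(n))     # builds nothing until asked

list_size = sys.getsizeof(as_list)      # grows with n
gen_size = sys.getsizeof(as_gen)        # small and independent of n

# A generator expression can be passed straight to a consuming function;
# the extra parentheses are not needed when it is the sole argument.
total = sum(x ** 2 for x in range(n))
```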

>> generator functions vs generator expressions
The generator objects produced by generator functions and generator expressions are their own iterators, and thus support just one active iteration; we cannot have multiple iterators positioned at different places in the result set.

Python 3.0 comprehension syntax generalized

We have focused on list comprehensions and generators in this chapter, but keep in mind that two other comprehension forms are available in Python 3.0: set comprehensions and dictionary comprehensions.
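
A small sketch of what a single active iteration means in practice, using a generator expression; the names are illustrative.

```python
G = (c * 2 for c in 'abc')

I1 = iter(G)
I2 = iter(G)
same = I1 is I2 is G    # True: every iter() call returns the generator itself

a = next(I1)            # 'aa'
b = next(I2)            # 'bb': I2 shares the position of I1
rest = list(G)          # ['cc']: only what has not been consumed yet
again = list(G)         # []: once exhausted, nothing more is produced
```

To scan the results again, a new generator has to be created.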

[x * x for x in range(10)]     # List comprehension: builds a list
(x * x for x in range(10))     # Generator expression: produces items
{x * x for x in range(10)}     # Set comprehension: new in 3.0
{x: x * x for x in range(10)}  # Dictionary comprehension: new in 3.0

Note that the last two expressions above build their entire objects at once; they have no notion of producing results on demand.

Summary

List comprehensions, set comprehensions, and dictionary comprehensions all build their objects at once and return them directly.
Generator functions and generator expressions do not build their results in memory; they return a generator object that supports the iteration protocol and produces results as the caller requests them.
Set comprehensions and dictionary comprehensions also support nested for clauses and associated if clauses for filtering items out of the result.
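
The last point can be sketched briefly; the ranges and conditions below are arbitrary choices for illustration.

```python
# Set comprehension with an if clause: squares of the even numbers.
even_squares = {x * x for x in range(10) if x % 2 == 0}

# Dictionary comprehension with the same filter.
square_of = {x: x * x for x in range(10) if x % 2 == 0}

# Nested for clauses work here as well.
products = {(x, y): x * y for x in range(3) for y in range(3) if x < y}
```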

Function gotchas

>> local names are detected statically
By default, Python classifies names assigned in a function as local variables; they live in the function's scope and exist only while the function is running. Python detects locals statically, when it compiles the def's code, rather than by noticing assignments as they happen at run time. A name that is assigned anywhere in a function is treated as local throughout the entire function, not just in the statements that follow the assignment.
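
The practical consequence is that using a name before assigning it in the same function fails, even if a global of the same name exists. The names X and selector below are illustrative only.

```python
X = 99

def selector():
    print(X)    # X is local in this function, and not yet assigned here
    X = 88      # this assignment is what makes X local everywhere in selector

try:
    selector()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__    # 'UnboundLocalError'
```

If the intent is to use or change the global X, a global X declaration inside the function makes that explicit.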

>> functions without a return statement
In Python functions, return (and yield) statements are optional. Technically, all functions return a value: if no return statement is provided, the function automatically returns the None object.
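
A minimal sketch of this behavior; the function name proc is hypothetical.

```python
def proc(x):
    print(x)            # no return statement anywhere in the body

result = proc('testing')
is_none = result is None    # True: the call still produced a value, None
```

This matters most when the result of such a call is assigned or used in an expression by mistake.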
