Tutorial on using the combinatorial functions in Python's itertools module


Understanding New Concepts

The idea of iterators was introduced in Python 2.2. Well, that's not quite true; hints of the idea were already present in the older function xrange() and the file method .xreadlines(). By introducing the yield keyword, Python 2.2 generalized the concept across many aspects of its internals and made programming custom iterators much simpler (the presence of yield turns a function into a generator, and a generator in turn returns an iterator).

The motivation behind iterators is twofold. Treating data as a sequence is often the most straightforward approach, and a sequence that is processed in linear order often does not need to exist all at once.

The x*() precursors provide clear examples of these principles. If you want to perform an operation thousands of times, your program may take a while to run, but it generally does not need a large amount of memory. Similarly, for many types of files, you can process them one line at a time without storing the entire file in memory. All sorts of other sequences are best treated lazily as well, whether they depend on data arriving incrementally over a channel or on a computation that proceeds step by step.

Most of the time, an iterator is used within a for loop, just like a true sequence. Iterators provide a .next() method that can be invoked explicitly, but 99% of the time, what you will see is along these lines:

for x in iterator:
  do_something_with(x)

The loop terminates when the behind-the-scenes call to iterator.next() raises a StopIteration exception. By the way, a true sequence can be turned into an iterator by calling iter(seq); this does not save any memory, but it is useful in the functions discussed below.
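
What the for loop does behind the scenes can be spelled out by hand. The sketch below assumes Python 3, where the explicit call is the built-in next(it) rather than the it.next() method described above; the helper name manual_loop is made up for illustration.

```python
def manual_loop(seq):
    """Collect the items of seq the way a for loop would, step by step."""
    it = iter(seq)              # turn the true sequence into an iterator
    seen = []
    while True:
        try:
            item = next(it)     # Python 3 spelling of it.next()
        except StopIteration:   # raised once the iterator is exhausted
            break
        seen.append(item)
    return seen

print(manual_loop("abc"))   # ['a', 'b', 'c']
```

A plain for loop does exactly this bookkeeping for you, which is why the explicit form is rarely needed.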

Python's evolving split personality

Python's attitude toward functional programming (FP) is somewhat contradictory. On the one hand, many Python developers disparage the traditional FP functions map(), filter(), and reduce(), and usually recommend "list comprehensions" in their place. But the whole of the itertools module is composed of functions of the very same sort, except that they operate on "lazy sequences" (iterators) rather than on complete sequences (lists, tuples). Furthermore, Python 2.3 has no syntax for "iterator comprehensions," which would seem to have the same motivation as list comprehensions.

I suspect Python will eventually grow some form of iterator comprehension, but that depends on finding a suitably natural syntax for it. In the meantime, the itertools module gives us a number of useful combinatorial functions. Broadly, each of these functions accepts some parameters (usually including some underlying iterators) and returns a new iterator. For example, the functions ifilter(), imap(), and izip() are directly equivalent to the built-in functions whose names lack the initial i.
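
To make the "iterators in, iterator out" pattern concrete, here is a small sketch. It assumes Python 3, where the built-ins map(), filter(), and zip() are themselves lazy and so play the roles of imap(), ifilter(), and izip(); the names squares, evens, and pairs are illustrative only.

```python
from itertools import count

# count() never ends, so each step below only works because it is lazy.
squares = map(lambda n: n * n, count(1))        # role of imap()
evens = filter(lambda n: n % 2 == 0, squares)   # role of ifilter()
pairs = zip("abc", evens)                       # role of izip()

# Nothing has been computed yet; values are pulled only on demand.
first_pairs = list(pairs)
print(first_pairs)   # [('a', 4), ('b', 16), ('c', 36)]
```

Each stage wraps the previous one, and the finite string "abc" is what finally bounds the result.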

Missing equivalents

There is no ireduce() in itertools, although it might seem natural to have one; a possible Python implementation is:
Listing 1. Sample implementation of ireduce()

def ireduce(func, iterable, init=None):
  # Lazily yield each intermediate result of a reduce()
  if init is None:
    iterable = iter(iterable)
    curr = iterable.next()    # seed with the first element
  else:
    curr = init
  for x in iterable:
    curr = func(curr, x)
    yield curr

The use case for ireduce() is similar to that for reduce(). For example, suppose you want to add up a list of numbers contained in a large file, but stop when a condition is met. You could monitor the running total with the following code:
Listing 2. Adding and totaling a list of numbers

from operator import add
from itertools import *
nums = open('numbers')
for tot in takewhile(condition, ireduce(add, imap(int, nums))):
  print "total =", tot
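
Listing 2 depends on a file and a condition that the article leaves unspecified. The self-contained sketch below ports ireduce() to Python 3 (next(iterable) instead of iterable.next(), map() instead of imap()) and substitutes a list of strings for the open file; the under_limit condition and the sample data are made-up stand-ins.

```python
from itertools import takewhile
from operator import add

def ireduce(func, iterable, init=None):
    """Lazily yield each intermediate result of a reduce()."""
    iterable = iter(iterable)
    if init is None:
        curr = next(iterable)   # seed with the first element
    else:
        curr = init
    for x in iterable:
        curr = func(curr, x)
        yield curr

lines = ["5\n", "10\n", "20\n", "40\n", "80\n"]   # stands in for open('numbers')
under_limit = lambda tot: tot < 50                 # hypothetical condition

totals = list(takewhile(under_limit, ireduce(add, map(int, lines))))
print(totals)   # [15, 35]
```

Note that when no init is given, the first element only seeds the total and is not itself yielded. The last two lines of the input are never converted, because takewhile() stops pulling values once the total reaches 75.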

A more realistic example would probably be something like applying a stream of events to a stateful object, such as a GUI widget. But even the simple example above shows the FP flavor of iterator combinators.

Basic iterator factories

All the functions in itertools could easily be implemented in pure Python as generators. The main point of including the module in Python 2.3+ is to provide standard behavior and names for some useful functions. While programmers could write their own versions, each would in practice create a slightly incompatible variant. The other point is to implement the iterator combinators in efficient C code. Using the itertools functions will be a bit faster than writing your own combinators. The standard documentation shows an equivalent pure-Python implementation for each itertools function, so there is no need to repeat them in this article.

The functions in itertools are basic enough, and distinctly named enough, that it probably makes sense to import all the names from the module. The function enumerate(), for example, might sensibly have lived in itertools, but is instead a built-in function in Python 2.3+. Notably, you can easily express enumerate() in terms of itertools functions:

from itertools import *
enumerate = lambda iterable: izip(count(), iterable)
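
The same equivalence can be checked in Python 3, assuming zip() in place of izip(). The function name my_enumerate and its start parameter are illustrative additions, chosen so that the built-in is not shadowed.

```python
from itertools import count

def my_enumerate(iterable, start=0):
    """Pair each item with a running index, like the built-in enumerate()."""
    return zip(count(start), iterable)

letters = ["a", "b", "c"]
print(list(my_enumerate(letters)))   # [(0, 'a'), (1, 'b'), (2, 'c')]
```

The unbounded count() is harmless here because zip() stops as soon as the shorter input runs out.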

Let's look first at the few itertools functions that do not use other iterators as a basis, but simply create iterators from scratch. times() returns an iterator that yields the same object multiple times. In itself this capability is only moderately useful, but it makes a good substitute for the overused combination of xrange() and an index variable when all you want is to repeat an action. That is, rather than:

for i in xrange(1000):
  do_something()

You can now use the more neutral:

for _ in times(1000):
  do_something()

If times() is given only one argument, it simply yields None repeatedly. The function repeat() is similar to times(), but returns the same object without bound. This iterator is useful either in a loop that has an independent break condition or within combinators like izip() and imap().
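
In the itertools that ships with current Python, the bounded case is covered by repeat() itself, which accepts an optional count. A brief sketch, assuming Python 3 (zip() in place of izip()); the variable names are illustrative:

```python
from itertools import repeat

calls = []
for _ in repeat(None, 3):        # bounded: yields None exactly three times
    calls.append("did something")

# Unbounded repeat() is safe inside zip(), which stops with the shorter input.
tagged = list(zip(repeat("row"), [1, 2, 3]))
print(len(calls), tagged)   # 3 [('row', 1), ('row', 2), ('row', 3)]
```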

The function count() is a bit like a cross between repeat() and xrange(). count() returns consecutive integers without bound (starting at an optional argument). However, given that count() does not currently support automatic promotion to longs on overflow, you might just as well use xrange(n, sys.maxint); it is not literally unbounded, but for most purposes it amounts to the same thing. Like repeat(), count() is particularly useful inside other iterator combinators.
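
A short sketch of both uses of count(), assuming Python 3, where integers have arbitrary precision and the overflow caveat above no longer applies. The names window and numbered are illustrative.

```python
from itertools import count, islice

# count() is unbounded, so take a finite window with islice().
window = list(islice(count(10), 4))
print(window)     # [10, 11, 12, 13]

# Inside another combinator, the finite input bounds the result.
numbered = list(zip(count(1), ["alpha", "beta"]))
print(numbered)   # [(1, 'alpha'), (2, 'beta')]
```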

Combinatorial functions

We have already mentioned in passing a few of the actual combinatorial functions in itertools. ifilter(), izip(), and imap() act just as you would expect from their corresponding sequence functions. ifilterfalse() is a convenience, so that you do not need to negate a predicate function in a lambda or def (and it also saves significant function call overhead). But functionally, you could define ifilterfalse() (approximately, ignoring the None predicate) as:

def ifilterfalse(predicate, iterable):
  # Keep only the items for which predicate(item) is false
  return ifilter(lambda x: not predicate(x), iterable)
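
The definition can be checked against the library version. This sketch assumes Python 3, where ifilter() is the built-in filter() and ifilterfalse() is itertools.filterfalse(); my_filterfalse and is_even are illustrative names.

```python
from itertools import filterfalse

def my_filterfalse(predicate, iterable):
    """Yield the items for which predicate(item) is false."""
    return filter(lambda x: not predicate(x), iterable)

is_even = lambda n: n % 2 == 0
data = range(10)
print(list(my_filterfalse(is_even, data)))   # [1, 3, 5, 7, 9]
```

The hand-written version pays for an extra lambda call per element, which is the overhead the built-in combinator avoids.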

The functions dropwhile() and takewhile() partition an iterator according to a predicate. dropwhile() ignores the yielded elements for as long as the predicate holds, while takewhile() yields elements for as long as the predicate holds and then terminates. dropwhile() skips an indefinite number of initial elements of an iterator, so it might not begin yielding until after some delay. takewhile() begins yielding immediately, but terminates the iterator as soon as the predicate passed in becomes false.
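
The split point is easiest to see on a small input. In this sketch the data and the predicate small are invented for illustration; the two functions exist under the same names in Python 3.

```python
from itertools import dropwhile, takewhile

readings = [1, 3, 5, 12, 4, 2, 20]
small = lambda n: n < 10

head = list(takewhile(small, readings))   # stops at the first failure (12)
tail = list(dropwhile(small, readings))   # starts at the first failure (12)
print(head)   # [1, 3, 5]
print(tail)   # [12, 4, 2, 20]
```

The predicate is consulted only until it first fails: the later values 4 and 2 stay in the tail even though they would satisfy it.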

The function islice() is basically the iterator version of list slicing. You can specify a start, stop, and step, just as with regular slices. If a start is given, elements are discarded until the passed iterator reaches the element of interest. This is another case where I think Python could be improved: the best thing would be for iterators simply to recognize slices, as lists do (as a synonym for what islice() does).
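
A minimal sketch of the correspondence between islice() and ordinary slice notation, assuming Python 3. The generator evens() is a hypothetical endless source that could not be sliced with brackets.

```python
from itertools import count, islice

def evens():
    """A hypothetical endless generator; it cannot be sliced with [ ]."""
    for n in count():
        yield 2 * n

# islice(iterable, start, stop, step) mirrors seq[start:stop:step].
picked = list(islice(evens(), 2, 10, 3))
print(picked)   # [4, 10, 16]
```

The elements at positions 0 and 1 are still produced by the generator; islice() simply consumes and discards them.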

One final function, starmap(), is a slight variation on imap(). If the function passed as an argument takes multiple arguments, the iterable passed should yield tuples of the right size. This is basically the same as imap() with several iterables passed in, except that the collection of iterables has previously been combined with izip().
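
The relationship between the two call styles can be sketched as follows, assuming Python 3 (map() and zip() in place of imap() and izip()); the sample data is invented.

```python
from itertools import starmap

bases = [2, 3, 10]
exponents = [5, 2, 3]

via_map = list(map(pow, bases, exponents))                # several iterables
via_starmap = list(starmap(pow, zip(bases, exponents)))   # pre-zipped tuples
print(via_starmap)   # [32, 9, 1000]
```

starmap() is the natural choice when the argument tuples already exist, for example as rows read from some other source.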

In-depth discussion

The functions included in itertools make a good start. If nothing else, having them encourages Python programmers to become more comfortable with using and combining iterators. In general, widespread use of iterators is clearly important to the future of Python. But beyond what has been included so far, I would like to suggest a few more functions for future updates to the module. You can readily use these functions right away; of course, if they are later included, the names or interface details could differ.

One category that would seem generally useful is functions that take multiple iterables as arguments and then yield individual elements from each iterable. In contrast, izip() yields tuples of elements, and imap() yields values computed from the underlying elements. The two obvious arrangements, to my mind, are chain() and weave(). The first is similar in effect to sequence concatenation (but appropriately lazy). That is, where for plain sequences you might use, for example:

for x in ('a', 'b', 'c') + (1, 2, 3):
  do_something(x)

For general iterables, you could use:

for x in chain(iter1, iter2, iter3):
  do_something(x)

A Python implementation is:
Listing 3. Sample implementation of chain()

def chain(*iterables):
  for iterable in iterables:
    for item in iterable:
      yield item
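
A function with this name and behavior is part of the itertools that ships with Python today, which makes the listing easy to check. The sketch below assumes Python 3 and compares the generator from Listing 3 with itertools.chain(); the sample inputs are invented.

```python
import itertools

def chain(*iterables):
    """Yield every item of each iterable in turn (as in Listing 3)."""
    for iterable in iterables:
        for item in iterable:
            yield item

lazy = chain("abc", range(3), iter([10, 11]))
result = list(lazy)
print(result)   # ['a', 'b', 'c', 0, 1, 2, 10, 11]
```

Unlike tuple concatenation with +, the arguments may be of different types, and nothing is copied before iteration starts.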

With iterables, you might also combine several sequences by interspersing them. There is no built-in syntax for doing the same thing with sequences, but weave() itself also works fine for complete sequences. A possible implementation is below (Magnus Lie Hetland proposed a similar function on comp.lang.python):
Listing 4. Sample implementation of weave()

def weave(*iterables):
  "Intersperse several iterables, until all are exhausted"
  iterables = map(iter, iterables)
  while iterables:
    for i, it in enumerate(iterables):
      try:
        yield it.next()
      except StopIteration:
        del iterables[i]

Let me illustrate the behavior of weave(), since it might not be immediately obvious from the implementation:

>>> for x in weave('abc', xrange(4), [10,11,12,13,14]):
...   print x,
...
a 0 10 b 1 11 c 2 12 13 3 14

Even after some of the iterators are exhausted, the remaining ones continue to yield values, until everything available has been yielded at some point.
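
The ordering near the end of that output follows from deleting an exhausted iterator while the list is being enumerated. The sketch below is a port of the listing to Python 3 (next(it) instead of it.next(), and list(map(...)) because map() is lazy there), so the behavior can be reproduced directly.

```python
def weave(*iterables):
    """Intersperse several iterables, until all are exhausted."""
    iterables = list(map(iter, iterables))   # map() is lazy in Python 3
    while iterables:
        for i, it in enumerate(iterables):
            try:
                yield next(it)
            except StopIteration:
                # Deleting during enumeration makes the following
                # iterator miss its turn in this round.
                del iterables[i]

woven = list(weave("abc", range(4), [10, 11, 12, 13, 14]))
print(woven)   # ['a', 0, 10, 'b', 1, 11, 'c', 2, 12, 13, 3, 14]
```

When 'abc' runs out in the fourth round, range(4) shifts into its slot and is skipped once, which is why 13 appears before 3.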

I will propose just one more potential itertools function. This one owes a lot to functional programming ways of conceiving problems. icompose() has a certain symmetry with the function ireduce() suggested above. But where ireduce() feeds a (lazy) sequence of values to a function and yields each result, icompose() applies a sequence of functions to the return value of each preceding function. A likely use of ireduce() is to feed a sequence of events to a long-lived object. icompose() might instead repeatedly pass an object to a mutator function that returns a new object. The first is a fairly traditional OOP way of thinking about events, while the second is closer to FP.

Here is a possible icompose() implementation:
Listing 5. Sample implementation of icompose()

def icompose(functions, initval):
  currval = initval
  for f in functions:
    currval = f(currval)
    yield currval
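
A short usage sketch may help show what icompose() yields. The pipeline of string functions and the input value are hypothetical, picked only to make each intermediate stage visible; the function body is the same as in Listing 5.

```python
def icompose(functions, initval):
    """Yield the result of each successive function application."""
    currval = initval
    for f in functions:
        currval = f(currval)
        yield currval

# Hypothetical pipeline: each step receives the previous step's result.
steps = [str.strip, str.lower, lambda s: s.replace(" ", "_")]
stages = list(icompose(steps, "  Hello World  "))
print(stages)   # ['Hello World', 'hello world', 'hello_world']
```

Because every intermediate value is yielded, a caller can stop early or inspect each stage rather than only the final composition.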

Conclusion

Iterators, conceived of as lazy sequences, are a powerful concept that opens up new styles of Python programming. There is a subtle difference, though, between thinking of an iterator as just a data source and thinking of it as a sequence. Neither way of thinking is inherently more correct than the other, but the latter opens the way to a combinatorial shorthand for manipulating programmatic events. The combinatorial functions in itertools, and especially some it might grow that resemble those I suggest, come close to a declarative style of programming. To my mind, these declarative styles are generally less error-prone and more powerful.
