Asynchronous data collection in Python with Twisted: example code

Source: Internet
Author: User
Tags: semaphore

For large-scale data collection, the alternative to multithreading is asynchronous I/O. This article shows how to do asynchronous collection with the Twisted framework.

Async Batching with Twisted: A Walkthrough

Example 1: Just a DeferredList

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet.defer import DeferredList

def listCallback(results):
    print(results)

def finish(ign):
    reactor.stop()

def test():
    # getPage returns a deferred for each request
    d1 = getPage('http://www.111cn.net')
    d2 = getPage('http://yahoo.com')
    dl = DeferredList([d1, d2])
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

This is one of the simplest examples you'll ever see of a deferred list in action. Get two deferreds (the getPage function returns a deferred) and use them to create a deferred list. Add callbacks to the list, garnish with a lemon.

Example 2: Simple Result Manipulation

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet.defer import DeferredList

def listCallback(results):
    # each entry is a (success, result) tuple
    for isSuccess, content in results:
        print('Successful? %s' % isSuccess)
        print('Content Length: %s' % len(content))

def finish(ign):
    reactor.stop()

def test():
    d1 = getPage('http://www.111cn.net')
    d2 = getPage('http://yahoo.com')
    dl = DeferredList([d1, d2])
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

We make things a little more interesting in this example by doing some processing on the results. For this to make sense, just remember that a callback gets passed the result when the deferred action completes. If we look up the API documentation for DeferredList, we see that it returns a list of (success, result) tuples, where success is a Boolean and result is the result of a deferred that was put in the list (remember, we've got two layers of deferreds here!).

Example 3: Page Callbacks Too

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet.defer import DeferredList

def pageCallback(result):
    # runs once per page, as soon as that page arrives
    return len(result)

def listCallback(result):
    print(result)

def finish(ign):
    reactor.stop()

def test():
    d1 = getPage('http://www.111cn.net')
    d1.addCallback(pageCallback)
    d2 = getPage('http://yahoo.com')
    d2.addCallback(pageCallback)
    dl = DeferredList([d1, d2])
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

Here, we mix things up a little bit. Instead of doing the processing on all the results at once (in the deferred list callback), we're processing them when the page callbacks fire. Our processing here is just a simple example of getting the length of the getPage deferred result: the HTML content of the page at the given URL.

Example 4: Results with More Structure

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet.defer import DeferredList

def pageCallback(result):
    data = {
        'length': len(result),
        'content': result[:10],
    }
    return data

def listCallback(result):
    for isSuccess, data in result:
        if isSuccess:
            print("Call to server succeeded with data %s" % str(data))

def finish(ign):
    reactor.stop()

def test():
    d1 = getPage('http://www.111cn.net')
    d1.addCallback(pageCallback)
    d2 = getPage('http://yahoo.com')
    d2.addCallback(pageCallback)
    dl = DeferredList([d1, d2])
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

As a follow-up to the last example, here we put the data we are interested in into a dictionary. We don't end up pulling any of the data out of the dictionary; we just stringify it and print it to stdout.

Example 5: Passing Values to Callbacks

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet.defer import DeferredList

def pageCallback(result, url):
    data = {
        'length': len(result),
        'content': result[:10],
        'url': url,
    }
    return data

def getPageData(url):
    d = getPage(url)
    # extra arguments are passed to the callback after the result
    d.addCallback(pageCallback, url)
    return d

def listCallback(result):
    for isSuccess, data in result:
        if isSuccess:
            print("Call to %s succeeded with data %s" % (data['url'], str(data)))

def finish(ign):
    reactor.stop()

def test():
    d1 = getPageData('http://www.111cn.net')
    d2 = getPageData('http://yahoo.com')
    dl = DeferredList([d1, d2])
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

After all this playing, we start asking ourselves more serious questions, like: "I want to decide which values show up in my callbacks" or "Some information that is available here isn't available there. How do I get it there?" This is how :-) Just pass the parameters you want to your callback. They'll be tacked on after the result (as you can see from the function signatures).

In this example, we needed to create our own deferred-returning function, one that wraps the getPage function so that we can also pass the URL on to the callback.

Example 6: Adding Some Error Checking

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet.defer import DeferredList

urls = [
    'http://yahoo.com',
    'http://www.111cn.net',
    'http://www.111cn.net/MicrosoftRules.html',
    'http://bogusdomain.com',
]

def pageCallback(result, url):
    data = {
        'length': len(result),
        'content': result[:10],
        'url': url,
    }
    return data

def pageErrback(error, url):
    # error is a Failure; returning a dict marks it as handled
    return {
        'msg': error.getErrorMessage(),
        'err': error,
        'url': url,
    }

def getPageData(url):
    d = getPage(url, timeout=5)
    d.addCallback(pageCallback, url)
    d.addErrback(pageErrback, url)
    return d

def listCallback(result):
    for ignore, data in result:
        if 'err' in data:
            print("Call to %s failed with data %s" % (data['url'], str(data)))
        else:
            print("Call to %s succeeded with data %s" % (data['url'], str(data)))

def finish(ign):
    reactor.stop()

def test():
    deferreds = []
    for url in urls:
        d = getPageData(url)
        deferreds.append(d)
    dl = DeferredList(deferreds, consumeErrors=1)
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

As we get closer to building real applications, we start getting concerned about things like catching and anticipating errors. We haven't added any errbacks to the deferred list, but we have added one alongside our page callback. We've added more URLs and put them in a list to ease the pains of duplicate code. As you can see, two of the URLs should return errors: one a 404, and the other a domain that does not resolve (we'll see this as a timeout).

Example 7: Batching with DeferredSemaphore

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet import defer

maxRun = 1
urls = [
    'http://twistedmatrix.com',
    'http://twistedsoftwarefoundation.org',
    'http://yahoo.com',
    'http://www.111cn.net',
]

def listCallback(results):
    for isSuccess, result in results:
        print(len(result))

def finish(ign):
    reactor.stop()

def test():
    deferreds = []
    # at most maxRun calls to getPage are in flight at once
    sem = defer.DeferredSemaphore(maxRun)
    for url in urls:
        d = sem.run(getPage, url)
        deferreds.append(d)
    dl = defer.DeferredList(deferreds)
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

These last two examples are for more advanced use cases. As soon as the reactor starts, deferreds that are ready start "firing": their "jobs" start running. What if we've got a whole list of deferreds? Well, they all start processing. As you can imagine, this is an easy way to run an accidental DoS against a friendly service. Not cool.

For situations like this, what we want is a way to run only so many at a time. This is a great use for the deferred semaphore. When I did repeated runs of the example above, the content lengths of the four pages were returned after about 2.5 seconds. With the example rewritten to use just the deferred list (no deferred semaphore), the content lengths were returned after about 1.2 seconds. The extra time is due to the fact that I (for the sake of the example) forced only one deferred to run at a time, obviously not what you're going to want to do for a highly concurrent task ;-)

Note that without changing the code and only setting maxRun to 4, the timings for getting the content lengths are about the same, averaging 1.3 seconds for me (there's a little more overhead involved when using the deferred semaphore).

One last subtle note (in anticipation of the next example): the for loop creates all the deferreds at once; the deferred semaphore simply limits how many get run at a time.

Example 8: Throttling with Cooperator

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet import defer, task

maxRun = 2
urls = [
    'http://twistedmatrix.com',
    'http://twistedsoftwarefoundation.org',
    'http://yahoo.com',
    'http://www.111cn.net',
]

def pageCallback(result):
    print(len(result))
    return result

def doWork():
    # a generator: each getPage call is made only when requested
    for url in urls:
        d = getPage(url)
        d.addCallback(pageCallback)
        yield d

def finish(ign):
    reactor.stop()

def test():
    deferreds = []
    coop = task.Cooperator()
    work = doWork()
    # maxRun consumers share the same generator
    for i in range(maxRun):
        d = coop.coiterate(work)
        deferreds.append(d)
    dl = defer.DeferredList(deferreds)
    dl.addCallback(finish)

test()
reactor.run()

I have not yet studied the Twisted framework at this level, but I am recording these examples here first so that I can come back and digest them later.
