Asynchronous data collection in Python with Twisted: example code

Last Update:2017-01-13 Source: Internet

Author: User

Tags semaphore


When collecting large amounts of data, the main alternative to multithreading is asynchronous I/O. This article uses the Twisted framework to implement asynchronous collection.

Async Batching with Twisted: A Walkthrough

Example 1: Just a DeferredList

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet.defer import DeferredList

def listCallback(results):
    print(results)

def finish(ign):
    reactor.stop()

def test():
    d1 = getPage('http://www.111cn.net')
    d2 = getPage('http://yahoo.com')
    dl = DeferredList([d1, d2])
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

This is one of the simplest examples you'll ever see of a deferred list in action. Get two deferreds (the getPage function returns a deferred) and use them to create a deferred list. Add callbacks to the list, garnish with a lemon.

Example 2: Simple Result Manipulation

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet.defer import DeferredList

def listCallback(results):
    # Each entry is a (success, result) tuple, one per deferred in the list.
    for isSuccess, content in results:
        print('Successful? %s' % isSuccess)
        print('Content Length: %s' % len(content))

def finish(ign):
    reactor.stop()

def test():
    d1 = getPage('http://www.111cn.net')
    d2 = getPage('http://yahoo.com')
    dl = DeferredList([d1, d2])
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

We make things a little more interesting in this example by doing some processing on the results. For this to make sense, just remember that a callback gets passed the result when the deferred action completes. If we look up the API documentation for DeferredList, we see that it returns a list of (success, result) tuples, where success is a Boolean and result is the result of a deferred that was put in the list (remember, we've got two layers of deferreds here!).

Example 3: Page Callbacks Too

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet.defer import DeferredList

def pageCallback(result):
    return len(result)

def listCallback(result):
    print(result)

def finish(ign):
    reactor.stop()

def test():
    d1 = getPage('http://www.111cn.net')
    d1.addCallback(pageCallback)
    d2 = getPage('http://yahoo.com')
    d2.addCallback(pageCallback)
    dl = DeferredList([d1, d2])
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

Here, we mix things up a little bit. Instead of doing the processing on all the results at once (in the deferred list callback), we're processing them when the page callbacks fire. Our processing here is just a simple example of getting the length of the getPage deferred result: the HTML content of the page at the given URL.

Example 4: Results with More Structure

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet.defer import DeferredList

def pageCallback(result):
    data = {
        'length': len(result),
        'content': result[:10],
    }
    return data

def listCallback(result):
    for isSuccess, data in result:
        if isSuccess:
            print("Call to server succeeded with data %s" % str(data))

def finish(ign):
    reactor.stop()

def test():
    d1 = getPage('http://www.111cn.net')
    d1.addCallback(pageCallback)
    d2 = getPage('http://yahoo.com')
    d2.addCallback(pageCallback)
    dl = DeferredList([d1, d2])
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

As a follow-up to the last example, here we put the data we are interested in into a dictionary. We don't end up pulling any of the data out of the dictionary; we just stringify it and print it to stdout.

Example 5: Passing Values to Callbacks

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet.defer import DeferredList

def pageCallback(result, url):
    data = {
        'length': len(result),
        'content': result[:10],
        'url': url,
    }
    return data

def getPageData(url):
    d = getPage(url)
    # Extra arguments to addCallback are passed on after the result.
    d.addCallback(pageCallback, url)
    return d

def listCallback(result):
    for isSuccess, data in result:
        if isSuccess:
            print("Call to %s succeeded with data %s" % (data['url'], str(data)))

def finish(ign):
    reactor.stop()

def test():
    d1 = getPageData('http://www.111cn.net')
    d2 = getPageData('http://yahoo.com')
    dl = DeferredList([d1, d2])
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

After all this playing, we start asking ourselves more serious questions, like: "I want to decide which values show up in my callbacks" or "Some information that is available here isn't available there. How do I get it there?" This is how :-) Just pass the parameters you want to your callback. They'll be tacked on after the result (as you can see from the function signatures).

In this example, we needed to create our own deferred-returning function, one that wraps the getPage function so that we can also pass the URL on to the callback.

Example 6: Adding Some Error Checking

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet.defer import DeferredList

urls = [
    'http://yahoo.com',
    'http://www.111cn.net',
    'http://www.111cn.net/MicrosoftRules.html',
    'http://bogusdomain.com',
]

def pageCallback(result, url):
    data = {
        'length': len(result),
        'content': result[:10],
        'url': url,
    }
    return data

def pageErrback(error, url):
    return {
        'msg': error.getErrorMessage(),
        'err': error,
        'url': url,
    }

def getPageData(url):
    d = getPage(url, timeout=5)
    d.addCallback(pageCallback, url)
    d.addErrback(pageErrback, url)
    return d

def listCallback(result):
    for ignore, data in result:
        if 'err' in data:
            print("Call to %s failed with data %s" % (data['url'], str(data)))
        else:
            print("Call to %s succeeded with data %s" % (data['url'], str(data)))

def finish(ign):
    reactor.stop()

def test():
    deferreds = []
    for url in urls:
        d = getPageData(url)
        deferreds.append(d)
    dl = DeferredList(deferreds, consumeErrors=True)
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

As we get closer to building real applications, we start getting concerned about things like catching and anticipating errors. We haven't added any errbacks to the deferred list, but we have added one to each page deferred. We've added more URLs and put them in a list to ease the pains of duplicate code. As you can see, two of the URLs should return errors: one a 404, and the other should be a domain that does not resolve (we'll see this as a timeout).

Example 7: Batching with DeferredSemaphore

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet import defer

maxRun = 1

urls = [
    'http://twistedmatrix.com',
    'http://twistedsoftwarefoundation.org',
    'http://yahoo.com',
    'http://www.111cn.net',
]

def listCallback(results):
    for isSuccess, result in results:
        print(len(result))

def finish(ign):
    reactor.stop()

def test():
    deferreds = []
    # At most maxRun calls to getPage are in flight at any time.
    sem = defer.DeferredSemaphore(maxRun)
    for url in urls:
        d = sem.run(getPage, url)
        deferreds.append(d)
    dl = defer.DeferredList(deferreds)
    dl.addCallback(listCallback)
    dl.addCallback(finish)

test()
reactor.run()

These last two examples are for more advanced use cases. As soon as the reactor starts, deferreds that are ready start "firing": their "jobs" start running. What if we've got a long list of deferreds? Well, they all start processing. As you can imagine, this is an easy way to run an accidental DoS against a friendly service. Not cool.

For situations like this, what we want is a way to run only so many deferreds at a time. This is a great use for the deferred semaphore. When I repeated runs of the example above, the content lengths of the four pages were returned after about 2.5 seconds. With the example rewritten to use just the deferred list (no deferred semaphore), the content lengths were returned after about 1.2 seconds. The extra time is due to the fact that I (for the sake of the example) forced only one deferred to run at a time, obviously not what you're going to want to do for a highly concurrent task ;-)

Note that without changing the code and only setting maxRun to 4, the timings for getting the content lengths are about the same, averaging 1.3 seconds for me (there's a little more overhead involved when using the deferred semaphore).

One last subtle note (in anticipation of the next example): the for loop creates all the deferreds at once; the deferred semaphore simply limits how many get run at a time.

Example 8: Throttling with Cooperator

The code is as follows:

from twisted.internet import reactor
from twisted.web.client import getPage
from twisted.internet import defer, task

maxRun = 2

urls = [
    'http://twistedmatrix.com',
    'http://twistedsoftwarefoundation.org',
    'http://yahoo.com',
    'http://www.111cn.net',
]

def pageCallback(result):
    print(len(result))
    return result

def doWork():
    # A generator: each deferred is created only when the Cooperator asks for it.
    for url in urls:
        d = getPage(url)
        d.addCallback(pageCallback)
        yield d

def finish(ign):
    reactor.stop()

def test():
    deferreds = []
    coop = task.Cooperator()
    work = doWork()
    # maxRun consumers share one generator, so at most maxRun pages are fetched at once.
    for i in range(maxRun):
        d = coop.coiterate(work)
        deferreds.append(d)
    dl = defer.DeferredList(deferreds)
    dl.addCallback(finish)

test()
reactor.run()

I have not yet studied the Twisted framework at this depth, but I am recording these examples here so I can digest them later.

This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
