Working on an API is challenging, and keeping the system stable and robust at peak times is one of the reasons we do a lot of stress testing at Mailgun.
Over the years we've tried many approaches, from simple ApacheBench to more sophisticated custom test suites. But this post describes a "quick and dirty" yet very flexible way to do stress testing with Python.
When writing HTTP clients in Python, we all love the Requests library, and it is what we recommend to our API users. Requests is very powerful, but it has one drawback: it is a blocking, synchronous library, which makes it hard or impossible to use it to quickly generate thousands of requests.
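To see why blocking calls cap throughput, here is a small simulation (the `fetch` function is a hypothetical stand-in for a blocking call such as `requests.get`, with network latency replaced by a sleep):

```python
import time

def fetch(url):
    # Stand-in for a blocking HTTP call such as requests.get();
    # the 50 ms sleep simulates one network round trip.
    time.sleep(0.05)
    return 200

start = time.time()
# Ten sequential requests: each one blocks until the previous finishes.
statuses = [fetch("http://example.com") for _ in range(10)]
elapsed = time.time() - start

# At ~50 ms per round trip, a single blocking thread tops out at
# roughly 20 requests per second -- nowhere near thousands.
print("%d requests in %.2f seconds" % (len(statuses), elapsed))
```

Each request has to wait out a full round trip before the next one starts, so a single blocking thread is limited to about 1/latency requests per second no matter how fast your machine is.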
Introducing Treq on Twisted
To solve this problem we turned to Treq (GitHub repository). Treq is an HTTP client library inspired by Requests, but it runs on Twisted and has Twisted's typical strengths: it is asynchronous and highly concurrent when dealing with network I/O.
Treq is not confined to stress testing: it is a good tool for writing any high-concurrency HTTP client, such as a web crawler. Treq is elegant, easy to use, and powerful. Here is an example:
>>> from treq import get
>>> from twisted.internet import reactor
>>> def done(response):
...     print response.code
...     reactor.stop()
>>> get("http://www.github.com").addCallback(done)
>>> reactor.run()
200
A simple test script
The following is a simple script that uses Treq to bombard a single URL with the maximum possible rate of requests.
#!/usr/bin/env python
from twisted.internet import epollreactor
epollreactor.install()

from twisted.internet import reactor, task
from twisted.web.client import HTTPConnectionPool
import treq
from datetime import datetime

req_generated = 0
req_made = 0
req_done = 0

cooperator = task.Cooperator()

pool = HTTPConnectionPool(reactor)


def counter():
    '''This function gets called once a second and prints the progress at
    one-second intervals.
    '''
    global req_generated, req_made, req_done
    print("{} Requests: {} generated; {} sent; {} received".format(
        datetime.now(), req_generated, req_made, req_done))
    # Reset the counters and reschedule ourselves
    req_generated = req_made = req_done = 0
    reactor.callLater(1, counter)


def body_received(body):
    global req_done
    req_done += 1


def request_done(response):
    global req_made
    deferred = treq.json_content(response)
    req_made += 1
    deferred.addCallback(body_received)
    deferred.addErrback(lambda x: None)  # ignore errors
    return deferred


def request():
    deferred = treq.post('http://api.host/v2/loadtest/messages',
                         auth=('api', 'api-key'),
                         data={'from': 'Loadtest',
                               'to': 'to@example.org',
                               'subject': 'test'},
                         pool=pool)
    deferred.addCallback(request_done)
    return deferred


def requests_generator():
    global req_generated
    while True:
        deferred = request()
        req_generated += 1
        # Do not yield the deferred, or the cooperator will pause until
        # the response is received
        yield None


if __name__ == '__main__':
    # Make the cooperator work on spawning requests
    cooperator.cooperate(requests_generator())

    # Run the counter that will be reporting sending speed once a second
    reactor.callLater(1, counter)

    # Run the reactor
    reactor.run()
Sample output:

2013-04-25 09:30 Requests: 327 generated; 153 sent; 153 received
2013-04-25 09:30 Requests: 306 generated; 156 sent; 156 received
2013-04-25 09:30 Requests: 318 generated; 184 sent; 154 received
The "generated" numbers represent requests that have been prepared by the Twisted reactor but not yet sent. This script omits all error handling for brevity; adding timeout handling to it is left as an exercise for the reader.
This script can be used as a starting point, and you can extend it with processing logic specific to your application. When improving it, we recommend replacing the ugly global variables with collections.Counter. The script runs in a single thread, so if you want to squeeze the maximum number of requests out of a machine, you can use techniques like multiprocessing.
Happy stress testing!