I often need to make asynchronous requests to Solr from Python. The code below blocks on each Solr HTTP request: the second request is not sent until the first one completes. The code reads as follows:
import requests

# Search 1
solrresp = requests.get('http://mysolr.com/solr/statedecoded/search?q=law')
for doc in solrresp.json()['response']['docs']:
    print(doc['catch_line'])

# Search 2
solrresp = requests.get('http://mysolr.com/solr/statedecoded/search?q=shoplifting')
for doc in solrresp.json()['response']['docs']:
    print(doc['catch_line'])
(We use the requests library for HTTP requests)
This comes up when I have scripts that index documents into Solr and could be doing that work in parallel. I need to scale my work up so that the indexing bottleneck is Solr itself rather than time spent waiting on network requests.
Unfortunately, Python is not as handy as JavaScript or Go when it comes to asynchronous programming. However, the gevent library can help us a little. Under the hood, gevent is built on an event loop library (libevent originally; newer releases use libev or libuv) that wraps the native asynchronous calls (select, poll, etc.) and takes care of coordinating the low-level asynchronous details.
Using gevent is very simple. The one thing that may give you pause is gevent.monkey.patch_all(), which patches much of the standard library so that it cooperates with gevent's asynchronous model. It sounds scary, but I have not had a problem with this patching.
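To make the effect of the patch concrete, here is a minimal sketch (not part of the original script) in which two greenlets call the standard time.sleep. The worker function, its labels, and the delays are made up for illustration. Once patch_all() has run, time.sleep should yield to gevent instead of blocking the whole process, so the two workers interleave:

```python
from gevent import monkey
monkey.patch_all()  # patch the standard library before importing anything else

import time

import gevent

events = []  # records the order in which the workers start and finish


def worker(label, delay):
    """Hypothetical worker: the sleep stands in for a blocking network call."""
    events.append(label + " start")
    time.sleep(delay)  # patched: yields to other greenlets while waiting
    events.append(label + " end")


handles = [gevent.spawn(worker, "slow", 0.2),
           gevent.spawn(worker, "fast", 0.05)]
gevent.joinall(handles)

print(events)
```

The fast worker is started second but finishes first, because the slow worker's sleep no longer holds up the process.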
Without further ado, here is how you use gevent to run Solr requests in parallel:
# Patch the standard library first, before requests pulls in socket and ssl
from gevent import monkey
monkey.patch_all()

import gevent
import requests


class Searcher(object):
    """Simple wrapper for doing a search and collecting the results."""

    def __init__(self, searchurl):
        self.searchurl = searchurl

    def search(self):
        solrresp = requests.get(self.searchurl)
        self.docs = solrresp.json()['response']['docs']


def searchmultiple(urls):
    """Use gevent to execute the passed-in URLs; dump the results."""
    searchers = [Searcher(url) for url in urls]

    # Gather a handle to each task
    handles = []
    for searcher in searchers:
        handles.append(gevent.spawn(searcher.search))

    # Block until all work is done
    gevent.joinall(handles)

    # Dump the results
    for searcher in searchers:
        print('Search results for %s' % searcher.searchurl)
        for doc in searcher.docs:
            print(doc['catch_line'])


searchurls = ['http://mysolr.com/solr/statedecoded/search?q=law',
              'http://mysolr.com/solr/statedecoded/search?q=shoplifting']
searchmultiple(searchurls)
There is more code now, and it is less concise than the JavaScript equivalent would be, but it gets the job done. The essence of the code is the following lines:
# Gather a handle to each task
handles = []
for searcher in searchers:
    handles.append(gevent.spawn(searcher.search))

# Block until all work is done
gevent.joinall(handles)
We ask gevent to spawn searcher.search, which gives us a handle to each resulting task. We can then wait for all of the tasks to complete and, finally, dump the results.
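As a small sketch of what those handles offer: each handle returned by gevent.spawn is a Greenlet, and once joinall returns, its value attribute holds whatever the spawned function returned. The fake_search function and its canned results below are stand-ins invented for illustration, not real Solr calls, so the example runs without a server:

```python
import gevent


def fake_search(query):
    """Hypothetical stand-in for a Solr request; returns canned 'docs'."""
    gevent.sleep(0.01)  # simulate waiting on the network, cooperatively
    return [{'catch_line': 'result for %s' % query}]


queries = ['law', 'shoplifting']

# One handle (a Greenlet) per task
handles = [gevent.spawn(fake_search, q) for q in queries]

# Block until all work is done; the timeout is optional and given in seconds
gevent.joinall(handles, timeout=5)

# Each finished handle exposes the function's return value as .value
results = {q: h.value for q, h in zip(queries, handles)}
for query, docs in results.items():
    for doc in docs:
        print(doc['catch_line'])
```

With this approach the results could be collected without storing them on a wrapper object. If a task raises an exception, its value stays None and the exception is available on the handle's exception attribute.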
That's about it. If you have any ideas, please leave us a message. Let us know how we can help with your Solr search application.