This article describes how to execute asynchronous Solr queries from Python using the gevent framework. Solr request handling is largely I/O-bound, which makes it a good candidate for asynchronous processing. I often need to fire off many Solr requests from Python. The code below blocks on each Solr HTTP request: the second request does not start until the first one has completed. The code is as follows:
import requests

# Search 1
solrResp = requests.get('http://mysolr.com/solr/statedecoded/search?q=law')
for doc in solrResp.json()['response']['docs']:
    print doc['catch_line']

# Search 2
solrResp = requests.get('http://mysolr.com/solr/statedecoded/search?q=shoplifting')
for doc in solrResp.json()['response']['docs']:
    print doc['catch_line']
(We use the Requests library for the HTTP requests.)
Scripts that index documents into Solr benefit especially from running requests in parallel. I need to scale this work up, so I want the indexing bottleneck to be Solr itself, not the network requests.
Unfortunately, Python is not as convenient as JavaScript or Go for asynchronous programming. However, the gevent library can help us. Under the hood, gevent uses the libevent library, which is built on native asynchronous calls (select, poll, and similar primitives), and libevent does a good job of coordinating many of these low-level asynchronous functions.
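To see the cooperative scheduling in action before we get to Solr, here is a minimal sketch (not from the original article): three greenlets each "wait" for 0.2 seconds, but because gevent.sleep yields to the event loop instead of blocking, the waits overlap and the whole batch finishes in about 0.2 seconds rather than 0.6.

```python
import time

import gevent


def task(name, delay):
    # gevent.sleep yields control to the event loop instead of blocking,
    # so other greenlets can run while this one waits
    gevent.sleep(delay)
    return name


start = time.time()
greenlets = [gevent.spawn(task, name, 0.2) for name in ("a", "b", "c")]
gevent.joinall(greenlets)  # block until every greenlet has finished
elapsed = time.time() - start

# The three 0.2s waits overlap, so elapsed is roughly 0.2s, not 0.6s
print([g.value for g in greenlets])
```

The same spawn/joinall shape carries over directly to real network requests once the standard library has been monkey-patched.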
It is easy to use gevent. Calling gevent.monkey.patch_all() patches many standard library modules so that they cooperate asynchronously with gevent. It sounds scary, but I have not encountered any problems using this patch.
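As a small illustration (assumed, not from the original article) of what the patch buys you: after patch_all(), even plain time.sleep becomes cooperative, so greenlets that call it run concurrently instead of serially. Note that patch_all() should run before importing the modules you want patched.

```python
from gevent import monkey
monkey.patch_all()  # patch the standard library before other imports

import time  # time.sleep is now gevent-aware, not a true block

import gevent


def worker(i):
    time.sleep(0.1)  # cooperative after patching: other greenlets run here
    return i * i

jobs = [gevent.spawn(worker, i) for i in range(4)]
gevent.joinall(jobs)

# Each greenlet's return value is available on .value after it finishes
results = [job.value for job in jobs]
print(results)
```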
Here is how to parallelize the Solr requests with gevent:
import requests
from gevent import monkey
import gevent
monkey.patch_all()


class Searcher(object):
    """Simple wrapper for doing a search and collecting the results"""
    def __init__(self, searchUrl):
        self.searchUrl = searchUrl

    def search(self):
        solrResp = requests.get(self.searchUrl)
        self.docs = solrResp.json()['response']['docs']


def searchMultiple(urls):
    """Use gevent to execute the passed in urls; dump the results"""
    searchers = [Searcher(url) for url in urls]

    # Gather a handle for each task
    handles = []
    for searcher in searchers:
        handles.append(gevent.spawn(searcher.search))

    # Block until all work is done
    gevent.joinall(handles)

    # Dump the results
    for searcher in searchers:
        print "Search Results for %s" % searcher.searchUrl
        for doc in searcher.docs:
            print doc['catch_line']

searchUrls = ['http://mysolr.com/solr/statedecoded/search?q=law',
              'http://mysolr.com/solr/statedecoded/search?q=shoplifting']

searchMultiple(searchUrls)
This adds some code, and it is not as concise as the equivalent JavaScript would be, but it gets the job done. The essence of the code is these few lines:
# Gather a handle for each task
handles = []
for searcher in searchers:
    handles.append(gevent.spawn(searcher.search))

# Block until all work is done
gevent.joinall(handles)
We have gevent spawn searcher.search. We get a handle for each spawned task, wait for all of them to complete, and finally dump the results.
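The handles returned by gevent.spawn are Greenlet objects, and you can do more with them than just join. A small sketch (with a made-up fetch function standing in for a Solr call) showing two useful options, a timeout on joinall and reading each greenlet's return value from .value:

```python
import gevent


def fetch(n):
    # Stand-in for a network call; a real version would hit Solr here
    gevent.sleep(0.05)
    return n * 2

handles = [gevent.spawn(fetch, n) for n in range(3)]

# joinall accepts a timeout, so a slow task cannot hang the whole batch
gevent.joinall(handles, timeout=5)

# Each Greenlet exposes its function's return value on .value once done
results = [h.value for h in handles]
print(results)
```

This is handy when the spawned function returns data directly instead of stashing it on an object, as the Searcher wrapper above does.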
That's about it. If you have any thoughts, please leave us a message, and let us know how we can help with your Solr search application.