Today I wanted to use Google's spelling-suggestion feature, that is, the "Did you mean: *" prompt on the results page.
Searching online, I found that most of the published methods rely on a service that has since been shut down, namely the "/tbproxy/spell?lang=en" endpoint.
See reference 1: Google has deactivated that approach.
I then found reference 2, which uses the Google Custom Search API for spell checking.
After reading the documentation closely (see reference 3), a test in the browser finally succeeded. For example, the misspelled input "丁军辉" returns the suggestion "丁俊晖" (Ding Junhui).
See reference 4.
Calling the same URL from Python, I found that the Chinese text in the response came back as numeric character references beginning with &#x, which had to be converted back to Chinese characters. Reference 5 shows how to do that.
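As a minimal sketch of that conversion step: the script in this post targets Python 2 and uses HTMLParser.unescape, so the snippet below assumes Python 3 instead, where html.unescape does the same job. The helper name decode_entities and the sample string are made up for illustration.

```python
import html


def decode_entities(text):
    """Turn numeric character references such as &#x4E01; into real characters."""
    return html.unescape(text)


# Hypothetical fragment of a response, with the Chinese text still encoded.
raw = '<suggestion q="&#x4E01;&#x4FCA;&#x6656;">'
print(decode_entities(raw))
```

Only the character references are rewritten; the surrounding markup is left untouched.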
The detailed code is as follows:
# -*- coding: utf-8 -*-
# Python 2 script: urllib2 and HTMLParser do not exist under these names in Python 3.
import time
import os
import sys
import re
import urllib2, urllib
import HTMLParser

# Force the default string encoding to UTF-8 (Python 2 only).
reload(sys)
sys.setdefaultencoding('utf-8')


class GetGoogleSuggestion:
    def __init__(self):
        # ID of the custom search engine to query.
        self.cx = '012080660999116631289:zlpj9ypbnii'

    def getsuggestion(self, query):
        # Request XML output from the custom search endpoint.
        url = ('http://www.google.com/search?'
               'q=%s'
               '&hl=zh'
               '&output=xml'
               '&client=google-csbe'
               '&cx=%s') % (urllib.quote(query), self.cx)
        request = urllib2.Request(url, None)
        response = urllib2.urlopen(request).read()
        # Convert the &#x... character references back to Chinese characters.
        h = HTMLParser.HTMLParser()
        print(h.unescape(response))


if __name__ == '__main__':
    test = GetGoogleSuggestion()
    keyword = '丁军辉'
    test.getsuggestion(keyword)
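For readers on Python 3, the URL-building part of the script could be sketched roughly as follows. This is an assumption-laden rewrite, not the code that was actually run for this post: the function name build_suggestion_url and the SEARCH_ENDPOINT constant are hypothetical, and the snippet only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Same endpoint and parameters as the Python 2 script above.
SEARCH_ENDPOINT = "http://www.google.com/search"


def build_suggestion_url(query, cx, hl="zh"):
    """Build the XML search URL for a query and a custom search engine ID."""
    params = [
        ("q", query),               # percent-encoded as UTF-8 by urlencode
        ("hl", hl),
        ("output", "xml"),
        ("client", "google-csbe"),
        ("cx", cx),
    ]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


url = build_suggestion_url("丁军辉", "012080660999116631289:zlpj9ypbnii")
print(url)
```

Letting urlencode do the escaping avoids calling quote on each value by hand, and it also escapes the colon in the cx value.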
The result comes back as XML, shown below.
<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<!DOCTYPE GSP SYSTEM "google.dtd">
<gsp ver="3.2">
<error>403</error><tm>0.028615</tm><Q>丁军辉</Q>
<param name="q" value="丁军辉" original_value="%e4%b8%81%e5%86%9b%e8%be%89" url_escaped_value="%e4%b8%81%e5%86%9b%e8%be%89" js_escaped_value="丁军辉"></param>
<param name="hl" value="en" original_value="en" url_escaped_value="en" js_escaped_value="en"></param>
<param name="output" value="xml" original_value="xml" url_escaped_value="xml" js_escaped_value="xml"></param>
<param name="client" value="google-csbe" original_value="google-csbe" url_escaped_value="google-csbe" js_escaped_value="google-csbe"></param>
<param name="cx" value="012080660999116631289:zlpj9ypbnii" original_value="012080660999116631289:zlpj9ypbnii" url_escaped_value="012080660999116631289%3azlpj9ypbnii" js_escaped_value="012080660999116631289:zlpj9ypbnii"></param>
<spelling><suggestion q="丁俊晖"><em>丁俊晖</em></suggestion></spelling>
</gsp>
The spelling suggestion is in the <suggestion> field, so that is the part to extract.
If there is no spelling suggestion, the field is simply absent, as shown below.
<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<!DOCTYPE GSP SYSTEM "google.dtd">
<gsp ver="3.2">
<error>403</error><tm>0.037868</tm><Q>丁俊晖</Q>
<param name="q" value="丁俊晖" original_value="%e4%b8%81%e4%bf%8a%e6%99%96" url_escaped_value="%e4%b8%81%e4%bf%8a%e6%99%96" js_escaped_value="丁俊晖"></param>
<param name="hl" value="en" original_value="en" url_escaped_value="en" js_escaped_value="en"></param>
<param name="output" value="xml" original_value="xml" url_escaped_value="xml" js_escaped_value="xml"></param>
<param name="client" value="google-csbe" original_value="google-csbe" url_escaped_value="google-csbe" js_escaped_value="google-csbe"></param>
<param name="cx" value="012080660999116631289:zlpj9ypbnii" original_value="012080660999116631289:zlpj9ypbnii" url_escaped_value="012080660999116631289%3azlpj9ypbnii" js_escaped_value="012080660999116631289:zlpj9ypbnii"></param>
</gsp>
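Pulling the suggestion out of the two kinds of response could look something like the sketch below. It assumes Python 3 and the standard xml.etree.ElementTree parser; the function name extract_suggestion is hypothetical, and the sample strings are trimmed-down stand-ins for the full responses. Tag names are compared case-insensitively because I have not checked which case the live service uses.

```python
import xml.etree.ElementTree as ET


def extract_suggestion(xml_text):
    """Return the q attribute of the first suggestion element, or None."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Compare case-insensitively: the exact tag case is an assumption.
        if element.tag.lower() == "suggestion":
            return element.get("q")
    return None


# Trimmed samples modelled on the two responses shown in this post.
with_suggestion = (
    '<gsp ver="3.2"><Q>丁军辉</Q>'
    '<spelling><suggestion q="丁俊晖"><em>丁俊晖</em></suggestion></spelling>'
    '</gsp>'
)
without_suggestion = '<gsp ver="3.2"><Q>丁俊晖</Q></gsp>'

print(extract_suggestion(with_suggestion))
print(extract_suggestion(without_suggestion))
```

An XML parser also resolves &#x... character references on its own, so if the raw response is parsed this way, a separate unescape pass should not be needed for the attribute value.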
I was short on time and wrote this up in a hurry; if you have questions, please leave a comment to discuss.
References:
1. http://stackoverflow.com/questions/8428767/how-to-implement-python-spell-checker-using-googles-did-you-mean
2. http://stackoverflow.com/questions/11948945/google-custom-search-api-for-spelling-check
3. https://developers.google.com/custom-search/docs/xml_results?hl=en#wsadvancedsearch
See the "XML Results for Regular and Advanced Search Queries" section of that page.
4. http://www.google.com/search?q=%e4%b8%81%e5%86%9b%e8%be%89&output=xml&client=google-csbe&cx=00255077836266642015:u-scht7a-8i
5. http://stackoverflow.com/questions/2087370/decode-html-entities-in-python-string