Environment: python2.7
Take 360 as an example, use the HTTP intercept tool to obtain the URL, the specific method of obtaining the function depends on the requirements. For example: I would like to crawl her keywords, is to intercept to ... word= end of a string of URLs.
Did not add browser information, system version, it turns out that 360 is very friendly to reptiles =, =.
1, on the treatment of regular expressions, according to the actual situation of their own writing, there is no special uniform format.
2, about the website's code, can modify the processing, here uses the GBK.
1 #CODING=GBK2 " "3 Created on 2014-9-234 5 @author: Administrator6 " "7 ImportUrllib28 ImportUrllib9 ImportReTen One AKey = Urllib.quote (Raw_input ("Input Keywords:")) -URL ='http://sug.so.360.cn/suggest?callback=suggest_so&encodein=gbk&encodeout=gbk&format=json& fields=word,obdata&word='+Key -req =Urllib2. Request (URL) the - -HTML =Urllib2.urlopen (req). Read () - + -SS = Re.findall ("\:\"(.*?) \"", HTML) + A forIinchSS: at PrintI
A simple solution for crawling search engine keywords with python