Environment: Python 2.7.9 / Sublime Text 2 / Chrome
1. URL access: call the urllib2 library function directly

    import urllib2

    url = 'http://www.baidu.com/'
    response = urllib2.urlopen(url)
    html = response.read()
    print html
2. Access with parameters, using the Baidu search function as an example

Open Chrome with its search engine set to Baidu and type test into the address bar. The result is as follows:

You can see that the URL of the Baidu search is https://www.baidu.com/s?ie=UTF-8&wd=test
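The search URL above is a base address followed by a query string. As a rough sketch (the variable names are only illustrative, and the try/except import is an addition so the snippet also runs on Python 3), it can be taken apart to see which parameters need to be sent:

```python
try:
    from urlparse import urlparse, parse_qs       # Python 2
except ImportError:
    from urllib.parse import urlparse, parse_qs   # Python 3

search_url = 'https://www.baidu.com/s?ie=UTF-8&wd=test'

# Split the URL into scheme, host, path and query string.
parts = urlparse(search_url)

# Everything before the '?' is the address the parameters are sent to.
base_url = parts.scheme + '://' + parts.netloc + parts.path

# parse_qs maps each parameter name to a list of its values.
params = parse_qs(parts.query)

print(base_url)
print(params)
```

This gives the base address https://www.baidu.com/s and the two parameters ie and wd, which are the values used in the code that follows.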
Modify the code to add the request parameters:

    # coding=utf-8
    import urllib
    import urllib2

    # URL address
    url = 'https://www.baidu.com/s'
    # Parameters
    values = {
        'ie': 'UTF-8',
        'wd': 'test'
    }
    # Encode the parameters
    data = urllib.urlencode(values)
    # Build the request from the URL and the encoded parameters
    req = urllib2.Request(url, data)
    # Send the request
    response = urllib2.urlopen(req)
    html = response.read()
    print html
Run the code to get the result:

The response says that the page does not exist, so the access method needs to be reconsidered. Because data is passed as the second argument, urllib2.Request(url, data) sends a POST request. Try the GET method instead by changing the code to:
    # coding=utf-8
    import urllib
    import urllib2

    # URL address
    url = 'https://www.baidu.com/s'
    # Parameters
    values = {
        'ie': 'UTF-8',
        'wd': 'test'
    }
    # Encode the parameters
    data = urllib.urlencode(values)
    # Assemble the full URL
    # req = urllib2.Request(url, data)
    url = url + '?' + data
    # Access the full URL
    # response = urllib2.urlopen(req)
    response = urllib2.urlopen(url)
    html = response.read()
    print html
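The difference between the two request styles can be checked without touching the network, since a Request object reports the HTTP method it would use through get_method(). The following is a minimal sketch rather than part of the original code; the names post_req and get_req are made up for illustration, and the try/except import is there so it also runs on Python 3:

```python
try:
    from urllib import urlencode            # Python 2
    from urllib2 import Request
except ImportError:
    from urllib.parse import urlencode      # Python 3
    from urllib.request import Request

url = 'https://www.baidu.com/s'
values = {'ie': 'UTF-8', 'wd': 'test'}
data = urlencode(values)

# Passing data as the second argument makes it the request body,
# which turns the request into a POST. Python 3 expects bytes here,
# so the string is encoded (this is harmless on Python 2).
post_req = Request(url, data.encode('ascii'))

# Appending the encoded parameters to the URL leaves the body empty,
# so the request stays a GET.
get_req = Request(url + '?' + data)

print(post_req.get_method())  # POST
print(get_req.get_method())   # GET
```

No request is actually sent here; the objects are only constructed and inspected.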
Run the modified code again to get the result:

The HTTPS request was redirected, so HTTP needs to be used instead:
    # coding=utf-8
    import urllib
    import urllib2

    # URL address
    # url = 'https://www.baidu.com/s'
    url = 'http://www.baidu.com/s'
    # Parameters
    values = {
        'ie': 'UTF-8',
        'wd': 'test'
    }
    # Encode the parameters
    data = urllib.urlencode(values)
    # Assemble the full URL
    # req = urllib2.Request(url, data)
    url = url + '?' + data
    # Access the full URL
    # response = urllib2.urlopen(req)
    response = urllib2.urlopen(url)
    html = response.read()
    print html
Run again; the page is now accessed normally.
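One possible way to notice a redirect such as the HTTPS one described above is to compare the requested address with the one returned by the response's geturl() method. The helpers final_url and was_redirected below are hypothetical names, not part of urllib2, and the opener parameter is an assumption added only so the logic can be tried without a network connection:

```python
try:
    from urllib2 import urlopen            # Python 2
except ImportError:
    from urllib.request import urlopen     # Python 3


def final_url(url, opener=urlopen):
    """Open url and return the address that actually served the response."""
    response = opener(url)
    try:
        return response.geturl()
    finally:
        response.close()


def was_redirected(url, opener=urlopen):
    """Return True when the response came from a different address."""
    return final_url(url, opener) != url
```

With a working connection, was_redirected('https://www.baidu.com/s?ie=UTF-8&wd=test') would report whether the server sent the request elsewhere. The result depends on how the site is configured at the time of the request.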
Python web crawler (1): URL access and parameter settings