This article introduces the urllib module in Python and shows, through examples, how it can be used in place of PHP's curl to fetch data over HTTP. The analysis is as follows:
I. Problem:
Recently, a company project needed to fetch data at regular intervals from an API provided by the customer. The previous solution used PHP to collect the tasks and store them in a Redis queue, and then ran a PHP file as a resident process on Linux. That PHP file contains an infinite loop that checks the Redis queue: if a task is present it is executed, and if there is none the loop breaks.
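The polling loop described above can be sketched in Python roughly as follows. This is only an illustration of the control flow, not the original PHP code: an in-memory deque stands in for the Redis queue, and the names drain_queue and handle are hypothetical.

```python
from collections import deque


def drain_queue(queue, handle):
    """Run tasks from the queue until it is empty, then leave the loop."""
    processed = 0
    while True:
        if not queue:
            # No task is waiting: this is the "break" case from the text.
            break
        task = queue.popleft()
        handle(task)
        processed += 1
    return processed


# Assumption: a deque replaces the Redis list so the sketch runs anywhere.
tasks = deque(["tid=901", "tid=902"])
seen = []
count = drain_queue(tasks, seen.append)
print(count, seen)
```

A real worker would pop from Redis instead of a deque and would normally sleep and retry rather than exit when the queue is empty.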
II. Solution:
I have recently been learning Python. Python's urllib module may be faster than PHP's curl, and it is simple to use. Here is the code.
The code is as follows:
# -*- coding: utf-8 -*-
# Python 2 code
import sys
reload(sys)
sys.setdefaultencoding("utf-8")
import os
import json
from urllib import urlopen

# Fetch the API response and parse the JSON body
doc = urlopen("http://xxxx?webid=1&tid=901&cateid=101").read()
doc = json.loads(doc)
print doc
print doc.keys()
print doc["msg"]
print doc['data']
print doc['ret']
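The first request took about 3.0 s ([Finished in 3.0s]).
The second request took about 0.2 s ([Finished in 0.2s]).
The second request is much faster, which suggests that something along the path is cached. Note that urlopen itself does not cache HTTP responses, so the speedup more likely comes from DNS resolution or from caching on the server side.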
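For readers on Python 3, where urlopen lives in urllib.request, the fetch-and-time experiment might be sketched as below. The helper names fetch_json and timed_fetch are hypothetical, and the sketch assumes the API returns UTF-8 encoded JSON.

```python
import json
import time
from urllib.request import urlopen


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the body as JSON (assumes a UTF-8 response)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def timed_fetch(url):
    """Return (parsed_json, elapsed_seconds) for a single request."""
    start = time.perf_counter()
    doc = fetch_json(url)
    return doc, time.perf_counter() - start
```

Calling timed_fetch twice with the same URL and comparing the two elapsed values reproduces the comparison above without relying on the editor's "Finished in" message.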
A typical example of using urllib together with urllib2
The code is as follows:
# Python 2 code
import urllib2
import cookielib
import urllib

class Hi_login:
    def __init__(self):
        cookie = cookielib.CookieJar()
        self.cookie = urllib2.HTTPCookieProcessor(cookie)  # handler that stores cookies

    def login(self, user, pwd):
        url = 'http://passport.baidu.com/?login'
        postdata = urllib.urlencode({
            'mem_pass': 'on',
            'password': pwd,
            'submit': '',
            'tpl': 'sp',
            'tp_reg': 'sp',
            'u': 'http://hi.baidu.com',
            'username': user})
        # For a proxy: proxy_support = urllib2.ProxyHandler({"http": "http://ahad-haam:3128"}), then pass it to build_opener as well
        opener = urllib2.build_opener(self.cookie)  # use the cookie handler
        # headers is a dict; X-Forwarded-For or Referer can also be added here
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
        urllib2.install_opener(opener)
        # postdata is already URL-encoded above, so it is passed as-is
        request = urllib2.Request(url, postdata, headers=headers)
        urllib2.urlopen(request)

if __name__ == '__main__':
    pwd = '000000'
    user = 'xiafu'
    test = Hi_login()
    test.login(user, pwd)
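In Python 3, urllib2 was merged into urllib.request and cookielib was renamed http.cookiejar. A rough equivalent of the cookie and POST setup might look like the sketch below. The helper names build_cookie_opener and build_login_request are hypothetical, the form is trimmed to two fields for brevity, and the request is only built, not sent, so the sketch runs offline.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_cookie_opener():
    """Return an opener that keeps cookies between requests, plus its jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(url, fields, user_agent="Mozilla/5.0"):
    """Build a POST request; the form fields are URL-encoded exactly once."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=data,
                                  headers={"User-Agent": user_agent})


opener, jar = build_cookie_opener()
request = build_login_request("http://passport.baidu.com/?login",
                              {"username": "xiafu", "password": "000000"})
# opener.open(request) would send the POST and store any cookies in jar.
print(request.get_method(), request.full_url)
```

Passing a data argument is what turns the request into a POST. In Python 3 the body must be bytes, which is why the encoded form is converted with encode().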
To access a page that requires authentication, such as the Nagios monitoring page, add an HTTP basic authentication handler to the opener.
The code is as follows:
# Python 2 code
import urllib2

password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
url = "http://202.1.x.y/nagios"
# realm=None means the credentials apply to any realm at this URL
password_mgr.add_password(None, url, user='abc', passwd='xxxxxx')
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(handler)
urllib2.install_opener(opener)
f = urllib2.urlopen(url)
print f.code
The status code 200 is returned, which means the request succeeded.
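The same basic authentication setup in Python 3 might be sketched as follows, using urllib.request in place of urllib2. The helper name build_basic_auth_opener is hypothetical, and the URL and credentials are the placeholders from the example above, so nothing is actually requested here.

```python
import urllib.request


def build_basic_auth_opener(url, user, password):
    """Return an opener that answers HTTP basic auth challenges for url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None registers the credentials for any realm under this URL.
    password_mgr.add_password(None, url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler), password_mgr


url = "http://202.1.x.y/nagios"  # placeholder address from the article
opener, password_mgr = build_basic_auth_opener(url, "abc", "xxxxxx")
# opener.open(url).status would give the HTTP status code of the response.
print(password_mgr.find_user_password(None, url))
```

Returning the opener instead of calling install_opener keeps the credentials local to the code that needs them rather than changing the behavior of every urlopen call in the process.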
I hope this article will help you with Python programming.