Urllib Study II

Source: Internet
Author: User
Tags: response code, urlencode

Encoding and decoding:

Python 2 usage:
urllib.urlencode() encodes
urlparse.parse_qs() decodes


Python 3 usage:
urllib.parse.urlencode() encodes
urllib.parse.parse_qs() decodes

Role:
1) Convert dictionary data into a URL-encoded string

2) Uses

a) Encode URL parameters

b) Encode form data for a POST request


Example:
Python 2.x

import urllib
import urlparse

def urlencode():
    # The score value was lost in the source; 100 is a placeholder
    params = {'score': 100, 'name': 'Crawler basics', 'comment': 'Very good'}
    qs = urllib.urlencode(params)    # encode
    print(qs)
    unqs = urlparse.parse_qs(qs)     # decode
    print(unqs)

if __name__ == '__main__':
    urlencode()


Python 3.x

import urllib.parse

def urlencode():
    # The score value was lost in the source; 100 is a placeholder
    params = {'score': 100, 'name': 'Crawler basics', 'comment': 'Very good'}
    qs = urllib.parse.urlencode(params)    # encode
    print(qs)
    unqs = urllib.parse.parse_qs(qs)       # decode
    print(unqs)

if __name__ == '__main__':
    urlencode()
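As a quick check of the round trip in Python 3, here is a minimal sketch that encodes a dictionary and decodes it again. The parameter values are illustrative only. One thing worth noticing is that parse_qs returns every value as a list of strings, so the decoded dictionary is not identical to the input.

```python
import urllib.parse


def round_trip(params):
    """Encode a dict as a query string, then decode it again."""
    qs = urllib.parse.urlencode(params)
    return qs, urllib.parse.parse_qs(qs)


# Illustrative values only.
qs, decoded = round_trip({'name': 'Very good', 'score': 100})
print(qs)        # name=Very+good&score=100
print(decoded)   # {'name': ['Very good'], 'score': ['100']}
```

Spaces become "+" and the integer comes back as the string '100', which matters if the decoded values are compared with the originals.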




Python 2.x

Can urllib2 completely replace urllib?

No. urllib has a very important function, urllib.urlencode, that urllib2 does not have, so urllib and urllib2 are usually used together.

1) urllib2.urlopen()

urllib has this function too; the only difference is that urllib2 adds a timeout parameter.

a. url

b. data

c. timeout: the timeout in seconds. For example, with a timeout of 3 seconds, if the remote server cannot be reached within 3 seconds, an error is raised immediately.

3) Error handling: HTTPError, e
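The timeout and error handling described above can be sketched in Python 3, where urllib2 became urllib.request and urllib.error. This is only a sketch under some assumptions: fetch, _DemoHandler and the local test server are hypothetical names added so the example runs without internet access, and an opener with an empty proxy map is used instead of plain urlopen so the loopback request is never sent through a system proxy. urlopen(url, timeout=3) accepts the same timeout argument.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Opener with an empty proxy map, so the loopback demo bypasses any system proxy.
_direct = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(url, timeout=3):
    """Return (status, detail) instead of raising on HTTP or connection errors."""
    try:
        with _direct.open(url, timeout=timeout) as resp:
            return resp.status, resp.read(100)
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 404.
        code, reason = e.code, e.reason
        e.close()
        return code, reason
    except urllib.error.URLError as e:
        # No usable answer: DNS failure, refused connection, or connect timeout.
        return None, e.reason


class _DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server: /ok returns 200, anything else 404."""

    def do_GET(self):
        if self.path == '/ok':
            body = b'hello'
            self.send_response(200)
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # keep the demo quiet


def demo():
    server = HTTPServer(('127.0.0.1', 0), _DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = 'http://127.0.0.1:%d' % server.server_port
    try:
        return fetch(base + '/ok'), fetch(base + '/missing')
    finally:
        server.shutdown()
        server.server_close()


ok, missing = demo()
print(ok)
print(missing)
```

HTTPError is a subclass of URLError, so it has to be caught first; otherwise the more general branch would swallow it.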


Two important concepts in urllib2: openers and handlers

1. Openers:
When you fetch a URL you use an opener (an instance of urllib2.OpenerDirector).
Normally we use the default opener, through urlopen.
But you can also create custom openers.

2. Handlers:
Openers use handlers; all the "heavy" work is done by the handlers.
Each handler knows how to open URLs for a specific protocol, or how to deal with some aspect of opening a URL, such as HTTP redirection or HTTP cookies.
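To make the opener/handler relationship concrete, here is a small Python 3 sketch of a custom handler. ExtraHeaderHandler and the X-Demo header are hypothetical names invented for this illustration; the hook is called directly so nothing is sent over the network.

```python
import urllib.request


class ExtraHeaderHandler(urllib.request.BaseHandler):
    """Hypothetical handler that stamps one extra header on every HTTP request."""

    def http_request(self, req):
        # Methods named <protocol>_request see each request before it is sent.
        req.add_header('X-Demo', 'handled')
        return req


opener = urllib.request.build_opener(ExtraHeaderHandler())

# The custom handler now sits in the chain next to the default handlers.
print([type(h).__name__ for h in opener.handlers])

# Calling the hook directly shows its effect without touching the network.
req = ExtraHeaderHandler().http_request(urllib.request.Request('http://example.com/'))
print(req.get_header('X-demo'))   # handled
```

The opener only dispatches; the work of changing the request is done inside the handler, which is the division of labor described above.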

Example (Python 3, using urllib.request and http.cookiejar):
import http.cookiejar
import urllib.request

def cookies():
    cookiejar = http.cookiejar.CookieJar()
    handler = urllib.request.HTTPCookieProcessor(cookiejar=cookiejar)
    # HTTPHandler(debuglevel=1) prints debugging information
    opener = urllib.request.build_opener(handler, urllib.request.HTTPHandler(debuglevel=1))
    s = opener.open("http://www.douban.com")
    print(s.read(100))
    s.close()
    print('=' * 80)
    print(cookiejar._cookies)
    print('=' * 80)
    # The second request sends back the cookies stored by the first
    s = opener.open("http://www.douban.com")
    s.close()

cookies()



urllib2.Request


Example:
# -*- coding: utf-8 -*-
import urllib2

def request():
    # Custom HTTP headers; custom header names usually start with "X-"
    headers = {'User-Agent': 'Mozilla/5.0', 'x-my-header': 'My Value'}
    # Create a request
    req = urllib2.Request('http://blog.kamidox.com', headers=headers)
    # urlopen accepts either a URL string or a Request object
    s = urllib2.urlopen(req)
    print(s.read(100))
    s.close()

if __name__ == '__main__':
    request()

The run output shows the custom HTTP headers.
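In Python 3 the same request object lives in urllib.request. As a hedged sketch, the headers can be inspected on the Request itself without sending anything:

```python
import urllib.request

headers = {'User-Agent': 'Mozilla/5.0', 'X-My-Header': 'My Value'}
req = urllib.request.Request('http://blog.kamidox.com', headers=headers)

# Request stores header names capitalized, so look them up that way.
print(req.header_items())
print(req.get_header('X-my-header'))   # My Value
print(req.get_method())                # GET, because no data was attached
```

The capitalization is only how the Request object keys its headers internally; HTTP header names are case-insensitive on the wire.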






urllib2.build_opener

Lets us customize HTTP behavior.

1) BaseHandler and its subclasses

BaseHandler is the parent class of all the HTTP handlers.

a. HTTPHandler (handles HTTP requests)

b. HTTPSHandler (handles requests over secure connections)

c. HTTPCookieProcessor (handles cookies)

2) build_opener

a. Takes a list of handlers and chains them together like a pipeline. When the HTTP response comes back, it flows through the chain so that each handler deals with its own concern.

b. Returns an OpenerDirector. Its most important method is open, which opens the remote URL and processes the request.

3) Handler chain created by default

The handler chain is really a list of handlers. When build_opener is called, it creates the following handlers by default:

a. ProxyHandler (if a proxy is set)

b. UnknownHandler (called when the protocol is not recognized)

c. HTTPHandler (handles HTTP requests)

d. HTTPDefaultErrorHandler (handles error responses)

e. HTTPRedirectHandler (handles redirects, such as HTTP 301 and 302 responses)

f. FTPHandler (supports the FTP protocol)

g. FileHandler (supports opening local files)

h. HTTPErrorProcessor (processes HTTP errors)

i. HTTPSHandler (if the SSL module is installed)
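The default chain can be inspected directly. The sketch below uses the Python 3 module urllib.request; the exact list printed depends on the Python version and build, so no fixed output is claimed for it. It also shows that passing your own HTTPHandler replaces the default one instead of adding a second.

```python
import urllib.request

# Names of the handlers installed by default.
default_names = [type(h).__name__ for h in urllib.request.build_opener().handlers]
print(default_names)

# Passing our own HTTPHandler replaces the default one rather than adding a second.
debug_handler = urllib.request.HTTPHandler(debuglevel=1)
opener = urllib.request.build_opener(debug_handler)
http_handlers = [h for h in opener.handlers if type(h) is urllib.request.HTTPHandler]
print(len(http_handlers), http_handlers[0] is debug_handler)   # 1 True
```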


Example:
# -*- coding: utf-8 -*-
import urllib2
import urllib

def request_post_debug():
    # POST
    data = {'username': 'kamidox', 'password': 'xxxxxxxx'}  # request body
    # headers = {'User-Agent': 'Mozilla/5.0', 'Content-Type': 'plain/text'}
    headers = {'User-Agent': 'Mozilla/5.0'}  # custom headers
    # Create a request to be sent to Douban
    req = urllib2.Request('http://www.douban.com', data=urllib.urlencode(data), headers=headers)
    # Create an opener. With no arguments it uses the default handlers.
    # A handler passed in replaces the default of the same type, or is added if there is none.
    opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
    s = opener.open(req)  # open the request with this opener
    print(s.read(100))    # print the first 100 bytes
    s.close()

if __name__ == '__main__':
    request_post_debug()
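For readers on Python 3, a minimal sketch of the same POST request follows. The main difference is that the encoded form data must be converted to bytes before it is attached. Nothing is sent here; the request is only built and inspected.

```python
import urllib.parse
import urllib.request

data = {'username': 'kamidox', 'password': 'xxxxxxxx'}
body = urllib.parse.urlencode(data).encode('utf-8')   # str -> bytes

req = urllib.request.Request('http://www.douban.com', data=body,
                             headers={'User-Agent': 'Mozilla/5.0'})

# Attaching data switches the request method from GET to POST.
print(req.get_method())   # POST
print(req.data)           # b'username=kamidox&password=xxxxxxxx'
```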


Example 2:

Once an opener has been created, how can later functions use it? How is the opener saved?

Save the opener as the default

1. urllib2.install_opener (stores the opener we built in the urllib2 library; later calls to urllib2.urlopen use the installed opener directly)

2. Example: install_debug_handler

Example: request_post_debug
1. Print HTTP debugging information
2. POST data
# -*- coding: utf-8 -*-
import urllib2
import urllib

def request():
    # Custom HTTP headers
    headers = {'User-Agent': 'Mozilla/5.0', 'x-my-header': 'My Value'}
    req = urllib2.Request('http://blog.kamidox.com', headers=headers)
    s = urllib2.urlopen(req)
    print(s.read(100))
    print(req.headers)
    s.close()

def request_post_debug():
    # POST
    data = {'username': 'kamidox', 'password': 'xxxxxxxx'}
    # headers = {'User-Agent': 'Mozilla/5.0', 'Content-Type': 'plain/text'}
    headers = {'User-Agent': 'Mozilla/5.0'}
    req = urllib2.Request('http://www.douban.com', data=urllib.urlencode(data), headers=headers)
    opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
    s = opener.open(req)
    print(s.read(100))
    s.close()

def install_debug_handler():
    # This opener handles both the HTTP and HTTPS protocols
    opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1),
                                  urllib2.HTTPSHandler(debuglevel=1))

    # Install the opener as the default; urlopen will use it from now on
    urllib2.install_opener(opener)

if __name__ == '__main__':
    install_debug_handler()
    request()
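The effect of install_opener can be demonstrated without a network connection. In this Python 3 sketch, DemoSchemeHandler and the demo:// scheme are made up for illustration; the point is only that plain urlopen() starts using whatever opener was installed.

```python
import io
import urllib.request


class DemoSchemeHandler(urllib.request.BaseHandler):
    """Hypothetical handler for a made-up demo:// scheme, so no network is needed."""

    def demo_open(self, req):
        # Methods named <scheme>_open are asked to open URLs with that scheme.
        return io.BytesIO(b'opened ' + req.full_url.encode('ascii'))


opener = urllib.request.build_opener(DemoSchemeHandler())

# After this call, plain urlopen() goes through our opener for the whole process.
urllib.request.install_opener(opener)

s = urllib.request.urlopen('demo://example')
result = s.read()
s.close()
print(result)   # b'opened demo://example'
```

Because the installed opener is global to the process, it is usually installed once at startup, as install_debug_handler() does in the example above.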



Cookies

1) cookielib.CookieJar

Provides an interface for parsing and storing cookies. Cookies have a life cycle and many parameters, and this class takes care of handling them.

2) HTTPCookieProcessor

Handles cookies automatically. Its parent class is also BaseHandler, so it can be chained with other handlers to process cookie information.



Example: handle_cookie


# -*- coding: utf-8 -*-
import urllib2
import cookielib

def handle_cookie():  # define a function that handles cookies
    cookiejar = cookielib.CookieJar()  # first create a CookieJar object
    # Create an HTTPCookieProcessor and pass the CookieJar in
    handler = urllib2.HTTPCookieProcessor(cookiejar=cookiejar)

    # Add a second handler that prints debugging information
    opener = urllib2.build_opener(handler, urllib2.HTTPHandler(debuglevel=1))
    s = opener.open('http://www.douban.com')
    print(s.read(100))
    s.close()

if __name__ == '__main__':
    handle_cookie()

In the run output, the response contains a Set-Cookie header that sets a cookie named bid.



After the response is received, our CookieJar holds the cookies returned by the server, and we can print them. The code is as follows:

# -*- coding: utf-8 -*-
import urllib2
import cookielib

def handle_cookie():  # define a function that handles cookies
    cookiejar = cookielib.CookieJar()  # first create a CookieJar object
    # Create an HTTPCookieProcessor and pass the CookieJar in
    handler = urllib2.HTTPCookieProcessor(cookiejar=cookiejar)
    # Add a second handler that prints debugging information
    opener = urllib2.build_opener(handler, urllib2.HTTPHandler(debuglevel=1))
    s = opener.open('http://www.douban.com')
    print(s.read(100))
    s.close()
    print('=' * 80)
    print(cookiejar._cookies)  # this attribute holds all the cookies sent by the server
    print('=' * 80)

if __name__ == '__main__':
    handle_cookie()

The run output shows the cookies printed between the two separator lines.
The opener keeps these cookies, and the next time it sends a request to the same server it sends the cookies back.
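What HTTPCookieProcessor does on each exchange can be reproduced by hand in Python 3, where cookielib became http.cookiejar. In this sketch FakeResponse, the example.com URL and the cookie value abc123 are assumptions made up for illustration; the two CookieJar methods it calls, extract_cookies and add_cookie_header, are what the processor calls on a response and on a request respectively.

```python
import email.message
import http.cookiejar
import urllib.request


class FakeResponse:
    """Hypothetical stand-in for a server response that carries a Set-Cookie header."""

    def __init__(self, set_cookie):
        self._headers = email.message.Message()
        self._headers['Set-Cookie'] = set_cookie

    def info(self):
        return self._headers


jar = http.cookiejar.CookieJar()

# First exchange: the jar stores the cookie found in the response.
first = urllib.request.Request('http://www.example.com/')
jar.extract_cookies(FakeResponse('bid=abc123; Path=/'), first)
names = [c.name for c in jar]
print(names)                        # ['bid']

# Next request to the same site: the jar adds the Cookie header for us.
second = urllib.request.Request('http://www.example.com/')
jar.add_cookie_header(second)
print(second.get_header('Cookie'))  # bid=abc123
```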
