Urllib Study II

Last Update:2017-12-19 Source: Internet

Author: User

Tags response code urlencode


Encoding and decoding:

Python2 usage:

urllib.urlencode()   encoding
urlparse.parse_qs()  decoding

Python3 usage:

urllib.parse.urlencode()  encoding
urllib.parse.parse_qs()   decoding

Role:
    1) Convert dictionary data into a URL-encoded string

    2) Uses

        a) Encode the query parameters of a URL

        b) Encode the form data of a POST request
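The two uses listed above can be sketched with the Python 3 names as follows. This is a minimal illustration, not code from the original article: the parameter dictionary and the example.com URL are made up for the purpose.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical parameters, chosen only for illustration.
params = {"name": "urllib study", "page": 2}

# Use a): append the encoded dictionary to a URL as its query string.
query = urlencode(params)
url = "http://example.com/search?" + query

# Use b): a POST body must be bytes in Python 3, so encode the string.
form_body = urlencode(params).encode("ascii")

# Decoding reverses the process; every value comes back as a list of strings.
decoded = parse_qs(query)

print(url)
print(form_body)
print(decoded)
```

Note that parse_qs does not restore the original types: the integer 2 comes back as the string '2' inside a list.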


Example:

python2.x

import urllib
import urlparse

def urlencode():
    # The score value is a placeholder
    params = {'score': 100, 'name': 'Reptile base', 'comment': 'Very good'}
    qs = urllib.urlencode(params)    # encode
    print(qs)
    unqs = urlparse.parse_qs(qs)     # decode
    print(unqs)

if __name__ == '__main__':
    urlencode()



python3.x

import urllib
import urllib.parse

def urlencode():
    # The score value is a placeholder
    params = {'score': 100, 'name': 'Reptile base', 'comment': 'Very good'}
    qs = urllib.parse.urlencode(params)    # encode
    print(qs)
    unqs = urllib.parse.parse_qs(qs)       # decode
    print(unqs)

if __name__ == '__main__':
    urlencode()

python2.x

Can urllib2 completely replace urllib?

No. urllib has a very important function, urllib.urlencode, that urllib2 does not have, so urllib and urllib2 are generally used together.

1) urllib2.urlopen()

urllib has this function too; the main difference is that the urllib2 version adds a timeout parameter. Its parameters are:

a. url

b. data

c. timeout: the timeout in seconds. For example, with a timeout of 3 seconds, if the remote server cannot be reached within 3 seconds, an error is raised immediately.

3) Error handling: HTTPError, e (as in "except urllib2.HTTPError, e:")
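The timeout parameter and the error handling mentioned above might be combined as in the sketch below. It uses the Python 3 names (urllib2 was split into urllib.request and urllib.error); the helper name fetch and its return convention are assumptions made for this illustration.

```python
import socket
import urllib.error
import urllib.request

def fetch(url, timeout=3.0):
    """Return (status, first_100_bytes); status is None if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read(100)
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 404 or 500.
        return e.code, None
    except (urllib.error.URLError, socket.timeout):
        # No usable response: connection refused, DNS failure, timeout,
        # or a URL scheme that no handler knows how to open.
        return None, None

# An unknown scheme fails without touching the network.
print(fetch("foo://example"))
```

HTTPError is a subclass of URLError, so it has to be caught first; otherwise the more general clause would swallow it.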


Two important concepts in urllib2: openers and handlers

1. Openers:
When you fetch a URL you use an opener (an instance of urllib2.OpenerDirector).
Normally we use the default opener, via urlopen.
But you can also create custom openers.

2. Handlers:
Openers use handlers; all the "heavy" work is done by the handlers.
Each handler knows how to open URLs for a specific protocol, or how to handle some aspect of opening a URL, such as HTTP redirection or HTTP cookies.

Example (Python 3):
import http.cookiejar
import urllib.request

def cookies():
    cookiejar = http.cookiejar.CookieJar()
    handler = urllib.request.HTTPCookieProcessor(cookiejar=cookiejar)
    # HTTPHandler(debuglevel=1) prints debugging information
    opener = urllib.request.build_opener(handler, urllib.request.HTTPHandler(debuglevel=1))
    s = opener.open("http://www.douban.com")
    print(s.read(100))
    s.close()
    print('=' * 80)
    print(cookiejar._cookies)
    print("=" * 80)
    s = opener.open("http://www.douban.com")
    s.close()

cookies()



urllib2.Request


Example:
# -*- coding: utf-8 -*-
import urllib2

def request():
    # Custom HTTP headers; custom headers in HTTP usually begin with "X-"
    headers = {'User-Agent': 'Mozilla/5.0', 'x-my-header': 'My Value'}
    # Create a request
    req = urllib2.Request('http://blog.kamidox.com', headers=headers)
    # Open the request; urlopen accepts either a URL string or a Request object
    s = urllib2.urlopen(req)
    print(s.read(100))
    s.close()

if __name__ == '__main__':
    request()

Result of the run: the request is sent with the custom HTTP headers.
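A Request object can also be inspected without sending anything, which is a convenient way to check custom headers. The sketch below uses the Python 3 class urllib.request.Request; the URL and header values are placeholders chosen for illustration.

```python
import urllib.request

# Hypothetical URL and header values, for illustration only.
headers = {"User-Agent": "Mozilla/5.0", "X-My-Header": "My Value"}
req = urllib.request.Request("http://example.com/", headers=headers)

# Request stores header names capitalized, e.g. "X-my-header".
print(req.header_items())
print(req.get_header("X-my-header"))

# Without data the method is GET; attaching a bytes body makes it POST.
print(req.get_method())
post_req = urllib.request.Request("http://example.com/", data=b"a=1", headers=headers)
print(post_req.get_method())
```

Because of the capitalization, get_header has to be called with "X-my-header" rather than "X-My-Header".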






urllib2.build_opener

It lets us customize HTTP behavior.

1) BaseHandler and its subclasses

BaseHandler is the parent class of all handlers, including:

a. HTTPHandler (handles HTTP requests)

b. HTTPSHandler (handles requests over secure connections)

c. HTTPCookieProcessor (handles cookies)

2) build_opener

a. Takes a list of handlers and chains them together like a pipeline. As the HTTP request and its response flow through the chain, each handler deals with its own concern.

b. Returns an OpenerDirector. Its most important method is open, which opens the remote URL and processes the request.

3) Handler chain created by default

The handler chain is really just a list of handlers. When the default opener is created, the following handlers are created for you:

a. ProxyHandler (if a proxy is set)

b. UnknownHandler (called when the protocol is not recognized)

c. HTTPHandler (handles HTTP requests)

d. HTTPDefaultErrorHandler (handles error responses)

e. HTTPRedirectHandler (handles redirects, such as HTTP response codes 301 and 302)

f. FTPHandler (supports the FTP protocol)

g. FileHandler (supports opening local files)

h. HTTPErrorProcessor (processes HTTP error responses)

i. HTTPSHandler (if the ssl module is available)
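One way to see the default chain is to build an opener with no arguments and list what it holds. The sketch below uses Python 3's urllib.request and reads the opener's handlers attribute, which is an implementation detail of CPython's OpenerDirector rather than documented API, so treat it as a debugging aid only.

```python
import urllib.request

# build_opener() with no arguments returns an opener holding the default chain.
opener = urllib.request.build_opener()

# OpenerDirector keeps its registered handlers in the `handlers` list.
handler_names = [type(h).__name__ for h in opener.handlers]
print(handler_names)
```

The exact list depends on the Python version and the environment; for instance, ProxyHandler only shows up when a proxy is actually configured.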


Example:
# -*- coding: utf-8 -*-
import urllib2
import urllib

def request_post_debug():
    # POST
    data = {'username': 'kamidox', 'password': 'xxxxxxxx'}  # request body
    # headers = {'User-Agent': 'Mozilla/5.0', 'Content-Type': 'plain/text'}
    headers = {'User-Agent': 'Mozilla/5.0'}  # custom headers
    # Create the request that will be sent to Douban
    req = urllib2.Request('http://www.douban.com', data=urllib.urlencode(data), headers=headers)
    # Create an opener. With no arguments you get the default handlers; a handler
    # you pass in replaces the default one of the same kind, or is added if there is none.
    opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
    s = opener.open(req)  # open the request with this opener
    print(s.read(100))    # print the first 100 bytes
    s.close()

if __name__ == '__main__':
    request_post_debug()


Example 2:

Once I have created an opener, how can later functions use it? How do I save the opener?

Save the opener as the default

1. urllib2.install_opener (saves the opener we created into the urllib2 library, so that later calls to urllib2.urlopen use the installed opener directly)

2. Example: install_debug_handler

Example: request_post_debug
1. Print HTTP debugging information
2. POST data
# -*- coding: utf-8 -*-
import urllib
import urllib2

def request():
    # Custom HTTP headers
    headers = {'User-Agent': 'Mozilla/5.0', 'x-my-header': 'My Value'}
    req = urllib2.Request('http://blog.kamidox.com', headers=headers)
    s = urllib2.urlopen(req)
    print(s.read(100))
    print(req.headers)
    s.close()

def request_post_debug():
    # POST
    data = {'username': 'kamidox', 'password': 'xxxxxxxx'}
    # headers = {'User-Agent': 'Mozilla/5.0', 'Content-Type': 'plain/text'}
    headers = {'User-Agent': 'Mozilla/5.0'}
    req = urllib2.Request('http://www.douban.com', data=urllib.urlencode(data), headers=headers)
    opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1))
    s = opener.open(req)
    print(s.read(100))
    s.close()

def install_debug_handler():
    # These two handlers cover both the HTTP and the HTTPS protocol
    opener = urllib2.build_opener(urllib2.HTTPHandler(debuglevel=1),
                                  urllib2.HTTPSHandler(debuglevel=1))
    # Install the opener as the default, so that urlopen uses it from now on
    urllib2.install_opener(opener)

if __name__ == '__main__':
    install_debug_handler()
    request()
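The effect of install_opener can be checked without any network access. The sketch below uses Python 3's urllib.request; LoggingHandler and install_logging_opener are hypothetical names invented for this illustration, and a data: URL stands in for a real site.

```python
import urllib.request

class LoggingHandler(urllib.request.BaseHandler):
    """Hypothetical handler that records every URL passing through the chain."""

    handler_order = 100  # lower than the default of 500, so it runs early

    def __init__(self):
        self.seen = []

    def default_open(self, req):
        self.seen.append(req.full_url)
        return None  # returning None lets the next handler open the URL

def install_logging_opener():
    logger = LoggingHandler()
    opener = urllib.request.build_opener(logger)
    urllib.request.install_opener(opener)  # urlopen() now uses this opener
    return logger

logger = install_logging_opener()
# A data: URL keeps the demonstration off the network.
with urllib.request.urlopen("data:text/plain,hello") as resp:
    body = resp.read()
print(body, logger.seen)
```

The plain urlopen call never mentions the custom opener, yet the handler sees the request, which is exactly what installing an opener as the default is for.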



Cookies

1) cookielib.CookieJar

Provides an interface for parsing and storing cookies. Cookies have a lifetime and many attributes, and this class takes care of handling them.

2) HTTPCookieProcessor

Handles cookies automatically. Its parent class is also BaseHandler, so it can be added to the handler chain to process cookie information.



Example: handle_cookie


# -*- coding: utf-8 -*-
import cookielib
import urllib2

def handle_cookie():  # define a function that handles cookies
    cookiejar = cookielib.CookieJar()  # create a CookieJar object
    # Create an HTTPCookieProcessor object and pass the CookieJar in
    handler = urllib2.HTTPCookieProcessor(cookiejar=cookiejar)

    # Also create an HTTPHandler that prints debugging information
    opener = urllib2.build_opener(handler, urllib2.HTTPHandler(debuglevel=1))
    s = opener.open('http://www.douban.com')
    print(s.read(100))
    s.close()

if __name__ == '__main__':
    handle_cookie()

Result of the run: the response contains a Set-Cookie header, with a cookie named bid.



Once the response to this request has been received, our CookieJar contains the cookies returned by the server, and we can print them. The code is as follows:

# -*- coding: utf-8 -*-
import cookielib
import urllib2

def handle_cookie():  # define a function that handles cookies
    cookiejar = cookielib.CookieJar()  # create a CookieJar object
    # Create an HTTPCookieProcessor object and pass the CookieJar in
    handler = urllib2.HTTPCookieProcessor(cookiejar=cookiejar)
    # Also create an HTTPHandler that prints debugging information
    opener = urllib2.build_opener(handler, urllib2.HTTPHandler(debuglevel=1))
    s = opener.open('http://www.douban.com')
    print(s.read(100))
    s.close()
    print('=' * 80)
    print(cookiejar._cookies)  # this attribute holds all cookies set by the server
    print('=' * 80)

if __name__ == '__main__':
    handle_cookie()

Running it prints the cookies.
The opener now carries this cookie information, and it sends the cookies back to the server the next time a request is made.
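That round trip can be observed locally with a small test server. The following is a sketch using the Python 3 modules (http.cookiejar and urllib.request); the CookieEchoHandler server, the function name cookie_round_trip, and the cookie value abc123 are all invented for this illustration and are not part of the original example.

```python
import http.cookiejar
import http.server
import threading
import urllib.request

class CookieEchoHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical test server: sets a cookie and echoes the Cookie header it got."""

    def do_GET(self):
        body = self.headers.get("Cookie", "").encode("utf-8")
        self.send_response(200)
        self.send_header("Set-Cookie", "bid=abc123; Path=/")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet

def cookie_round_trip():
    server = http.server.HTTPServer(("127.0.0.1", 0), CookieEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_port

    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # ignore any system proxy for the local server
        urllib.request.HTTPCookieProcessor(jar),
    )
    try:
        with opener.open(url, timeout=3) as first:
            first_seen = first.read().decode("utf-8")
        with opener.open(url, timeout=3) as second:
            second_seen = second.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()
    return first_seen, second_seen, [c.name for c in jar]

first_seen, second_seen, cookie_names = cookie_round_trip()
print(repr(first_seen), repr(second_seen), cookie_names)
```

The first request carries no Cookie header; the second one does, because HTTPCookieProcessor stored the cookie from the first response in the jar and attached it automatically. Iterating over the jar is the public way to list cookies, as an alternative to reading the private _cookies attribute.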
