17.3.12--urllib2 Module

Source: Internet
Author: User
Tags: http authentication

1---urllib2 is a powerful Python module for accessing network resources; it works much like the urllib module

The urllib2 module is part of the Python 2 standard library, so it does not need to be downloaded separately. It can be thought of as an upgraded, more complex version of the urllib module.

urllib2 is the right choice when, for example:

                 access to a network resource requires HTTP authentication,

                 cookie information is required,

                 Web resources must be accessed the way a normal browser would access them.

2---urllib2 module introduction

1) Setting a timeout:

import urllib2
test = urllib2.urlopen('http://www.iplaypy.com/', timeout=15)
# Two arguments: the URL and the timeout in seconds; this test uses 15
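A timeout is only useful if the caller handles the failure. The sketch below wraps the call above in a hypothetical helper, fetch_with_timeout, that returns None when the request fails or times out. The try/except import is an assumption added so the same code also runs on Python 3, where the urllib2 API lives in urllib.request.

```python
import socket

try:                      # Python 2, as used in this article
    import urllib2
except ImportError:       # Python 3 fallback (an assumption for portability)
    import urllib.request as urllib2


def fetch_with_timeout(url, timeout=15):
    """Return the response body, or None if the request fails or times out.

    fetch_with_timeout is a hypothetical helper, not part of urllib2.
    """
    try:
        response = urllib2.urlopen(url, timeout=timeout)
        try:
            return response.read()
        finally:
            response.close()
    except (urllib2.URLError, socket.timeout) as exc:
        # Connection failures arrive as URLError; a stalled read raises socket.timeout.
        print("request failed: %s" % exc)
        return None
```

A call such as fetch_with_timeout('http://www.iplaypy.com/', timeout=15) then either returns the page body or None, instead of raising.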

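2) Adding header information to a request

header = {"User-Agent": "mozilla-firefox24.0"}  # a dict
request = urllib2.Request(url, headers=header)
urllib2.urlopen(request)

Adding headers this way lets a request mimic browser behavior, which is very practical for network resources that block crawlers. The headers must be attached to a Request object: the second positional argument of urlopen() is the POST data, not the headers.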
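As a minimal sketch of attaching headers, the example below builds a Request object and inspects it without sending anything. The helper name build_browser_request is hypothetical, and the Python 3 import fallback is an assumption added for portability.

```python
try:                      # Python 2, as used in this article
    import urllib2
except ImportError:       # Python 3 fallback (an assumption for portability)
    import urllib.request as urllib2


def build_browser_request(url, user_agent="mozilla-firefox24.0"):
    """Return a Request carrying a browser-like User-Agent; nothing is sent yet."""
    return urllib2.Request(url, headers={"User-Agent": user_agent})


request = build_browser_request("http://www.iplaypy.com/")

# Request stores header names via str.capitalize(), hence "User-agent" here.
print(request.get_header("User-agent"))

# To actually send it: response = urllib2.urlopen(request, timeout=15)
```

Because urlopen() accepts either a URL string or a Request object, the same request can be reused with a timeout or passed to a custom opener.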

3) Getting the HTTP status code of a page with urllib2

import urllib2
test = urllib2.urlopen("http://baidu.com/")
test.code  # test.getcode() returns the same value

This gives the status code of Baidu's page; 200 means the request succeeded and the Web content was retrieved.
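Reading the code from the response only covers successful requests: for error responses such as 404 or 500, urlopen() raises urllib2.HTTPError, and the status is available on the exception instead. A minimal sketch, assuming a hypothetical helper named get_status (the Python 3 import fallback is also an assumption):

```python
try:                      # Python 2, as used in this article
    import urllib2
except ImportError:       # Python 3 fallback (an assumption for portability)
    import urllib.request as urllib2


def get_status(url, timeout=15):
    """Return the HTTP status code for url, including error statuses.

    get_status is a hypothetical helper. Network-level failures (no route,
    refused connection) still raise urllib2.URLError.
    """
    try:
        response = urllib2.urlopen(url, timeout=timeout)
    except urllib2.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 500.
        return exc.code
    try:
        return response.getcode()
    finally:
        response.close()
```

Redirects are followed automatically by the default opener, so the value returned is the status of the final response.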

4) Using urllib2 to process cookies

import urllib2
import cookielib

cookie = cookielib.CookieJar()  # note that the C and J are uppercase
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
response = opener.open('http://www.baidu.com')
for item in cookie:
    if item.name == "Some--cookie_item_name":
        print(item.value)
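The cookie steps above can be packaged into two small functions. This is a sketch under stated assumptions: make_cookie_opener and find_cookie are hypothetical names, and the import fallback (http.cookiejar in place of cookielib) is added only so the code also runs on Python 3.

```python
try:                      # Python 2, as used in this article
    import urllib2
    import cookielib
except ImportError:       # Python 3 fallback (an assumption for portability)
    import urllib.request as urllib2
    import http.cookiejar as cookielib


def make_cookie_opener():
    """Return (opener, jar); cookies set by responses are stored in jar."""
    jar = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
    return opener, jar


def find_cookie(jar, name):
    """Return the value of the first cookie called name, or None if absent."""
    for item in jar:
        if item.name == name:
            return item.value
    return None


opener, jar = make_cookie_opener()
# After opener.open('http://www.baidu.com'), look a cookie up by name:
# value = find_cookie(jar, "Some--cookie_item_name")
```

Keeping the jar separate from the opener means the same cookies are sent back automatically on every later opener.open() call.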

5) urlopen()----the entry function: it obtains the OpenerDirector object and calls opener.open()

The default OpenerDirector object is stored in the module-level variable _opener and is created only once (singleton pattern)

build_opener()----creates an OpenerDirector object from the default handlers plus any handlers passed in

install_opener()---saves an OpenerDirector object in the variable _opener so that it is used as the default opener

class OpenerDirector

class Request---an information object that stores the URL-related parameters, including headers, data, and proxy, and is used to pass them along with the URL

class HTTPHandler---inheritance chain: BaseHandler-->AbstractHTTPHandler-->HTTPHandler

It calls httplib.HTTPConnection to carry out the HTTP processing
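How these pieces fit together can be sketched as follows. The helper name install_custom_opener is hypothetical, and the Python 3 import fallback is an assumption; build_opener(), install_opener(), and the addheaders attribute are part of the module's public API.

```python
try:                      # Python 2, as used in this article
    import urllib2
except ImportError:       # Python 3 fallback (an assumption for portability)
    import urllib.request as urllib2


def install_custom_opener(user_agent="mozilla-firefox24.0"):
    """Build an OpenerDirector, set its default headers, and install it."""
    opener = urllib2.build_opener()                    # default handler chain
    opener.addheaders = [("User-Agent", user_agent)]   # sent with every request
    urllib2.install_opener(opener)                     # urlopen() now uses it
    return opener


opener = install_custom_opener()

# The inheritance chain described above can be checked directly.
print(issubclass(urllib2.HTTPHandler, urllib2.AbstractHTTPHandler))
print(issubclass(urllib2.AbstractHTTPHandler, urllib2.BaseHandler))
```

Calling opener.open(url) uses the custom opener explicitly, while install_opener() changes what every later urlopen() call in the process uses, so it is best done once at startup.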
