This section covers network programming in Python.
One
The main classes and functions live in the request.py module of the urllib package, which also supports SSL-encrypted access.
Let's take a look at the main classes and functions, starting with the source code.
def urlopen(url, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
            *, cafile=None, capath=None, cadefault=False):
    global _opener
    if cafile or capath or cadefault:
        if not _have_ssl:
            raise ValueError('SSL support not available')
        context = ssl._create_stdlib_context(cert_reqs=ssl.CERT_REQUIRED,
                                             cafile=cafile,
                                             capath=capath)
        https_handler = HTTPSHandler(context=context, check_hostname=True)
        opener = build_opener(https_handler)
    elif _opener is None:
        _opener = opener = build_opener()
    else:
        opener = _opener
    return opener.open(url, data, timeout)
The urlopen function can be used directly for web access; the key argument to pass is the URL you want to fetch.
import urllib.request

if __name__ == '__main__':
    print('main Thread Run: ', __name__)
    responsedata = urllib.request.urlopen('http://www.baidu.com/robots.txt')
    strdata = responsedata.read()
    strshow = strdata.decode('utf-8')
    if False:
        print(responsedata.geturl())
    if False:
        print(responsedata.info())
    else:
        print(responsedata.__sizeof__())
    print(strshow)
    responsedata.close()
    print('\nmain Thread Exit: ', __name__)
Note that in the code above the response body is UTF-8 encoded, so it must be decoded with the same encoding.
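The read() call returns raw bytes, and decode() turns them into a string. A minimal offline sketch of that step (the sample bytes below stand in for a real response body; a more robust variant, shown in the comment, would ask the response object itself for its charset):

```python
# Hypothetical, for a real response object from urlopen():
#   charset = responsedata.info().get_content_charset() or 'utf-8'
#   text = responsedata.read().decode(charset)
sample = b'User-agent: *\nDisallow: /'   # stand-in for responsedata.read()
text = sample.decode('utf-8')
print(text.splitlines()[0])
```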
The results are as follows
Main Thread Run:  __main__
32
User-agent: Baiduspider
Disallow: /baidu
Disallow: /s?
Disallow: /ulink?
Disallow: /link?

User-agent: Googlebot
Disallow: /baidu
Disallow: /s?
Disallow: /shifen/
Disallow: /homepage/
Disallow: /cpro
Disallow: /ulink?
Disallow: /link?

(... the same Disallow block repeats for msnbot, Baiduspider-image, YoudaoBot, Sogou web spider, Sogou inst spider, Sogou spider2, Sogou blog, Sogou News Spider, Sogou Orion spider, ChinasoSpider, Sosospider, yisouspider, and EasouSpider ...)

User-agent: *
Disallow: /

Main Thread Exit:  __main__
Two
The urlretrieve function lets you pass a URL directly, read the page content, and store it as a local file.
The function's return value is a pair of two elements: the first is the local file name, and the second is
the HTTP response headers returned by the web server.
def urlretrieve(url, filename=None, reporthook=None, data=None):
    """Retrieve a URL to a temporary location on disk.
Code testing
import urllib.request

if __name__ == '__main__':
    print('main Thread Run: ', __name__)
    data = urllib.request.urlretrieve('http://www.baidu.com/robots.txt',
                                      'robots.txt')
    print('--filename--: ', data[0])
    print('--response--: ', data[1])
    print('\nmain Thread Exit: ', __name__)
Results:
Main Thread Run:  __main__
--filename--:  robots.txt
--response--:  Date: Mon, Sep 08:08:05 GMT
Server: Apache
P3P: CP=" OTI DSP COR IVA OUR IND COM "
Set-Cookie: BAIDUID=4FB847BEE916A0F72ABC5093271CD2BC:FG=1; expires=Tue, 22-Sep-15 08:08:05 GMT; max-age=31536000; path=/; domain=.baidu.com; version=1
Last-Modified: Thu, 07:10:38 GMT
ETag: "91e-4fe5e56791780"
Accept-Ranges: bytes
Content-Length: 2334
Vary: Accept-Encoding,User-Agent
Connection: close
Content-Type: text/plain
Main Thread Exit:  __main__
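Because urlretrieve goes through urlopen internally, it also works with file:// URLs, which makes it easy to try out without a network connection. A small sketch under that assumption (all paths here are temporary and illustrative):

```python
import os
import tempfile
import urllib.request

# Create a small local file to act as the "remote" resource.
src = os.path.join(tempfile.mkdtemp(), 'source.txt')
with open(src, 'w') as f:
    f.write('User-agent: *\n')

# Retrieve it through a file:// URL into a second local file.
dst = src + '.copy'
filename, headers = urllib.request.urlretrieve(
    'file://' + urllib.request.pathname2url(src), dst)

print(filename)                   # first element: the local file name
print(headers['Content-Length'])  # second element behaves like a mapping
```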
Three
The request_host function extracts the host address contained in a URL; its only parameter is a Request object instance.
The Request class will be introduced later.
Here is the function's source code:
def request_host(request):
    """Return request-host, as defined by RFC 2965.

    Variation from RFC: returned value is lowercased, for convenient
    comparison.
    """
    url = request.full_url
    host = urlparse(url)[1]
    if host == "":
        host = request.get_header("Host", "")

    # remove port, if present
    host = _cut_port_re.sub("", host, 1)
    return host.lower()
Test Code:
import urllib.request

if __name__ == '__main__':
    print('main Thread Run: ', __name__)
    req = urllib.request.Request('http://www.baidu.com/robots.txt')
    host = urllib.request.request_host(req)
    print(host)
    print('\nmain Thread Exit: ', __name__)
Results:
Main Thread Run:  __main__
www.baidu.com

Main Thread Exit:  __main__
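The docstring mentions two normalizations: the port is stripped and the host is lowercased. A quick offline check with a made-up URL (no network traffic happens, since Request only prepares the call):

```python
import urllib.request

# Mixed-case host plus an explicit port, chosen to exercise both rules.
req = urllib.request.Request('http://WWW.Example.COM:8080/path')
host = urllib.request.request_host(req)
print(host)  # port removed, host lowercased
```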
Four
The module's main class, the Request class, is described below. Note the capital R.
First look at the source code
class Request:

    def __init__(self, url, data=None, headers={},
                 origin_req_host=None, unverifiable=False,
                 method=None):
        self.full_url = url
        self.headers = {}
        self.unredirected_hdrs = {}
        self._data = None
        self.data = data
        self._tunnel_host = None
        for key, value in headers.items():
            self.add_header(key, value)
        if origin_req_host is None:
            origin_req_host = request_host(self)
        self.origin_req_host = origin_req_host
        self.unverifiable = unverifiable
        if method:
            self.method = method

    @property
    def full_url(self):
        if self.fragment:
            return '{}#{}'.format(self._full_url, self.fragment)
        return self._full_url

    @full_url.setter
    def full_url(self, url):
        # unwrap('<URL:type://host/path>') --> 'type://host/path'
        self._full_url = unwrap(url)
        self._full_url, self.fragment = splittag(self._full_url)
        self._parse()

    @full_url.deleter
    def full_url(self):
        self._full_url = None
        self.fragment = None
        self.selector = ''

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, data):
        if data != self._data:
            self._data = data
            # issue 16464
            # if we change data we need to remove content-length header
            # (cause it's most probably calculated for previous value)
            if self.has_header("Content-length"):
                self.remove_header("Content-length")

    @data.deleter
    def data(self):
        self.data = None

    def _parse(self):
        self.type, rest = splittype(self._full_url)
        if self.type is None:
            raise ValueError("unknown url type: %r" % self.full_url)
        self.host, self.selector = splithost(rest)
        if self.host:
            self.host = unquote(self.host)

    def get_method(self):
        """Return a string indicating the HTTP request method."""
        default_method = "POST" if self.data is not None else "GET"
        return getattr(self, 'method', default_method)

    def get_full_url(self):
        return self.full_url

    def set_proxy(self, host, type):
        if self.type == 'https' and not self._tunnel_host:
            self._tunnel_host = self.host
        else:
            self.type = type
            self.selector = self.full_url
        self.host = host

    def has_proxy(self):
        return self.selector == self.full_url

    def add_header(self, key, val):
        # useful for something like authentication
        self.headers[key.capitalize()] = val

    def add_unredirected_header(self, key, val):
        # will not be added to a redirected request
        self.unredirected_hdrs[key.capitalize()] = val

    def has_header(self, header_name):
        return (header_name in self.headers or
                header_name in self.unredirected_hdrs)

    def get_header(self, header_name, default=None):
        return self.headers.get(
            header_name,
            self.unredirected_hdrs.get(header_name, default))

    def remove_header(self, header_name):
        self.headers.pop(header_name, None)
        self.unredirected_hdrs.pop(header_name, None)

    def header_items(self):
        hdrs = self.unredirected_hdrs.copy()
        hdrs.update(self.headers)
        return list(hdrs.items())
The class constructor:
def __init__(self, url, data=None, headers={},
             origin_req_host=None, unverifiable=False,
             method=None):
Note the key parameters: url is the address you want to access, and data is the body you want to send with a POST.
headers holds the header fields you need to include in the HTTP request header.
method selects the GET or POST method explicitly.
If data is supplied, the default method is POST; otherwise it is GET.
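A quick offline check of how get_method() chooses the verb (the request body bytes below are made up for illustration; no request is actually sent):

```python
import urllib.request

req_get = urllib.request.Request('http://www.baidu.com/robots.txt')
req_post = urllib.request.Request('http://www.baidu.com/robots.txt',
                                  data=b'wd=python')  # sample POST body
print(req_get.get_method())   # no data attached
print(req_post.get_method())  # data attached
```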
req = urllib.request.Request('http://www.baidu.com/robots.txt')
This creates an object instance of the Request class.
For example, to add a User-Agent field to headers:
user_agent = {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; '
                            'rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
req = urllib.request.Request(url='http://www.baidu.com/robots.txt',
                             headers=user_agent)
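One subtlety worth knowing when adding headers this way: as the class source shows, add_header stores keys via str.capitalize(), so only the first letter stays upper-case and lookups must use that stored form. A sketch with an illustrative header value:

```python
import urllib.request

req = urllib.request.Request('http://www.baidu.com/robots.txt')
req.add_header('User-Agent', 'Mozilla/5.0')  # stored as 'User-agent'

print(req.has_header('User-agent'))  # stored (capitalized) form
print(req.has_header('User-Agent'))  # original spelling is NOT found
print(req.get_header('User-agent'))
```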
Modify the time-out period
import socket
socket.setdefaulttimeout(10)  # 10 s
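Note that setdefaulttimeout changes a process-wide default for every new socket. A sketch verifying the setting took effect; as an alternative, urlopen also accepts a per-call timeout argument, which avoids the global state:

```python
import socket

socket.setdefaulttimeout(10)       # affects every new socket in the process
print(socket.getdefaulttimeout())  # read the setting back

# Per-call alternative (not executed here, needs a network connection):
# urllib.request.urlopen('http://www.baidu.com/robots.txt', timeout=10)
```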
Five
The following describes the use of proxies.
The proxy and its address information must be configured before invoking the web access service.
The following example code is used:
import socket
import urllib.request

socket.setdefaulttimeout(10)  # 10 s

if __name__ == '__main__':
    print('main Thread Run: ', __name__)
    proxy = urllib.request.ProxyHandler({'http': 'http://www.baidu.com:8080'})
    opener = urllib.request.build_opener(proxy, urllib.request.HTTPHandler)
    urllib.request.install_opener(opener)
    content = urllib.request.urlopen('http://www.baidu.com/robots.txt').read()
    print('\nmain Thread Exit: ', __name__)
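A few ProxyHandler variants worth knowing, sketched without any network access (the proxy address below is a placeholder, not a working proxy):

```python
import urllib.request

# Explicit mapping of scheme -> proxy URL.
explicit = urllib.request.ProxyHandler({'http': 'http://127.0.0.1:8080'})

# An empty dict disables proxying entirely.
no_proxy = urllib.request.ProxyHandler({})

# No argument: proxies are read from the environment (http_proxy, etc.).
env_based = urllib.request.ProxyHandler()

print(explicit.proxies)
```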
Six: error and exception handling
Python's network services report errors through exceptions, handled mainly with try and except statement blocks.
Remember one important point: preferably wrap each statement that can raise in its own try block,
so you know exactly which call failed.
Example:
import urllib.request
from urllib.error import HTTPError, URLError

try:
    requrl = urllib.request.Request(url='http://www.baidu.com/robots.txt',
                                    headers=user_agent)  # user_agent as above
except HTTPError:
    print('urllib.error.HTTPError')
except URLError:
    print('urllib.error.URLError')
except OSError:
    print('urllib.error.OSError')

try:
    responsedata = urllib.request.urlopen(requrl)
except HTTPError:
    print('urllib.error.HTTPError')
except URLError:
    print('urllib.error.URLError')
except OSError:
    print('urllib.error.OSError')

try:
    pagedata = responsedata.read()
except HTTPError:
    responsedata.close()
    print('urllib.error.HTTPError')
except URLError:
    responsedata.close()
    print('urllib.error.URLError')
except OSError:
    print('urllib.error.OSError')

print(pagedata)
responsedata.close()
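The order of the except clauses matters: HTTPError subclasses URLError, and URLError subclasses OSError, so the most specific handler must come first or it would never run. This can be checked directly:

```python
from urllib.error import HTTPError, URLError

# Verify the exception hierarchy that dictates handler ordering.
print(issubclass(HTTPError, URLError))
print(issubclass(URLError, OSError))
```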
Seven: notes
These are some of the basic web-access functions and classes; there are many other methods and functions that can do the same things,
and you can call whichever suits your own needs. One more thing to remember: I am just a beginner, and I am organizing these notes here for other newcomers and for my own later review.
Python Personal Learning Notes (Four)