Python 3 urllib in Detail (Headers, Proxies, Timeouts, Authentication, Exception Handling)

Source: Internet
Author: User
Tags: http, authentication, urlencode

urllib is the Python standard-library package for fetching URLs (Uniform Resource Locators). It can be used to retrieve remote data and save it locally. The examples below cover request headers, proxies, timeouts, HTTP authentication, and exception handling.

Ways to fetch web resources in Python 3

1. The simplest way

    import urllib.request

    response = urllib.request.urlopen('http://python.org/')
    html = response.read()

2. Use Request

    import urllib.request

    req = urllib.request.Request('http://python.org/')
    response = urllib.request.urlopen(req)
    the_page = response.read()
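As a hedged variation on the two examples above, the sketch below wraps the fetch in a `with` block so the response is closed automatically, and decodes the body using the charset reported in the response headers. The helper name `fetch_text` and the UTF-8 fallback are assumptions made for illustration; they are not part of the original article. The demonstration call uses a `data:` URL, which urllib can open without any network access.

```python
import urllib.request


def fetch_text(url, fallback_encoding="utf-8"):
    """Fetch a URL and return its body as text (hypothetical helper)."""
    # The with-block closes the response even if read() raises.
    with urllib.request.urlopen(url) as response:
        raw = response.read()
        # Use the charset from the Content-Type header when one is given.
        charset = response.headers.get_content_charset() or fallback_encoding
    return raw.decode(charset)


# A data: URL keeps the demonstration offline.
print(fetch_text("data:text/plain;charset=utf-8,hello"))
```

For a real page, pass an `http://` or `https://` URL to the same helper.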

3. Send data

    #!/usr/bin/env python3
    import urllib.parse
    import urllib.request

    url = 'http://localhost/login.php'
    user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
    values = {
        'Act': 'Login',
        'Login[email]': '[email protected]',
        'Login[password]': '123456',
    }

    # In Python 3 the request body must be bytes, so encode the form string.
    data = urllib.parse.urlencode(values).encode('utf-8')
    req = urllib.request.Request(url, data)
    req.add_header('Referer', 'http://www.python.org/')
    response = urllib.request.urlopen(req)
    the_page = response.read()

    print(the_page.decode('utf8'))
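The following minimal sketch shows what happens to the form data before it is sent: `urllib.parse.urlencode` returns a string, which has to be encoded to bytes, and supplying `data` switches the request method from GET to POST. The form fields are placeholder values and nothing is sent over the network.

```python
import urllib.parse
import urllib.request

# Placeholder form fields, for illustration only.
form = {"act": "login", "login[password]": "123456"}

# urlencode() returns str; the request body must be bytes.
body = urllib.parse.urlencode(form).encode("utf-8")

with_data = urllib.request.Request("http://localhost/login.php", data=body)
without_data = urllib.request.Request("http://localhost/login.php")

print(body)                       # percent-encoded form body
print(with_data.get_method())     # method chosen when data is present
print(without_data.get_method())  # method chosen when data is absent
```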

4. Send data and headers

    #!/usr/bin/env python3
    import urllib.parse
    import urllib.request

    url = 'http://localhost/login.php'
    user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
    values = {
        'Act': 'Login',
        'Login[email]': '[email protected]',
        'Login[password]': '123456',
    }
    headers = {'User-Agent': user_agent}

    # In Python 3 the request body must be bytes, so encode the form string.
    data = urllib.parse.urlencode(values).encode('utf-8')
    req = urllib.request.Request(url, data, headers)
    response = urllib.request.urlopen(req)
    the_page = response.read()

    print(the_page.decode('utf8'))
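To check which headers a request will carry before sending it, the `Request` object can be inspected directly. The sketch below is illustrative and sends nothing; note that urllib stores header names in capitalized form, so `User-Agent` is looked up as `User-agent`.

```python
import urllib.request

user_agent = "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)"

# Headers can be passed to the constructor or added afterwards.
req = urllib.request.Request(
    "http://localhost/login.php",
    headers={"User-Agent": user_agent},
)
req.add_header("Referer", "http://www.python.org/")

# Header names are stored capitalized, e.g. "User-agent".
for name, value in req.header_items():
    print(f"{name}: {value}")
```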

5. HTTP Error

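    #!/usr/bin/env python3
    import urllib.error
    import urllib.request

    # The URL is missing from the source; supply one that returns an HTTP error status.
    req = urllib.request.Request('')
    try:
        urllib.request.urlopen(req)
    except urllib.error.HTTPError as e:
        print(e.code)
        print(e.read().decode('utf8'))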

6. Exception Handling 1

    #!/usr/bin/env python3
    from urllib.request import Request, urlopen
    from urllib.error import URLError, HTTPError

    req = Request("http://www.111cn.net/")
    try:
        response = urlopen(req)
    except HTTPError as e:
        print("The server couldn't fulfill the request.")
        print('Error code:', e.code)
    except URLError as e:
        print('We failed to reach a server.')
        print('Reason:', e.reason)
    else:
        print("good!")
        print(response.read().decode("utf8"))

7. Exception Handling 2

    #!/usr/bin/env python3
    from urllib.request import Request, urlopen
    from urllib.error import URLError

    req = Request("http://www.111cn.net/")
    try:
        response = urlopen(req)
    except URLError as e:
        # HTTPError is a subclass of URLError and has both 'code' and
        # 'reason', so test for 'code' first.
        if hasattr(e, 'code'):
            print("The server couldn't fulfill the request.")
            print('Error code:', e.code)
        elif hasattr(e, 'reason'):
            print('We failed to reach a server.')
            print('Reason:', e.reason)
    else:
        print("good!")
        print(response.read().decode("utf8"))
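Both exception-handling styles rely on the fact that `HTTPError` is a subclass of `URLError`, which is why the more specific error has to be checked first. The sketch below illustrates this with exceptions built by hand, so it runs without a network; the helper `describe_error` and the `example.invalid` URL are hypothetical names used only for this illustration.

```python
import email.message
import io
from urllib.error import HTTPError, URLError


def describe_error(exc):
    """Return a short description of a urllib error (hypothetical helper)."""
    # Check the subclass first: every HTTPError is also a URLError.
    if isinstance(exc, HTTPError):
        return f"HTTP error {exc.code}: {exc.reason}"
    if isinstance(exc, URLError):
        return f"Connection problem: {exc.reason}"
    return f"Unexpected error: {exc!r}"


# Build the exceptions by hand so the sketch needs no network access.
http_err = HTTPError("http://example.invalid/", 404, "Not Found",
                     email.message.Message(), io.BytesIO(b""))
url_err = URLError("connection refused")

print(issubclass(HTTPError, URLError))
print(describe_error(http_err))
print(describe_error(url_err))
```

In real code the same function could be called from a single `except URLError as e:` clause.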

8. HTTP Authentication

    #!/usr/bin/env python3
    import urllib.request

    # Create a password manager.
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()

    # Add the username and password.
    # If we knew the realm, we could use it instead of None.
    top_level_url = "https://www.111cn.net/"
    password_mgr.add_password(None, top_level_url, 'Rekfan', 'xxxxxx')

    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)

    # Create an "opener" (an OpenerDirector instance).
    opener = urllib.request.build_opener(handler)

    # Use the opener to fetch a URL.
    a_url = "https://www.111cn.net/"
    x = opener.open(a_url)
    print(x.read())

    # Install the opener.
    # Now all calls to urllib.request.urlopen use our opener.
    urllib.request.install_opener(opener)
    a = urllib.request.urlopen(a_url).read().decode('utf8')
    print(a)
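To see which URLs a stored credential applies to, the password manager can be queried directly. This is a minimal offline sketch; the host names, user name, and password are made-up placeholders. Credentials registered with realm `None` act as a default for any realm, and they match every URL below the registered top-level URL.

```python
import urllib.request

password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()

# Placeholder credentials registered for everything under this URL.
password_mgr.add_password(None, "https://example.invalid/",
                          "demo_user", "demo_pass")

# A URL below the registered one matches, whatever realm the server names.
inside = password_mgr.find_user_password(
    "Some Realm", "https://example.invalid/private/page")

# A different host does not match.
outside = password_mgr.find_user_password(
    "Some Realm", "https://other.invalid/")

print(inside)
print(outside)
```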

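9. Using a proxy

    #!/usr/bin/env python3
    import urllib.request

    # ProxyHandler maps URL schemes ('http', 'https') to proxy addresses.
    # urllib has no built-in SOCKS support, so this assumes an HTTP proxy
    # listening on localhost:1080.
    proxy_support = urllib.request.ProxyHandler({'http': 'http://localhost:1080'})
    opener = urllib.request.build_opener(proxy_support)
    urllib.request.install_opener(opener)

    a = urllib.request.urlopen("http://www.111cn.net").read().decode("utf8")
    print(a)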
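If the proxy should apply only to selected requests rather than to every `urlopen` call, one option is to keep the opener local instead of installing it globally. The sketch below assumes an HTTP proxy at `http://localhost:1080`; the helper name `make_proxy_opener` is hypothetical. It only builds and inspects the opener, so no proxy needs to be running.

```python
import urllib.request


def make_proxy_opener(proxy_url):
    """Build an opener that routes HTTP and HTTPS requests via proxy_url."""
    # Keys are URL schemes; values are proxy addresses.
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


opener = make_proxy_opener("http://localhost:1080")

# Our handler replaces the default ProxyHandler in the opener's chain.
proxy_handlers = [h for h in opener.handlers
                  if isinstance(h, urllib.request.ProxyHandler)]
print(proxy_handlers[0].proxies)
```

Requests made with `opener.open(url)` would then go through the proxy, while plain `urllib.request.urlopen` calls stay unaffected.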

10. Timeout

    #!/usr/bin/env python3
    import socket
    import urllib.request

    # Timeout in seconds.
    timeout = 2
    socket.setdefaulttimeout(timeout)

    # This call to urllib.request.urlopen now uses the default timeout
    # we have set in the socket module.
    req = urllib.request.Request('http://www.111cn.net/')
    a = urllib.request.urlopen(req).read()
    print(a)
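`socket.setdefaulttimeout` changes the timeout for every new socket in the process. Two narrower alternatives are sketched below under stated assumptions: a context manager that restores the previous default afterwards, and the `timeout` argument of `urlopen`, which applies to a single call. The helper names `default_timeout` and `fetch_with_timeout` are hypothetical, and the demonstration uses a `data:` URL so it runs offline.

```python
import contextlib
import socket
import urllib.request


@contextlib.contextmanager
def default_timeout(seconds):
    """Temporarily set the global socket timeout, then restore it."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        socket.setdefaulttimeout(previous)


def fetch_with_timeout(url, seconds=2):
    """Per-call alternative: pass timeout= directly to urlopen."""
    with urllib.request.urlopen(url, timeout=seconds) as response:
        return response.read()


before = socket.getdefaulttimeout()
with default_timeout(2):
    inside = socket.getdefaulttimeout()
after = socket.getdefaulttimeout()

print(inside, after == before)
print(fetch_with_timeout("data:text/plain,ok"))
```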
