Python3中使用urllib的方法詳解(header,代理,逾時,認證,異常處理),python3urllib

來源:互聯網
上載者:User

Python3中使用urllib的方法詳解(header,代理,逾時,認證,異常處理),python3urllib

我們可以利用urllib來抓取遠端資料進行儲存哦,以下是python3 抓取網頁資源的多種方法,有需要的可以參考借鑒。

1、最簡單

import urllib.requestresponse = urllib.request.urlopen('http://python.org/')html = response.read()

2、使用 Request

import urllib.requestreq = urllib.request.Request('http://python.org/')response = urllib.request.urlopen(req)the_page = response.read()

3、發送資料

#! /usr/bin/env python3import urllib.parseimport urllib.requesturl = 'http://localhost/login.php'user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'values = {'act' : 'login','login[email]' : 'yzhang@i9i8.com','login[password]' : '123456'}data = urllib.parse.urlencode(values)req = urllib.request.Request(url, data)req.add_header('Referer', 'http://www.python.org/')response = urllib.request.urlopen(req)the_page = response.read()print(the_page.decode("utf8"))

4、發送資料和header

#! /usr/bin/env python3import urllib.parseimport urllib.requesturl = 'http://localhost/login.php'user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'values = {'act' : 'login','login[email]' : 'yzhang@i9i8.com','login[password]' : '123456'}headers = { 'User-Agent' : user_agent }data = urllib.parse.urlencode(values)req = urllib.request.Request(url, data, headers)response = urllib.request.urlopen(req)the_page = response.read()print(the_page.decode("utf8"))

5、http 錯誤

#! /usr/bin/env python3import urllib.requestreq = urllib.request.Request('http://www.bkjia.com ')try:urllib.request.urlopen(req)except urllib.error.HTTPError as e:print(e.code)print(e.read().decode("utf8"))

6、異常處理1

#! /usr/bin/env python3from urllib.request import Request, urlopenfrom urllib.error import URLError, HTTPErrorreq = Request("http://www.bkjia.com /")try:response = urlopen(req)except HTTPError as e:print('The server couldn't fulfill the request.')print('Error code: ', e.code)except URLError as e:print('We failed to reach a server.')print('Reason: ', e.reason)else:print("good!")print(response.read().decode("utf8"))

7、異常處理2

#! /usr/bin/env python3from urllib.request import Request, urlopenfrom urllib.error import URLErrorreq = Request("http://www.bkjia.com /")try:response = urlopen(req)except URLError as e:if hasattr(e, 'reason'):print('We failed to reach a server.')print('Reason: ', e.reason)elif hasattr(e, 'code'):print('The server couldn't fulfill the request.')print('Error code: ', e.code)else:print("good!")print(response.read().decode("utf8"))

8、HTTP 認證

#! /usr/bin/env python3import urllib.request# create a password managerpassword_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()# Add the username and password.# If we knew the realm, we could use it instead of None.top_level_url = "https://www.jb51.net /"password_mgr.add_password(None, top_level_url, 'rekfan', 'xxxxxx')handler = urllib.request.HTTPBasicAuthHandler(password_mgr)# create "opener" (OpenerDirector instance)opener = urllib.request.build_opener(handler)# use the opener to fetch a URLa_url = "https://www.jb51.net /"x = opener.open(a_url)print(x.read())# Install the opener.# Now all calls to urllib.request.urlopen use our opener.urllib.request.install_opener(opener)a = urllib.request.urlopen(a_url).read().decode('utf8')print(a)

9、使用代理

#! /usr/bin/env python3import urllib.requestproxy_support = urllib.request.ProxyHandler({'sock5': 'localhost:1080'})opener = urllib.request.build_opener(proxy_support)urllib.request.install_opener(opener)a = urllib.request.urlopen("http://www.bkjia.com ").read().decode("utf8")print(a)

10、逾時

#! /usr/bin/env python3import socketimport urllib.request# timeout in secondstimeout = 2socket.setdefaulttimeout(timeout)# this call to urllib.request.urlopen now uses the default timeout# we have set in the socket modulereq = urllib.request.Request('http://www.bkjia.com /')a = urllib.request.urlopen(req).read()print(a)

總結

以上就是這篇文章的全部內容,希望本文的內容對大家學習或使用python能有所協助,如果有疑問大家可以留言交流。

聯繫我們

該頁面正文內容均來源於網絡整理,並不代表阿里雲官方的觀點,該頁面所提到的產品和服務也與阿里云無關,如果該頁面內容對您造成了困擾,歡迎寫郵件給我們,收到郵件我們將在5個工作日內處理。

如果您發現本社區中有涉嫌抄襲的內容,歡迎發送郵件至: info-contact@alibabacloud.com 進行舉報並提供相關證據,工作人員會在 5 個工作天內聯絡您,一經查實,本站將立刻刪除涉嫌侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.