try:
    response = urllib2.urlopen(req)
except urllib2.URLError, e:
    if hasattr(e, 'code'):
        print 'The server couldn\'t fulfill the request.'
        print 'Error code: ', e.code
else:
    # everything is fine
    print response.read()
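To see the hasattr(e, 'code') check in action without a live server, here is a small Python 3 sketch (urllib2 was split into urllib.request and urllib.error in Python 3); the HTTPError is constructed by hand purely for illustration:

```python
import urllib.error

# Build an HTTPError by hand (url, code, msg, hdrs, fp) so no network is needed.
e = urllib.error.HTTPError('http://www.example.com', 404, 'Not Found', {}, None)

if hasattr(e, 'code'):
    print("The server couldn't fulfill the request.")
    print('Error code:', e.code)
```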
Info and geturl: the response object returned by urlopen (or an HTTPError instance) has two very useful methods, info() and geturl(). geturl() returns the real URL that was obtained, which is useful because urlopen (or an opener object) may follow redirects, so the URL you get back can differ from the URL you requested. info() returns a dictionary-like object describing the page obtained, usually the specific headers sent by the server; currently this is an httplib.HTTPMessage instance. Classic headers include "Content-Length" and "Content-Type".
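A minimal Python 3 sketch of geturl() and info() (urllib2 is urllib.request in Python 3); a data: URL is used so the example runs offline:

```python
import urllib.request

# data: URLs are handled by the default opener, so no network is needed.
resp = urllib.request.urlopen('data:text/plain;charset=utf-8,hello')
print(resp.geturl())   # the URL that was actually opened
print(resp.info())     # header-like object describing what came back
body = resp.read()
print(body)
```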
import HTMLParser
import urllib

urltext = []

# Define the HTML parser
class ParseText(HTMLParser.HTMLParser):
    def handle_data(self, data):
        if data != '\n':
            urltext.append(data)

# Create an instance of the HTML parser
lparser = ParseText()
# Feed the HTML file to the parser
lparser.feed(urllib.urlopen("http://docs.python.org/lib/module-HTMLParser.html").read())
lparser.close()
for item in urltext:
    print item

The output of the code above is too long, so it is skipped here.

IV. Extracting cookies from HTML documents
Very often we need to deal with cookies, and fortunately the cookielib module handles this.
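For reference, here is a Python 3 port of the same parser idea (HTMLParser became html.parser), fed a literal HTML string so it runs without fetching anything:

```python
from html.parser import HTMLParser

urltext = []

class ParseText(HTMLParser):
    def handle_data(self, data):
        # Collect every non-newline text node
        if data != '\n':
            urltext.append(data)

lparser = ParseText()
lparser.feed('<html><body><p>Hello</p><p>World</p></body></html>')
lparser.close()
for item in urltext:
    print(item)
```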
pass parameters. Example:

import urllib2
urllib2.urlopen('http://www.baidu.com', data, 10)
urllib2.urlopen('http://www.baidu.com', timeout=10)

II. opener (OpenerDirector)
The OpenerDirector manages a collection of Handler objects that do all the actual work. Each Handler implements a particular protocol or option. The OpenerDirector is a composite object that invokes the handlers needed to open the requested URL. For example, the HTTPHandler performs HTTP GET and POST requests.
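A Python 3 sketch of building and using an OpenerDirector (urllib2.build_opener is urllib.request.build_opener in Python 3); a data: URL keeps it offline, and the User-agent value is a made-up example:

```python
import urllib.request

# build_opener returns an OpenerDirector wired with the default handlers.
opener = urllib.request.build_opener()
opener.addheaders = [('User-agent', 'my-crawler/0.1')]  # example value

resp = opener.open('data:text/plain,via-opener', timeout=10)
print(resp.read())
```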
open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
Open file and return a stream. Raise IOError upon failure.  # Opens the file and returns a stream; raises IOError on failure

Mode
=========  ==========================================================
Character  Meaning
---------  ----------------------------------------------------------
'r'        open for reading (default)
'w'        open for writing, truncating the file first
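A short sketch of the 'w' and 'r' modes described above, using a temporary directory so nothing existing is clobbered:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

with open(path, 'w') as f:   # 'w': open for writing, truncating the file first
    f.write('hello')

with open(path) as f:        # default mode is 'r' (open for reading)
    content = f.read()
print(content)
```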
functionality that your users need, instead of building a bunch of features useless to users. Identify and fix usability issues as early as possible before product development, for example during the prototype design phase, to reduce the risk of failure caused by misunderstanding requirements; minimizing or eliminating documents can also reduce expenses.
II. Four dimensions of usability assessment
Functionality: whether the product is useful. For example, bottle
Basic methods of Python crawlers
1. The most basic page fetch:

import urllib2
content = urllib2.urlopen('http://XXXX').read()

2. Using a proxy server. This is useful in some situations, for example when your IP address is blocked or the number of accesses from one IP is limited.

import urllib2
proxy_support = urllib2.ProxyHandler({'http': 'http://XX.XX.XX.XX:XXXX'})
opener = urllib2.build_opener(proxy_support)
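The proxy setup above, sketched for Python 3 (ProxyHandler and build_opener moved to urllib.request); the proxy address is a placeholder and no request is actually made:

```python
import urllib.request

# The address below is a placeholder; substitute a real proxy before use.
proxy_support = urllib.request.ProxyHandler({'http': 'http://127.0.0.1:8080'})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)
# From here on, urllib.request.urlopen('http://...') would go via the proxy.
print(any(isinstance(h, urllib.request.ProxyHandler) for h in opener.handlers))
```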
Hello, everybody. In the last section we studied exception handling in crawlers; in this section, let's take a look at the use of cookies.
Why use cookies?
Cookies are data stored on the user's local terminal (usually encrypted) by certain websites in order to identify users and perform session tracking.
For example, some sites require you to log in before a page can be accessed, so a page you want to crawl is off-limits until you do. We can use the urllib2 library to save our login cookies and then crawl other pages to achieve the goal.
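A Python 3 sketch of that idea (cookielib is http.cookiejar in Python 3): an opener built with an HTTPCookieProcessor stores cookies from a login response and resends them on later requests. No request is made here, so the jar stays empty:

```python
import http.cookiejar
import urllib.request

cj = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
# After e.g. opener.open(login_url, login_data), cj would hold the session
# cookies, and they would be attached automatically to later opener.open calls.
print(len(cj))   # no requests made yet, so the jar is empty
```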
Variables obtained from the new window are indeed retained and still take effect: because we keep a reference to it, its memory is not reclaimed even after the window is closed.
In practice, you can bind the click event on the document, so that a click anywhere triggers it and yields a clean environment.
Therefore, we have to protect against pop-ups by hooking the pop-up functions.
In addition to the most commonly used window.open, there are:
showModalDialog
showModelessDialog
# -*- coding: utf-8 -*-
# Python: 2.x
__author__ = 'admin'
import cookielib
# Mainly used to handle HTTP client-side cookies
# cookielib.LoadError: raised when loading a cookie file fails; a subclass of IOError
# cookielib.CookieJar stores Cookie objects. The module captures cookies and resends them
# on subsequent requests to the server, and can also handle files containing cookie data.
# Documentation: https://docs.python.org/2/library/cookielib.htm
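A quick Python 3 check of the LoadError comment above (http.cookiejar is the Python 3 name for cookielib); loading a missing cookie file raises an exception from the OSError/IOError family, and the file name here is a hypothetical non-existent path:

```python
import http.cookiejar

# LoadError is a subclass of OSError (IOError), as noted above.
print(issubclass(http.cookiejar.LoadError, OSError))

cj = http.cookiejar.MozillaCookieJar()
try:
    cj.load('no-such-cookie-file.txt')   # hypothetical, non-existent path
except OSError as exc:
    print('load failed:', exc)
```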
Top:
This variable always refers to the browser window at the highest level of the split window. If you plan to execute commands from the highest-level window, you can use the top variable.
Parent:
This variable refers to the parent window that contains the current split window. If a window contains a split window, and one of those split windows contains another split window, then a split window at the second layer can use the parent variable to reference the parent split window that contains it.