Error message:
Traceback (most recent call last):
  File "<pyshell#18>", line 1, in <module>
    callinfo = server.methods['gettemp']
  File "SOAPpy\Client.py", line 472, in __call__
    return self.__r_call(*args, **kw)
  File "SOAPpy\Client.py", line 494, in __r_call
    self.__hd, self.__ma)
  File "SOAPpy\Client.py", line 365, in __call
    config = self.config)
  File "SOAPpy\Client.py", line 265, in call
    raise HTTPError(code, msg)
HTTPError:
A 401 means the request is unauthorized and you need to log in. This site only uses the 401 status code; it does not actually perform basic/digest authentication. The page is still returned, but the article content is truncated, and the page also carries instructions. You can fetch it like this:
#!/usr/local/bin/python
# -*- coding: utf-8 -*-
import urllib2
import zlib

# Pretend to be a regular browser.
headers = {
    "User-Agent": "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6",
}
try:
    req = urllib2.Request("http://www.nature.com/onc/journal/v29/n35/full/onc2010241a.html", headers=headers)
    res = urllib2.urlopen(req)
except urllib2.HTTPError as http_error:
    # The 401 response still carries a gzip-compressed page body.
    # wbits=30 (16 + 14) tells zlib to expect a gzip header.
    print zlib.decompress(http_error.read(), 30)
I first opened the site in a browser and got a 401 error, yet the page displayed normally. Looking at the response headers, I found no fields related to the 401, but the body was still a page.
Hence the code above. The error you posted is an HTTPError exception. Look up urllib2 in the Python manual and you will find this exception and who raises it. It helps to first understand urllib2's handler callback mechanism: each registered handler inherits from BaseHandler and has all the callback interfaces (some may not be overridden, in which case they do nothing). With that in mind, you can quickly see that the object raised as an HTTPError exception has the same interface as the return value of urlopen, including read(). Use it to read the content.
As for zlib, it is there to decompress the gzip-format data; see the zlib section of the manual (I work in C and use zlib a lot). The reason for decompressing is that I noticed the response headers gave gzip as the encoding. A more robust program would catch the exception and then check the Content-Encoding field in the header dictionary of the http_error object to decide which format it is dealing with.
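A minimal sketch of that more robust check might look like the following. The helper name decode_body is hypothetical (it is not part of urllib2 or zlib), and it assumes the server only ever sends gzip, deflate, or no encoding at all. It uses 16 + zlib.MAX_WBITS for gzip, which is the usual choice when the compressor's window size is unknown.

```python
import zlib


def decode_body(body, content_encoding):
    """Return body decompressed according to a Content-Encoding value.

    Hypothetical helper for illustration; not part of urllib2.
    """
    encoding = (content_encoding or "").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        # Adding 16 to the window bits makes zlib expect a gzip wrapper.
        return zlib.decompress(body, 16 + zlib.MAX_WBITS)
    if encoding == "deflate":
        try:
            # Standard case: a zlib-wrapped deflate stream.
            return zlib.decompress(body)
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib header.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    # No encoding given, or one this sketch does not handle: return as-is.
    return body
```

In the except branch of the urllib2 code, this could presumably be called as decode_body(http_error.read(), http_error.info().get("Content-Encoding")), so a response that is not compressed would no longer make zlib fail.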