A fairly complete HTTP request helper: the input is an IP and a port; the output is the response code, the response headers, the response body, whether the request timed out, and the error message if an error occurred.
Processing includes:
1. Protocol handling: HTTPS if the port is 443, HTTP otherwise
2. HTTPError handling: HTTPError generally covers errors such as 401, 403, and 404. Although it is an error, the response still carries headers. Note that to get the error message you should use str(e); alternatives such as repr(e) do not give the message text, and e.read() returns the response body, not the cause of the error
3. URLError handling: generally errors such as connection refused. Note that to get the error message, use str(e.reason)
4. gzip decompression of the response body
5. Encoding conversion of the response body
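Steps 1 to 3 can be tried without a network connection. The following is a minimal sketch for Python 3, where urllib2 has been split into urllib.request and urllib.error; it is not the original code. The helper names build_url and classify_error are hypothetical, and the exceptions are constructed by hand purely for illustration.

```python
import io
import socket
import urllib.error
from email.message import Message


def build_url(ip, port):
    """Pick the scheme from the port: HTTPS for 443, HTTP otherwise."""
    scheme = "https" if port == 443 else "http"
    return "%s://%s:%s/" % (scheme, ip, port)


def classify_error(exc):
    """Summarize a urllib exception (hypothetical helper for illustration)."""
    info = {"error_reason": None, "is_timeout": False, "code": None, "body": None}
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        info["error_reason"] = str(exc)  # message text, e.g. "HTTP Error 404: ..."
        info["code"] = exc.code
        info["body"] = exc.read()        # the response body, not the error cause
    elif isinstance(exc, urllib.error.URLError):
        info["error_reason"] = str(exc.reason)
        info["is_timeout"] = info["error_reason"] == "timed out"
    else:
        info["error_reason"] = str(exc)
    return info


# Exceptions built by hand so the sketch runs offline.
not_found = urllib.error.HTTPError(
    "http://127.0.0.1:80/", 404, "Not Found", Message(), io.BytesIO(b"<h1>404</h1>"))
timed_out = urllib.error.URLError(socket.timeout("timed out"))

print(build_url("127.0.0.1", 443))
print(classify_error(not_found))
print(classify_error(timed_out))
```

In real use the exception would come from a try/except around urllib.request.urlopen(request, timeout=timeout) rather than being built by hand.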
# coding=utf8
import urllib2
import chardet
import traceback
import StringIO
import re
import gzip


def plugin_homepage(data, timeout):
    ip = data["ip"]
    port = data["port"]
    # Port 443 uses HTTPS, every other port uses HTTP
    if port == 443:
        url = "https://%s:%s/" % (ip, port)
    else:
        url = "http://%s:%s/" % (ip, port)
    is_timeout, error_reason, code, header, body, title = get_html(url, timeout)
    res = {"ip": ip,
           "port": port,
           "rsp_header": header,
           "rsp_body": body,
           "code": code,
           "title": title,
           "is_timeout": is_timeout,
           "error_reason": error_reason}
    return res


def get_html(url, timeout):
    user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
    headers = {'User-Agent': user_agent}
    is_timeout = False
    error_reason = None
    code = None
    header = None
    body = None
    title = None
    try:
        request = urllib2.Request(url, headers=headers)
        response = urllib2.urlopen(request, timeout=timeout)
        code = response.getcode()
        body = response.read()
        # Keep the headers object (not a string) so it can be queried by name below
        header = response.headers
    except urllib2.HTTPError, e:  # handle HTTP errors
        # print "str(e):%s\nrepr(e):%s\ne:%s\ne.read():%s\n" % (str(e), repr(e), e, e.read())
        error_reason = str(e)
        code = e.code  # an error response still has a status code
        body = e.read()
        header = e.headers
    except urllib2.URLError, e:
        traceback.print_exc()
        error_reason = str(e.reason)
        if error_reason == "timed out":  # determine whether the request timed out
            is_timeout = True
        return is_timeout, error_reason, code, header, body, title
    except Exception, e:
        traceback.print_exc()
        error_reason = str(e)
        return is_timeout, error_reason, code, header, body, title
    if not header:
        return is_timeout, error_reason, code, header, body, title
    # Decompress gzip
    if 'content-encoding' in header and 'gzip' in header['content-encoding']:
        html_data = StringIO.StringIO(body)
        gz = gzip.GzipFile(fileobj=html_data)
        body = gz.read()
    # Encoding conversion
    try:
        html_encode = get_encode(header, body).strip()
        if html_encode and len(html_encode) < 12:
            body = body.decode(html_encode).encode('utf-8')
    except Exception:
        pass
    # Get the title
    try:
        title = re.search(r'<title>(.*?)</title>', body, flags=re.I | re.M)
        if title:
            title = title.group(1)
    except Exception:
        pass
    return is_timeout, error_reason, code, str(header), body, title


# Get the HTML encoding
def get_encode(header, body):
    # 1. Look for a charset in a <meta> tag
    try:
        m = re.search(r'<meta.*?charset=(.*?)"(>| |/)', body, flags=re.I)
        if m:
            return m.group(1).replace('"', '')
    except Exception:
        pass
    # 2. Look for a charset in the Content-Type header
    try:
        if 'Content-Type' in header:
            content_type = header['Content-Type']
            m = re.search(r'.*?charset=(.*?)(;|$)', content_type, flags=re.I)
            if m:
                return m.group(1)
    except Exception:
        pass
    # 3. Fall back to detection with chardet
    chardit1 = chardet.detect(body)
    encode_method = chardit1['encoding']
    return encode_method


if __name__ == "__main__":
    data = {"ip": "127.0.0.1", "port": 80}
    res = plugin_homepage(data, 3)
    print res
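The post-processing part of get_html (gzip decompression, encoding conversion, title extraction) can also be exercised offline. The following is a rough Python 3 sketch under stated assumptions, not the original implementation: the headers are a plain dict with lowercase keys, the names charset_from and postprocess are hypothetical, and the final fallback is UTF-8 instead of chardet so that the example needs only the standard library.

```python
import gzip
import io
import re


def charset_from(headers, body):
    """Find a charset in a <meta> tag first, then in the Content-Type header."""
    m = re.search(rb'<meta[^>]*charset=["\']?([\w-]+)', body, flags=re.I)
    if m:
        return m.group(1).decode("ascii")
    m = re.search(r"charset=([\w-]+)", headers.get("content-type", ""), flags=re.I)
    if m:
        return m.group(1)
    return "utf-8"  # assumed fallback; the listing above uses chardet here


def postprocess(headers, body):
    """Return (text, title) for a raw response body given as bytes."""
    # Decompress only when the server says the body is gzip-encoded.
    if "gzip" in headers.get("content-encoding", ""):
        body = gzip.GzipFile(fileobj=io.BytesIO(body)).read()
    try:
        text = body.decode(charset_from(headers, body), errors="replace")
    except LookupError:  # unknown charset name
        text = body.decode("utf-8", errors="replace")
    # re.S lets the title span several lines.
    m = re.search(r"<title>(.*?)</title>", text, flags=re.I | re.S)
    return text, (m.group(1).strip() if m else None)


page = '<html><head><meta charset="iso-8859-1"><title>Café</title></head></html>'
raw = gzip.compress(page.encode("iso-8859-1"))
text, title = postprocess({"content-encoding": "gzip"}, raw)
print(title)
```

Decoding to a Python 3 str takes the place of the decode-then-encode('utf-8') step in the Python 2 listing, since Python 3 separates bytes from text.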
[Python] Getting the HTTP response