Fetching a Web page
#!/usr/bin/env python
import sys, urllib2

req = urllib2.Request(sys.argv[1])
fd = urllib2.urlopen(req)
while True:
    # Read the document in 1024-byte chunks until nothing is left
    data = fd.read(1024)
    if not len(data):
        break
    sys.stdout.write(data)
First, a urllib2.Request object is created, with the URL as its argument. urlopen() is then called to obtain a file-like object. urlopen() can also take the URL directly as its argument. The returned object also provides the geturl() function, which returns the URL the data actually came from and so, in general, reveals the page you were redirected to. The info() function returns the meta-information (headers) of the page.
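The same steps can be sketched in Python 3, where urllib2 was split into urllib.request and urllib.error. The helper name fetch and the data: URL below are assumptions made for illustration and are not part of the original program; the data: URL only serves to let the sketch run without network access.

```python
import urllib.request


def fetch(url, chunk_size=1024):
    """Open url, read it in chunks, and return (final_url, headers, body).

    fetch is a hypothetical helper name used only for this sketch.
    """
    req = urllib.request.Request(url)        # wrap the URL in a Request object
    with urllib.request.urlopen(req) as fd:  # file-like response object
        chunks = []
        while True:
            data = fd.read(chunk_size)
            if not data:                     # empty bytes means end of document
                break
            chunks.append(data)
        # geturl() reports the URL actually retrieved (after any redirects);
        # info() returns the response headers.
        return fd.geturl(), fd.info(), b"".join(chunks)


if __name__ == "__main__":
    # A data: URL keeps the example offline; an http:// URL is used the same way.
    final_url, headers, body = fetch("data:text/plain,hello%20world")
    print("Retrieved", final_url)
    for key, value in headers.items():
        print("%s = %s" % (key, value))
    print(body.decode("ascii"))
```

Passing the URL string straight to urlopen() instead of building a Request first should give the same result here.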
Authentication
#!/usr/bin/env python
import sys, urllib2, getpass

class TerminalPassword(urllib2.HTTPPasswordMgr):
    def find_user_password(self, realm, authuri):
        # Check the stored passwords first
        retval = urllib2.HTTPPasswordMgr.find_user_password(self, realm, authuri)
        if retval[0] is None and retval[1] is None:
            # Nothing stored: ask the operator on the terminal
            sys.stdout.write("Login required for %s at %s\n" % (realm, authuri))
            sys.stdout.write("Username: ")
            username = sys.stdin.readline().strip()
            password = getpass.getpass().rstrip()
            return (username, password)
        else:
            return retval

req = urllib2.Request(sys.argv[1])
opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(TerminalPassword()))
fd = opener.open(req)
print 'Retrieved', fd.geturl()
info = fd.info()
for key, value in info.items():
    print '%s = %s' % (key, value)
This program defines a TerminalPassword class that lets the program ask the operator for a user name and password when they are needed. The other new call is build_opener(). This function lets you specify additional handlers: some handlers (such as basic HTTP and FTP support) are installed by default, while others can be added as needed. Because this code supports Basic authentication, HTTPBasicAuthHandler must be added to the handler chain. The previous example simply called urllib2.urlopen(), which calls build_opener() internally without any arguments, so only the default handlers are selected. Note that nothing else changes once the connection has been opened: if authentication is required, HTTPBasicAuthHandler automatically calls the appropriate function in TerminalPassword, with no further checking needed. Also note that for an ordinary site that does not require authentication, the code behaves just as the example in the previous section did.
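The handler chain described here can be sketched in Python 3, where the same classes live in urllib.request. The names PromptingPasswordMgr, build_auth_opener, and the prompt callback are hypothetical; the callback replaces direct terminal input only so that the lookup logic can be exercised without a server.

```python
import urllib.request


class PromptingPasswordMgr(urllib.request.HTTPPasswordMgr):
    """Password manager that falls back to a prompt callback.

    The class name and the `prompt` parameter are hypothetical; the original
    program reads from the terminal instead.
    """

    def __init__(self, prompt):
        super().__init__()
        self._prompt = prompt  # callable(realm, authuri) -> (user, password)

    def find_user_password(self, realm, authuri):
        user, password = super().find_user_password(realm, authuri)
        if user is None and password is None:
            # Nothing stored for this realm/URI, so ask the caller.
            return self._prompt(realm, authuri)
        return user, password


def build_auth_opener(prompt):
    """Return an opener whose handler chain includes Basic authentication."""
    manager = PromptingPasswordMgr(prompt)
    handler = urllib.request.HTTPBasicAuthHandler(manager)
    # build_opener() keeps the default handlers and adds the one given here.
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    import getpass

    def ask(realm, authuri):
        print("Login required for %s at %s" % (realm, authuri))
        return input("Username: ").strip(), getpass.getpass()

    opener = build_auth_opener(ask)
    # List the handler chain; the prompt only runs if a server demands a login.
    print([type(h).__name__ for h in opener.handlers])
```

Calling opener.open(url) on the result would then fetch a page as before, asking for credentials only when the server responds with a Basic authentication challenge.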
Submitting form data
CGI scripts and other interactive server-side programs often receive data from Web clients, typically from forms. There are two ways of submitting form data: GET and POST. Which one to use depends on the method attribute of the <form> tag in the HTML document.
1. Submitting with the GET method
#!/usr/bin/env python
import sys, urllib2, urllib

def addgetdata(url, data):
    """Adds data to url. Data should be a list or tuple consisting of
    2-item lists or tuples of the form: (key, value).

    Items that have no key should have key set to None.

    A given key may occur more than once."""
    return url + '?' + urllib.urlencode(data)

zipcode = sys.argv[1]
url = addgetdata('http://www.wunderground.com/cgi-bin/findweather/getForecast',
                 [('query', zipcode)])
print 'Using URL', url
req = urllib2.Request(url)
fd = urllib2.urlopen(req)
while True:
    data = fd.read(1024)
    if not len(data):
        break
    sys.stdout.write(data)
2. Submitting with the POST method
With POST, the encoded data is sent in a separate part of the request rather than in the URL. POST is a good choice when large amounts of data need to be exchanged. Example:
#!/usr/bin/env python
import sys, urllib2, urllib

zipcode = sys.argv[1]
url = 'http://www.wunderground.com/cgi-bin/findweather/getForecast'
data = urllib.urlencode([('query', zipcode)])
req = urllib2.Request(url)
# The additional information is passed to urlopen() as the second argument
fd = urllib2.urlopen(req, data)
while True:
    data = fd.read(1024)
    if not len(data):
        break
    sys.stdout.write(data)
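To make the difference between the two methods visible without contacting a server, the following Python 3 sketch builds one request of each kind and inspects it. In Python 3, urlencode lives in urllib.parse and the request body must be bytes. The helper names, the example.com endpoint, and the field values are placeholders assumed for illustration.

```python
import urllib.parse
import urllib.request


def build_get_request(url, fields):
    """Append the URL-encoded fields to the URL as a query string."""
    return urllib.request.Request(url + "?" + urllib.parse.urlencode(fields))


def build_post_request(url, fields):
    """Carry the URL-encoded fields in the request body instead."""
    body = urllib.parse.urlencode(fields).encode("ascii")  # body must be bytes
    return urllib.request.Request(url, data=body)


if __name__ == "__main__":
    # Placeholder endpoint and field values, chosen only for illustration.
    url = "http://www.example.com/cgi-bin/search"
    fields = [("query", "10001"), ("tag", "a b"), ("tag", "c")]
    for req in (build_get_request(url, fields), build_post_request(url, fields)):
        # Show the HTTP method, the URL, and the body of each request.
        print(req.get_method(), req.full_url, req.data)
```

A Request that carries data reports POST as its method, while one without data reports GET; the field list also shows that a key may appear more than once.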
Handling Errors
#!/usr/bin/env python
import sys, urllib2

req = urllib2.Request(sys.argv[1])
try:
    fd = urllib2.urlopen(req)
except urllib2.HTTPError, e:
    # The server answered, but with an error status
    print 'Error retrieving data:', e
    print 'Server error document follows:\n'
    print e.read()
    sys.exit(1)
except urllib2.URLError, e:
    # The request could not be completed at all
    print 'Error retrieving data:', e
    sys.exit(2)
print 'Retrieved', fd.geturl()
info = fd.info()
for key, value in info.items():
    print '%s = %s' % (key, value)
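The order of the two except clauses matters, because HTTPError is a subclass of URLError and must therefore be caught first. The Python 3 sketch below (exceptions now live in urllib.error) illustrates both paths offline. The names open_or_explain and fake_not_found and the opener parameter are hypothetical; the fake opener only simulates what a 404 response would raise.

```python
import email.message
import io
import urllib.error
import urllib.request


def open_or_explain(url, opener=urllib.request.urlopen):
    """Return ("ok", response) or an (error_kind, message) pair.

    The function name and the `opener` parameter are hypothetical; `opener`
    exists only so the error paths can be exercised without a network.
    """
    try:
        return "ok", opener(url)
    except urllib.error.HTTPError as e:
        # The server replied with an error status. The exception is also a
        # file-like object, so the error document can be read from it.
        document = e.read().decode("utf-8", "replace")
        return "http-error", "%d %s: %s" % (e.code, e.reason, document)
    except urllib.error.URLError as e:
        # No usable response at all (unknown scheme, DNS failure, and so on).
        return "url-error", str(e.reason)


def fake_not_found(url):
    """Stand-in opener that always fails the way a 404 response would."""
    raise urllib.error.HTTPError(
        url, 404, "Not Found", email.message.Message(), io.BytesIO(b"no such page")
    )


if __name__ == "__main__":
    print(open_or_explain("http://www.example.com/missing", opener=fake_not_found))
    print(open_or_explain("nosuchscheme://example"))
```

Swapping the two except clauses would send every HTTP error down the URLError branch, and the server's error document would never be read.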
Two different problems can occur while reading data. The first is a communication error, which causes the socket module to raise socket.error when read() is called. The second is a document that arrives truncated without any communication error. The first case can be handled by catching the socket error. The second is somewhat harder.
It is entirely possible for the client to receive a truncated document without any exception being raised. For example, if the server runs into a problem while sending a document and, as a result, closes the remote socket normally, the client simply receives an end-of-file indication and no exception.
The way to check for this is to look for the Content-Length header in the server's response and compare the length it gives with the amount of data actually received. However, the Content-Length header is not always provided, particularly for CGI-generated pages. In that case there is no way to tell whether the document has been truncated. Example:
#!/usr/bin/env python
import sys, urllib2, socket

req = urllib2.Request(sys.argv[1])
try:
    fd = urllib2.urlopen(req)
except urllib2.HTTPError, e:
    print 'Error retrieving data:', e
    print 'Server error document follows:\n'
    print e.read()
    sys.exit(1)
except urllib2.URLError, e:
    print 'Error retrieving data:', e
    sys.exit(2)
print 'Retrieved', fd.geturl()
bytesread = 0
while True:
    try:
        data = fd.read(1024)
    except socket.error, e:
        # Communication error while reading
        print 'Error reading data:', e
        sys.exit(3)
    if not len(data):
        break
    bytesread += len(data)
    sys.stdout.write(data)
# Compare the declared length with the number of bytes actually read
if fd.info().has_key('Content-Length') and \
   long(fd.info()['Content-Length']) != long(bytesread):
    print 'Expected a document of size %d, but read %d bytes' % \
          (long(fd.info()['Content-Length']), bytesread)
    sys.exit(4)
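The length comparison on its own can be tried out in Python 3 without a server. This is only a sketch under assumptions: read_and_check and fake_response are hypothetical helper names, and the in-memory response built with urllib.response.addinfourl stands in for a real connection whose peer closed early.

```python
import email.message
import io
import urllib.response


def read_and_check(response, chunk_size=1024):
    """Read response fully and compare its size with Content-Length.

    Returns (body, verdict), where verdict is "complete", "truncated", or
    "unknown" when no Content-Length header was sent.
    read_and_check is a hypothetical helper name.
    """
    body = b""
    while True:
        data = response.read(chunk_size)
        if not data:  # end of file, whether or not the document is complete
            break
        body += data
    declared = response.info().get("Content-Length")
    if declared is None:
        return body, "unknown"
    verdict = "complete" if int(declared) == len(body) else "truncated"
    return body, verdict


def fake_response(body, declared_length=None):
    """Build an in-memory response object for testing (no network needed)."""
    headers = email.message.Message()
    if declared_length is not None:
        headers["Content-Length"] = str(declared_length)
    return urllib.response.addinfourl(
        io.BytesIO(body), headers, "http://www.example.com/"
    )


if __name__ == "__main__":
    print(read_and_check(fake_response(b"hello", 5))[1])
    print(read_and_check(fake_response(b"hel", 5))[1])
    print(read_and_check(fake_response(b"hello"))[1])
```

Returning "unknown" as a separate verdict keeps the missing-header case distinct from a confirmed match, which mirrors the limitation described above.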
This article is from the "Lotus's Thoughts" blog; please be sure to retain this source: http://liandesinian.blog.51cto.com/7737219/1555361
Chapter 6: Web Client Access