Sending a GET or POST from a script is one of the simplest and most common tasks, so why go over it again? Because practice does more than make perfect: familiar ground can still yield plenty of new things, depending on what you bring to it.
I think it is worth reviewing the basics of the HTTP protocol and the content and format of GET/POST requests, but I'm not going to give that briefing here. Instead, I'd like you to read up on questions such as "what happens when you open a URL in a browser?"
The Python libraries that may be involved in sending GET/POST requests are urllib, urllib2, and cookielib. Other matters, such as processing HTML, are outside the scope of this topic.
Request the homepage of www.111cn.net:
>>> import urllib2
>>> print urllib2.urlopen('http://www.111cn.net').read()
The above is Hello World level, but geeky programmers will often notice that what gets printed differs from URL to URL. That sounds like stating the obvious, but I don't mean the content, I mean the style! The source code of an excellent site is usually of a high standard in every respect, including Unicode encoding, security, performance, and so on.
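One concrete way pages differ is the character encoding they declare. Here is a minimal sketch, assuming Python 3 (where `urllib2` became `urllib.request`), of decoding a body according to its `Content-Type` header instead of printing raw bytes. The helper name `decode_body` and its UTF-8 fallback are illustrative choices, not something from the original post, and the sample bytes are made up so that nothing touches the network.

```python
from email.message import Message


def decode_body(raw, content_type, default="utf-8"):
    """Decode response bytes using the charset named in a Content-Type value.

    Hypothetical helper for illustration; `default` is an assumed fallback.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset() or default
    # errors="replace" keeps going if the page misreports its encoding.
    return raw.decode(charset, errors="replace")


# Offline demonstration with hand-made bytes instead of a live request.
sample = "\u4f60\u597d"  # two Chinese characters
gbk_bytes = sample.encode("gbk")
print(decode_body(gbk_bytes, "text/html; charset=gbk") == sample)  # True
```

With a live response `resp` from `urllib.request.urlopen`, `resp.headers.get_content_charset()` performs the same charset lookup directly.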
Simulate a browser opening a login URL and log in successfully via POST:
This is a special case, so I'll write it up a little more thoroughly to cover as many common situations as possible, including but not limited to: cookies, password encryption, HTTPS, simple verification codes (CAPTCHAs), IP restrictions, fully impersonating a browser, and so on.
The minimal form of sending a POST request:
>>> import urllib
>>> import urllib2
>>> import cookielib
>>> # Keep cookies across requests so the login session persists.
>>> cj = cookielib.CookieJar()
>>> opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
>>> # Present a browser-like User-Agent.
>>> opener.addheaders = [('User-agent', 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)')]
>>> urllib2.install_opener(opener)
>>> # Supplying form-encoded data makes this a POST.
>>> req = urllib2.Request("http://xxoo.com", urllib.urlencode({"username": "root", "password": "Rootxxoo"}))
>>> resp = urllib2.urlopen(req)
>>> print resp.read()
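The session above is Python 2. As a rough sketch of the same flow in Python 3, assuming the renamed modules `urllib.request`, `urllib.parse`, and `http.cookiejar`, the request can be built and inspected before anything is sent. The helper name `build_login` is hypothetical; the URL and credentials are the placeholders from the original example.

```python
import http.cookiejar
import urllib.parse
import urllib.request

USER_AGENT = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"


def build_login(url, fields):
    """Return (opener, cookie_jar, request) for a form-encoded POST.

    Nothing is sent here; call opener.open(request) to perform the login.
    """
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-agent", USER_AGENT)]
    # In Python 3 the body must be bytes; supplying data makes this a POST.
    body = urllib.parse.urlencode(fields).encode("ascii")
    request = urllib.request.Request(url, data=body)
    return opener, jar, request


opener, jar, request = build_login(
    "http://xxoo.com", {"username": "root", "password": "Rootxxoo"})
print(request.get_method())  # POST
print(request.data)          # b'username=root&password=Rootxxoo'
```

After `resp = opener.open(request)`, iterating over `jar` yields the cookies the server set, which is a quick way to check whether the session cookie actually arrived.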
Some caveats or best practices:
A. If the POST fails because of a cookie problem, the best approach is to log in with a real browser and use a debugging tool such as Firebug to inspect the actual request and response headers, as well as the cookie data.
B. Besides cookies, there are many other ways to achieve security or other goals. A cookie can hold only about 4 KB of data and is completely exposed to the client.
C. In fact, the most important first step is to read the page source, pick out the form, its action, and its parameters by eye, and work out the logic before you try to simulate it. This step is often where the battle of wits begins: some programmers like to play tricks on you, such as appending meaningless parameters or appending random numbers named to look like business data, and others like to play with math, such as prime-number arithmetic.
D. Verification codes (CAPTCHAs): these come in many kinds. Some programmers are fairly green, so you can analyze the flow and bypass the check. For an ordinary security CAPTCHA, you need to request a batch of CAPTCHA images, build a library from them, and do feature recognition. As for the truly perverse CAPTCHAs, I suggest you give up on the idea and not bother trying to recognize them.
E. The Discuz! simulated-login examples found online are mostly theoretical; the programs that really work are in the hands of the professional posting companies.
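For point C, reading the form by eye can be backed up with a small script. The following is only a sketch, assuming Python 3 and its standard `html.parser` module; the class name `FormExtractor` and the sample login page are made up for illustration. It attaches each named input to the most recently opened form, which is a simplification.

```python
from html.parser import HTMLParser


class FormExtractor(HTMLParser):
    """Collect each form's action, method, and named input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.forms.append({
                "action": attrs.get("action", ""),
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            })
        elif tag == "input" and self.forms and attrs.get("name"):
            # Hidden inputs often carry tokens or decoy values,
            # so keep every named field, not just the visible ones.
            self.forms[-1]["fields"][attrs["name"]] = attrs.get("value") or ""


# Hypothetical login page, made up for illustration.
SAMPLE = """
<form action="/login" method="POST">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="hidden" name="token" value="8f3a">
</form>
"""

parser = FormExtractor()
parser.feed(SAMPLE)
print(parser.forms)
```

The resulting `fields` dictionary shows every parameter the real form would submit, which makes it easier to spot extra values that a hand-written POST left out.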