This article describes how to log on to a website with Python, along with a worked example.
Logging on to a website with Python: details and examples
For most forums, we need to log on before we can fetch posts for analysis; otherwise, the posts cannot be viewed.
This is because HTTP is a stateless protocol. How, then, does the server know whether the user making a request has logged on? There are two methods:
Explicitly include the session ID in the URI;
Use cookies: after you log on to a website, a cookie is stored locally, and as you continue to browse the site the browser sends the cookie along with each request.
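The cookie method can be illustrated with a minimal sketch in Python 3. The header value below is made up for illustration and is not taken from any real forum; the helper name cookie_header_from_set_cookie is likewise hypothetical. It shows how the value a server sends in Set-Cookie comes back in the Cookie header of later requests.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_value):
    """Build the Cookie header a client would send back for a Set-Cookie value."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    # Only name=value pairs are returned to the server; attributes such as
    # Path stay on the client side.
    return "; ".join("%s=%s" % (name, morsel.value) for name, morsel in jar.items())


# Hypothetical header a server might send after a successful logon.
example_set_cookie = "sid=abc123; Path=/"
cookie_header = cookie_header_from_set_cookie(example_set_cookie)
print(cookie_header)
```

The server recognizes the logged-on user by looking up the session identifier carried in that header.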
Python provides a rich set of modules, so this network operation takes only a few lines. Logging on to the QZZN forum is used as the example here; in fact, the following program should work for almost all PHPWind forums.
# -*- coding: GB2312 -*-
from urllib import urlencode
import cookielib, urllib2
# cookie
cj = cookielib.LWPCookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
urllib2.install_opener(opener)
# Login
user_data = {'pwuser': 'Your username', 'pwpwd': 'Your password', 'step': '2'}
url_data = urlencode(user_data)
login_r = opener.open("http://bbs.qzzn.com/login.php", url_data)
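The script above targets Python 2. In Python 3 the same modules are named urllib.request, http.cookiejar and urllib.parse, so a rough equivalent might look like the sketch below. The helper names build_cookie_opener and login are illustrative and not part of the original script, and the default UTF-8 encoding of the form data is an assumption about what the forum accepts. Nothing is sent over the network until login is called.

```python
import http.cookiejar
import urllib.request
from urllib.parse import urlencode

LOGIN_URL = "http://bbs.qzzn.com/login.php"  # URL used in the article


def build_cookie_opener():
    """Return an opener that stores and resends cookies, plus its cookie jar."""
    cj = http.cookiejar.LWPCookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    return opener, cj


def login(opener, username, password):
    """POST the logon form; the field names are the ones used in the article."""
    user_data = {"pwuser": username, "pwpwd": password, "step": "2"}
    # In Python 3 the POST body must be bytes, so the encoded string is
    # converted explicitly.
    url_data = urlencode(user_data).encode("ascii")
    return opener.open(LOGIN_URL, url_data)


opener, cj = build_cookie_opener()
# Example call (performs a real HTTP request):
# response = login(opener, "Your username", "Your password")
```

After a successful logon, later calls to opener.open would carry the cookies stored in cj.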
Some notes:
urllib2 is more capable than urllib; among other things, it supports cookies.
In urllib2, each client can be abstracted as an opener, and each opener can be given multiple handlers to extend its functionality.
HTTPCookieProcessor is specified as a handler when the opener is constructed, so the opener supports cookies.
After install_opener is called, this opener is used whenever urlopen is called.
If you do not need to save the cookies, the cj argument can be omitted; HTTPCookieProcessor then creates a cookie jar of its own.
user_data stores the information required for logon; this information is passed to the forum when you log on.
The urlencode function encodes the dictionary user_data into a string of the form "pwuser=username&pwpwd=password". Keeping the fields in a dictionary makes the program easier to read.
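As a small illustration of the urlencode step, here is a sketch in Python 3, where the function lives in urllib.parse. The username and password values are made up; the password deliberately contains characters that must be escaped.

```python
from urllib.parse import parse_qs, urlencode

# Made-up credentials; "@" and the space need escaping in form data.
user_data = {"pwuser": "alice", "pwpwd": "p@ss word", "step": "2"}

url_data = urlencode(user_data)
print(url_data)

# parse_qs reverses the encoding, which is handy for checking the result.
decoded = parse_qs(url_data)
print(decoded)
```

Note that urlencode does not add a leading "?"; that character is only needed when the string is appended to a URL as a query string.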
The last question is where names such as pwuser and pwpwd come from. To find out, we need to analyze the page we want to log on to. The logon interface is normally a form, so we inspect the HTML source of that form.
From the form we can see that the user name and password we need to enter correspond to pwuser and pwpwd, while step corresponds to the logon action (its value was found by trial).
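The form analysis described above can also be done programmatically. The sketch below uses Python 3's html.parser to list the input names of a form. SAMPLE_FORM is a hypothetical stand-in written for this example, not the real markup of the forum's logon page, and FormFieldParser is an illustrative class name.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the method, action and input fields of a form."""

    def __init__(self):
        super().__init__()
        self.method = None
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            # HTML forms default to GET when no method is given.
            self.method = (attrs.get("method") or "get").lower()
            self.action = attrs.get("action")
        elif tag == "input" and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


# Hypothetical form, NOT copied from the real logon page.
SAMPLE_FORM = """
<form action="login.php" method="post">
  <input type="text" name="pwuser">
  <input type="password" name="pwpwd">
  <input type="hidden" name="step" value="2">
</form>
"""

parser = FormFieldParser()
parser.feed(SAMPLE_FORM)
print(parser.method, parser.action, parser.fields)
```

Hidden inputs such as step are easy to miss when reading a page by eye, which is one reason to list every input field this way.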
Note that this forum's form uses the POST method, which is what open sends when it is given a data argument. If the form used the GET method instead, the approach in this article would need to change: rather than passing the data to open directly, you should first build a Request and then open it. For more details, see the manual.
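The difference between the two methods can be sketched in Python 3 without sending anything. The helper build_request and the example.com URLs are hypothetical; the sketch relies only on the fact that a Request with a data argument is sent as POST, while one without data is sent as GET.

```python
import urllib.request
from urllib.parse import urlencode


def build_request(url, fields, method="post"):
    """Build (but do not send) a Request for a form submitted with GET or POST."""
    encoded = urlencode(fields)
    if method.lower() == "get":
        # GET: the encoded fields become the query string of the URL.
        separator = "&" if "?" in url else "?"
        return urllib.request.Request(url + separator + encoded)
    # POST: the encoded fields travel in the request body as bytes.
    return urllib.request.Request(url, data=encoded.encode("ascii"))


post_req = build_request("http://example.com/login.php", {"user": "alice"})
get_req = build_request("http://example.com/search.php", {"q": "python"}, method="get")
print(post_req.get_method(), get_req.get_method(), get_req.full_url)
```

Either Request object can then be passed to an opener's open method in place of a plain URL.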