This article describes how to use the rsa encryption module to simulate a Sina Weibo login in Python. When you log in to Sina Weibo from a PC, the client encrypts the username and password with js before submitting them, and a set of parameters must be fetched before the POST; these parameters also become part of POST_DATA. As a result, you cannot simulate the login with a simple POST, as you can for sites such as Renren.
Simulating the login is an essential step when retrieving Sina Weibo data with a crawler.
1. Before submitting the POST request, send a GET request to obtain four parameters (servertime, nonce, pubkey, and rsakv), rather than only servertime and nonce as before. This is because the js has changed how the username and password are encrypted.
1.1 Because the encryption method has changed, we use the rsa module here. For details about the RSA public-key encryption algorithm, refer to material available online. Download and install the rsa module:
Download: https://pypi.python.org/pypi/rsa/3.1.1
rsa module documentation: http://stuvel.eu/files/python-rsa-doc/index.html
After downloading, install it (on Windows, first install setuptools using the setuptools-0.6c11.win32-py2.6.exe installer), for example: easy_install rsa-3.1.1-py2.6.egg. Finally, run import rsa from the command line; if no error is reported, the installation succeeded.
1.2 Obtain and view the Sina Weibo login js file
View the source of the Sina Passport login page (http://login.sina.com.cn/signup/signin.php), where you can find the URL of the login js file.
1.3 Login
In the first step of the login, insert your username into prelogin_url and request that address:
prelogin_url = 'http://login.sina.com.cn/sso/prelogin.php?entry=sso&callback=sinaSSOController.preloginCallBack&su=%s&rsakt=mod&client=ssologin.js(v1.4.4)' % username
A GET request to this address returns content similar to the following:
sinaSSOController.preloginCallBack({"retcode":0,"servertime":1362041092,"pcid":"gz-6664c3dea2bfdaa3c94e8734c9ec2c9e6a1f","nonce":"IRYP4N","pubkey":"success","rsakv":"1330428213","exectime":1})
Then extract the servertime, nonce, pubkey, and rsakv values we need. The pubkey and rsakv values are fixed, so they can also be hard-coded.
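The extraction step can be sketched as follows in Python 3. This is only an illustration: the helper name parse_prelogin is hypothetical, and the pubkey in the sample string is a short placeholder rather than a real key.

```python
import json
import re


def parse_prelogin(body):
    """Extract the four login parameters from a JSONP prelogin response."""
    # The JSON payload sits between the callback's parentheses.
    match = re.search(r"\(\s*(\{.*\})\s*\)", body, re.S)
    if match is None:
        raise ValueError("no JSON payload found in prelogin response")
    data = json.loads(match.group(1))
    return data["servertime"], data["nonce"], data["pubkey"], data["rsakv"]


# Sample response; the pubkey value is a placeholder, not a real key.
sample = ('sinaSSOController.preloginCallBack({"retcode":0,'
          '"servertime":1362041092,"nonce":"IRYP4N",'
          '"pubkey":"EB2A38","rsakv":"1330428213","exectime":1})')

servertime, nonce, pubkey, rsakv = parse_prelogin(sample)
print(servertime, nonce, rsakv)
```

Parsing the payload as JSON rather than picking fields out with separate regular expressions keeps the types intact: servertime comes back as an integer, the other three as strings.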
2. The username is BASE64-encoded:
The code is as follows:
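import base64
import urllib
username_ = urllib.quote(username)  # URL-quote the raw username first
username = base64.encodestring(username_)[:-1]  # BASE64-encode it and drop the trailing newline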
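The snippet above targets Python 2. For readers on Python 3, a rough equivalent might look like the sketch below; the function name encode_username is hypothetical. base64.encodestring no longer exists in recent Python 3 releases, and base64.b64encode adds no trailing newline, so the [:-1] slice is not needed.

```python
import base64
from urllib.parse import quote


def encode_username(username):
    """URL-quote the username, then BASE64-encode the quoted string."""
    quoted = quote(username)
    return base64.b64encode(quoted.encode("utf-8")).decode("ascii")


print(encode_username("user@example.com"))
```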
Previously, the password was hashed with SHA1 three times, with the servertime and nonce values mixed in as interference: after two rounds of SHA1, the servertime and nonce values are appended to the result, and SHA1 is computed once more.
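For reference, the three-round SHA1 scheme just described can be sketched like this. It assumes each round hashes the hexadecimal digest of the previous round, which the text does not state explicitly; the function names are hypothetical.

```python
import hashlib


def sha1_hex(text):
    """Return the SHA1 hex digest of a text string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def legacy_password_hash(password, servertime, nonce):
    """Compute sha1(sha1(sha1(password)) + servertime + nonce) as hex."""
    twice = sha1_hex(sha1_hex(password))
    return sha1_hex(twice + str(servertime) + str(nonce))


print(legacy_password_hash("secret", 1362041092, "IRYP4N"))
```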
With the latest rsa-based method, the username is handled the same as before; the password encryption differs from the original method:
2.1 First create an RSA public key. Its two parameters are fixed values on Sina Weibo, both given as hexadecimal strings. The first is the pubkey obtained in the first login step; the second is '10001' from the js encryption file.
Both values must first be converted from hexadecimal to decimal, although they can also be hard-coded. Here, 10001 is written directly as 65537.
The code is as follows:
import binascii
import rsa
rsaPublickey = int(pubkey, 16)  # convert the hexadecimal modulus to an integer
key = rsa.PublicKey(rsaPublickey, 65537)  # create the public key
message = str(servertime) + '\t' + str(nonce) + '\n' + str(password)  # plaintext, concatenated as in the js encryption file
passwd = rsa.encrypt(message, key)  # encrypt
passwd = binascii.b2a_hex(passwd)  # convert the ciphertext to hexadecimal
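A Python 3 variant of the same steps might look like the sketch below. The function names are hypothetical. The main differences are that rsa.encrypt expects bytes under Python 3, so the plaintext is encoded first (UTF-8 is an assumption here), and that the rsa module is imported only inside encrypt_password so the plaintext helper can be tried without it installed.

```python
import binascii


def build_plaintext(servertime, nonce, password):
    """Join the values as servertime TAB nonce NEWLINE password."""
    return "{}\t{}\n{}".format(servertime, nonce, password).encode("utf-8")


def encrypt_password(pubkey_hex, servertime, nonce, password):
    """Encrypt the plaintext with the server's public key; return hex text."""
    import rsa  # third-party module, as installed in section 1.1

    key = rsa.PublicKey(int(pubkey_hex, 16), 65537)
    cipher = rsa.encrypt(build_plaintext(servertime, nonce, password), key)
    return binascii.b2a_hex(cipher).decode("ascii")


# The exponent '10001' is hexadecimal for 65537.
print(int("10001", 16))
print(build_plaintext(1362041092, "IRYP4N", "secret"))
```

Because rsa.encrypt applies random padding, the hexadecimal output differs on every call even for identical input; that is expected and does not indicate an error.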
2.2 Request the passport URL: login_url = 'http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.4)'
The parameters to send with the POST request:
The code is as follows:
postPara = {
    'entry': 'weibo',
    'gateway': '1',
    'from': '',
    'savestate': '7',
    'userticket': '1',
    'ssosimplelogin': '1',
    'vsnf': '1',
    'vsnval': '',
    'su': encodedUserName,
    'service': 'miniblog',
    'servertime': servertime,
    'nonce': nonce,
    'pwencode': 'rsa2',
    'sp': encodedPassWord,
    'encoding': 'utf-8',
    'prelt': '123',
    'rsakv': rsakv,
    'url': 'http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack',
    'returntype': 'meta'
}
rsakv is added to the request, and the value of pwencode is changed to rsa2; the other parameters are the same as before.
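Turning the parameters into a POST body could be sketched as follows in Python 3. The function name build_post_data is hypothetical, and the fixed values are copied from the listing above rather than verified against the live service.

```python
from urllib.parse import parse_qs, urlencode


def build_post_data(su, sp, servertime, nonce, rsakv):
    """Return the URL-encoded POST body for the login request."""
    params = {
        "entry": "weibo", "gateway": "1", "from": "", "savestate": "7",
        "userticket": "1", "ssosimplelogin": "1", "vsnf": "1", "vsnval": "",
        "su": su,                  # BASE64-encoded username
        "service": "miniblog",
        "servertime": servertime,
        "nonce": nonce,
        "pwencode": "rsa2",        # marks the password as RSA-encrypted
        "sp": sp,                  # hex-encoded RSA ciphertext
        "encoding": "utf-8", "prelt": "123",
        "rsakv": rsakv,
        "url": ("http://weibo.com/ajaxlogin.php?framelogin=1"
                "&callback=parent.sinaSSOController.feedBackUrlCallBack"),
        "returntype": "meta",
    }
    return urlencode(params).encode("ascii")


body = build_post_data("dXNlcg==", "ab12", 1362041092, "IRYP4N", "1330428213")
fields = parse_qs(body.decode("ascii"), keep_blank_values=True)
print(fields["pwencode"], fields["rsakv"])
```

The resulting bytes can be passed as the data argument of urllib.request.urlopen or of an opener's open method, which makes the request a POST.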
Assemble the parameters and send the POST request. To check whether the login succeeded, look at the location.replace call in the content returned by the POST: location.replace("http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack&retcode=101&reason=%B5%C7%C2%BC%C3%FB%BB%F2%C3%DC%C2%EB%B4%ED%CE%F3");
If retcode is 101, the login failed. The result after a successful login is similar, but the retcode value is 0.
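The check can be sketched as follows, assuming the response body contains a single location.replace call with a quoted URL as in the example above. The helper name parse_login_result is hypothetical.

```python
import re
from urllib.parse import parse_qs, urlparse


def parse_login_result(body):
    """Return (url, retcode) taken from the location.replace call."""
    match = re.search(r"location\.replace\(\s*[\"'](.*?)[\"']\s*\)", body)
    if match is None:
        raise ValueError("no location.replace call found")
    url = match.group(1)
    query = parse_qs(urlparse(url).query)
    # retcode is returned as a string; None if the parameter is absent.
    retcode = query.get("retcode", [None])[0]
    return url, retcode


failed = ('location.replace("http://weibo.com/ajaxlogin.php?framelogin=1'
          '&callback=parent.sinaSSOController.feedBackUrlCallBack'
          '&retcode=101&reason=%B5%C7%C2%BC");')
url, retcode = parse_login_result(failed)
print(retcode == "0")
```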
3. After a successful login, the URL inside location.replace in the response body is the one to use next. Send a GET request to that URL and save the cookies from this request; these are the login cookies we need.
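One way to keep the cookies is an opener backed by a cookie jar from the Python 3 standard library, as in this sketch. The function names are hypothetical, and fetch_login_cookies performs a real network request, so it is defined here but not called.

```python
import http.cookiejar
import urllib.request


def make_opener():
    """Create an opener that stores every cookie it receives in a jar."""
    jar = http.cookiejar.CookieJar()
    processor = urllib.request.HTTPCookieProcessor(jar)
    return urllib.request.build_opener(processor), jar


def fetch_login_cookies(opener, jar, redirect_url):
    """GET the URL taken from location.replace and return the cookies."""
    opener.open(redirect_url).read()  # network call; cookies land in jar
    return {cookie.name: cookie.value for cookie in jar}


opener, jar = make_opener()
print(len(jar))
```

Sending the earlier prelogin GET and login POST through the same opener means any cookies set along the way end up in the same jar, and later crawler requests made with that opener carry them automatically.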