For every operation in the table below, the Host header of the HTTP request must be appwk.baidu.com. All requests are POST requests. The POST data has the following format:
request={"bdi_bear": "UMTS", "bduss": ""}
bduss is the token returned by the server at login. Most operations do not require login, however, so bduss can be left empty.
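As a rough sketch of the request format just described, the Python below builds (but does not send) such a POST. It assumes that the body is a form-encoded field named `request` carrying the JSON object, and that the query parameters are lowercase; the helper name `build_request` is made up for illustration.

```python
import json
import urllib.parse
import urllib.request

API_URL = "http://wenku.n.shifen.com/"  # address the operations are sent to
API_HOST = "appwk.baidu.com"            # Host header the server expects


def build_request(query, bduss=""):
    """Build (without sending) a POST request for one operation.

    Assumption: the body is a form field called `request` holding the JSON.
    """
    payload = json.dumps({"bdi_bear": "UMTS", "bduss": bduss})
    body = urllib.parse.urlencode({"request": payload}).encode("ascii")
    url = API_URL + "?" + urllib.parse.urlencode(query)
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Host", API_HOST)
    return req


# Example: the download-ranking query, with an empty bduss (not logged in).
req = build_request({"rt": "dl", "type": 1, "pn": 0, "rn": 10})
print(req.get_method(), req.full_url)
print(req.get_header("Host"))
```

Nothing is sent over the network here; passing `req` to `urllib.request.urlopen` would perform the call.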
Operation | URL | Note | Returned data
Download ranking | http://wenku.n.shifen.com/?rt=dl&type=1&pn=0&rn=10 | type=0: recommended; type=1: download ranking; type=2: fastest growing | {"result": {"tn": 60, "count": 10}, "content": [{"doc_id": "5314d779168884868762d60b", "title": "\u600e\u6837\u624d\u80fd\u6000\u5b55", "size": "44032", "download_count": "3012", "value_count": "1262", "value_average": "8", "ext_name": ".doc"}, ...
Category list | http://wenku.n.shifen.com/?rt=cl&pn=0&rn=0 | | {"result": {"tn": 15, "count": 10}, "content": [{"cname": "\u5c0f\u8bf4", "cid": "79", "ishot": "1"}, {"cname": "\u60c5\u611f", "cid": "133", "ishot": "0"}, {"cname": "\u52b1\u5fd7\/\u54f2\u7406", "cid": "134", "ishot": "1"}, {"cname": "\u5065\u5eb7", "cid": "128", "ishot": "0"}, {"cname": "\u6563\u6587\u968f\u7b14", "cid": "93", "ishot": "0"}, {"cname": "\u6c42\u804c\/\u804c\u573a", "cid": "127", "ishot": "1"}, {"cname": "\u5e7d\u9ed8", "cid": "140", "ishot": "0"}, {"cname": "\u79d1\u666e", "cid": "135", "ishot": "0"}, {"cname": "\u7f8e\u5bb9\/\u5851\u8eab", "cid": "130", "ishot": "1"}, {"cname": "\u8bd7\u8bcd", "cid": "141", "ishot": "0"}]}
Books in a category | http://wenku.n.shifen.com/?rt=cv&cid=79&od=0&pn=0&rn=10 | | {"result": {"tn": 59966, "count": 15}, "content": [{"doc_id": "9b6ff4fc04a1b0717fd5dd83", "title": "\u7ecf\u5178\u7b11\u8bdd\u80fd\u5fcd\u52305\u4e2a...", "value_count": "1404", "value_average": "8", "download_count": "768", "size": "34816", "ext_name": ".doc"}, ...
Book details | http://wenku.n.shifen.com/?rt=dp&doc_id=abc6e50016fc700abb68fcfa | | {"content": {"tag_str": "", "summary": "\u6c5f\u5927\u5927\u7684\u597d\u4e66", "price": "0", "cid": "538", "ext_name": ".txt"}}
Online reading | http://wenku.n.shifen.com/?rt=dc&doc_id=abc6e50016fc700abb68fcfa&pn=0&rn=5&pw=1000&dt=1 | Returns plain text. Both DOC and PPS files are converted to TXT, so all images and formatting are lost. | {"txt_size": 3778, "content": ["\u6c5f\u5357\u8bf4\u4ed6\u7684\u300a\u7f25...
Download book | http://wenku.n.shifen.com/?rt=dc&doc_id=abc6e50016fc700abb68fcfa&pn=0&rn=0&pw=1000&dt=1 | bduss must not be empty. The dt parameter selects the file type: 0 is the original format, 1 is TXT. | Binary stream of the book
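The listing responses are JSON in which Chinese text appears as `\u` escapes and numeric fields arrive as strings. The sketch below shows one way to decode such a response in Python. The sample is the first entry of the download-ranking response from the table, with the truncated array closed by hand so that it parses. The lowercase key names and the reading of `tn` as a total count are assumptions, and `parse_listing` is a made-up helper name.

```python
import json

# First entry of the download-ranking response; the array is closed by hand
# because the original sample is truncated. Lowercase keys are an assumption.
sample = (
    '{"result": {"tn": 60, "count": 10}, "content": ['
    '{"doc_id": "5314d779168884868762d60b", '
    '"title": "\\u600e\\u6837\\u624d\\u80fd\\u6000\\u5b55", '
    '"size": "44032", "download_count": "3012", '
    '"value_count": "1262", "value_average": "8", "ext_name": ".doc"}]}'
)


def parse_listing(text):
    """Return (tn, docs). json.loads turns the \\u escapes into real text."""
    data = json.loads(text)
    docs = [
        {
            "doc_id": d["doc_id"],
            "title": d["title"],
            "size": int(d["size"]),                # sizes arrive as strings
            "downloads": int(d["download_count"]),
            "ext": d["ext_name"],
        }
        for d in data["content"]
    ]
    # `tn` is presumably the total number of matching documents (a guess).
    return data["result"]["tn"], docs


tn, docs = parse_listing(sample)
print(tn, docs[0]["doc_id"], docs[0]["title"], docs[0]["size"])
```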
Downloading a book requires login, because downloads cost credits.
The login authentication is handled by a separate server.
The authentication API address is http://220.181.112.194/passport/?login and the Host header of the request is wappass.baidu.com. This request must also be a POST. The POST data is:
tpl=wkc&cip=127.0.0.1&login_username=xxx&login_loginpass=OOO&phoneid=000000000000000&login_verifycode=&login_bdverify=&login_bdstoken=
In this process the username and password are sent in plain text. (The web page login, by contrast, is encrypted with TLS.)
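For illustration, the login POST data could be assembled as follows. This is only a sketch: the field names are taken from the data string above with lowercase spelling assumed, the function name `build_login_body` is invented, and nothing is sent.

```python
import urllib.parse


def build_login_body(username, password):
    """Form-encode the login fields in the order shown in the text."""
    fields = [
        ("tpl", "wkc"),
        ("cip", "127.0.0.1"),
        ("login_username", username),
        ("login_loginpass", password),   # travels in plain text
        ("phoneid", "000000000000000"),
        ("login_verifycode", ""),        # empty on the first attempt
        ("login_bdverify", ""),
        ("login_bdstoken", ""),
    ]
    return urllib.parse.urlencode(fields)


body = build_login_body("xxx", "OOO")
print(body)
```

Using `urlencode` rather than string concatenation means that a password containing `&` or `=` is escaped instead of corrupting the body.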
A successful login returns:
<?xml version="1.0" encoding="UTF-8"?><login_succ><param key="ssid" value="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"/><bduss>...</bduss></login_succ>
The bduss here can be used to download books. I don't know what the ssid is for; judging by its length it may be a GUID. It may be bound to the user, so I have masked it with x's. That said, if Baidu Wenku's platform could really be driven this simply, that would be underestimating Baidu. Let's leave it at that for now.
A failed login returns:
<login_fail><reason id="4"><![CDATA[incorrect login password, please log in again]]></reason><bdneedverify/><bdvcodestring/><bdbdstoken/><bdtime/></login_fail>
id=2 means the user does not exist; id=6 means the verification code is incorrect.
Sometimes the user is required to enter a verification code (the exact logic is controlled by the server, and this can happen even when the password is correct):
<?xml version="1.0" encoding="UTF-8"?><login_fail><reason id="204"><![CDATA[enter the verification code]]></reason><bdneedverify>1</bdneedverify><bdvcodestring>...</bdvcodestring><bdbdstoken>2c5aa8bdf8b3016474d2be6a97969645</bdbdstoken><bdtime>1307756216</bdtime></login_fail>
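The three kinds of login response can be told apart by the root element and the reason id. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; it assumes lowercase tag names, and both the `bdvcodestring` value `ABCD` and the `bduss` value `TOKEN` are placeholders, since the real values are not reproduced in this post. The function name `parse_login_response` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Samples modelled on the responses in the text. "ABCD" and "TOKEN" are
# placeholders; lowercase tag names are an assumption.
VERIFY_SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<login_fail><reason id="204"><![CDATA[enter the verification code]]></reason>'
    b'<bdneedverify>1</bdneedverify><bdvcodestring>ABCD</bdvcodestring>'
    b'<bdbdstoken>2c5aa8bdf8b3016474d2be6a97969645</bdbdstoken>'
    b'<bdtime>1307756216</bdtime></login_fail>'
)
SUCCESS_SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<login_succ><param key="ssid" value="x"/><bduss>TOKEN</bduss></login_succ>'
)


def parse_login_response(xml_bytes):
    """Return a dict describing a login_succ or login_fail document."""
    root = ET.fromstring(xml_bytes)
    if root.tag == "login_succ":
        return {"ok": True, "bduss": root.findtext("bduss")}
    reason = root.find("reason")
    return {
        "ok": False,
        "reason_id": int(reason.get("id")),
        "message": (reason.text or "").strip(),   # CDATA content
        "need_verify": root.findtext("bdneedverify") == "1",
        "vcodestring": root.findtext("bdvcodestring") or "",
        "bdstoken": root.findtext("bdbdstoken") or "",
        "bdtime": root.findtext("bdtime") or "",
    }


challenge = parse_login_response(VERIFY_SAMPLE)
success = parse_login_response(SUCCESS_SAMPLE)
print(challenge["reason_id"], challenge["need_verify"], success["ok"])
```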
bdvcodestring is hex-encoded data. Decoded, it looks like this: 0010196108554508passport130775621604010103 + guid
The 1307756216 part (marked in red in the original post) should be the timestamp; it matches bdtime. The 04010103 that follows never changes and is the same in every request. After it comes a GUID. This GUID is used for verification, so you cannot make one up yourself, nor can you reuse the GUID from another bdvcodestring. The leading 0010196108554508 can be swapped for the one from another bdvcodestring, but the generated image is then a brand new one.
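The layout described above can be checked with a short Python sketch that hex-decodes a bdvcodestring and splits it into its parts. Since no real bdvcodestring is reproduced here, the sample is built from the decoded example plus a made-up GUID; the assumption that the GUID is plain ASCII text, the field names, and the function name `decode_vcodestring` are all illustrative.

```python
import re

# Layout from the text: 16 digits, the word "passport", a 10-digit
# timestamp, the constant 04010103, then a GUID.
LAYOUT = re.compile(
    r"^(?P<prefix>\d{16})passport(?P<timestamp>\d{10})"
    r"(?P<constant>04010103)(?P<guid>.+)$"
)


def decode_vcodestring(hex_text):
    """Hex-decode a bdvcodestring and split it into named parts."""
    plain = bytes.fromhex(hex_text).decode("ascii")
    match = LAYOUT.match(plain)
    if match is None:
        raise ValueError("unexpected bdvcodestring layout")
    return match.groupdict()


# The GUID below is invented; a real one comes from the server.
fake_guid = "0123456789abcdef0123456789abcdef"
plain = "0010196108554508passport130775621604010103" + fake_guid
sample = plain.encode("ascii").hex()

parts = decode_vcodestring(sample)
print(parts["prefix"], parts["timestamp"], parts["constant"])
```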
This response does not contain the image itself. The image has to be fetched separately, with a POST request to:
http://passport.n.shifen.com/cgi-bin/genimage?...
The Host header is passport.baidu.com.
The server returns a 100*40 image containing a four-digit number, for example 1234. The image is not unique: there may be many different images showing the number 1234. If the images are generated dynamically, there are 100*40*2^32 possible images, so there is no point in collecting images and building a lookup table to crack them.
After that, you can send the login request to the login server again:
tpl=wkc&cip=127.0.0.1&login_username=xxx&login_loginpass=OOO&phoneid=000000000000000&login_verifycode=&login_bdverify=&login_bdstoken=
login_verifycode is the verification code, login_bdverify is the long string (presumably the bdvcodestring), and login_bdstoken is the value from bdbdstoken.
The URL also gets an extra parameter:
http://220.181.112.194/passport/?login=vc
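Putting the pieces together, the second login attempt might be built as in the sketch below. It rests on several assumptions: that `login_bdverify` carries the bdvcodestring from the challenge, that the field names and the `login=vc` parameter are lowercase, and that the values shown (`1234`, `ABCD`) are placeholders. The function name `build_verified_login` is made up, and no request is sent.

```python
import urllib.parse

# URL with the extra parameter; lowercase spelling is an assumption.
VERIFY_LOGIN_URL = "http://220.181.112.194/passport/?login=vc"
LOGIN_HOST = "wappass.baidu.com"


def build_verified_login(username, password, code, vcodestring, bdstoken):
    """Return (url, headers, body) for a login that answers a challenge."""
    body = urllib.parse.urlencode({
        "tpl": "wkc",
        "cip": "127.0.0.1",
        "login_username": username,
        "login_loginpass": password,
        "phoneid": "000000000000000",
        "login_verifycode": code,          # digits read from the image
        "login_bdverify": vcodestring,     # assumed: the bdvcodestring
        "login_bdstoken": bdstoken,        # value of bdbdstoken
    })
    return VERIFY_LOGIN_URL, {"Host": LOGIN_HOST}, body


url, headers, body = build_verified_login(
    "xxx", "OOO", "1234", "ABCD", "2c5aa8bdf8b3016474d2be6a97969645"
)
print(url)
print(body)
```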
The purpose of the verification image is to block automated, non-human access. This technique is now widely used.
At present, online reading on the mobile platform cannot match the web version; only TXT and EPUB can be viewed.