Communication Protocol Analysis of the Mobile Version of Baidu Library (Baidu Wenku)

Source: Internet
Author: User

For every operation in the table below, the Host header of the HTTP request must be appwk.baidu.com. All requests use POST. The POST data has the following format:

request={"bdi_bear":"UMTS","bduss":""}

bduss is the token the server sends back at login. Most operations do not require login, however, so bduss can be left empty.
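
The request format described above can be sketched in Python with the standard library only. This is a minimal sketch, not the client's actual code: the helper name build_request, the form field name "request", and the decision to URL-encode the JSON are assumptions. The request is only built here, never sent.

```python
import json
import urllib.parse
import urllib.request

API_HOST = "appwk.baidu.com"  # required Host header, per the text


def build_request(url: str, bduss: str = "") -> urllib.request.Request:
    """Build (but do not send) a POST request for one operation."""
    payload = {"bdi_bear": "UMTS", "bduss": bduss}
    # Assumption: the JSON travels as a form field named "request".
    body = urllib.parse.urlencode({"request": json.dumps(payload)}).encode("ascii")
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Host", API_HOST)
    return req


req = build_request("http://wenku.n.shifen.com/?rt=dl&type=1&pn=0&rn=10")
print(req.get_method(), req.full_url)
print(req.get_header("Host"))
```

Passing a non-empty bduss to the same helper would cover the operations that need a logged-in user.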

Each operation is listed below with its URL, notes, and returned data.

Operation: Download ranking
URL: http://wenku.n.shifen.com/?rt=dl&type=1&pn=0&rn=10
Note: type=0: recommended; type=1: download ranking; type=2: fastest growing
Returned data:
{"result":{"tn":60,"count":10},"content":
[{"doc_id":"5314d779168884868762d60b",
"title":"\u600e\u6837\u624d\u80fd\u6000\u5b55",
"size":"44032","download_count":"3012",
"value_count":"1262","value_average":"8","ext_name":".doc"},
......

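A response of this shape can be decoded with Python's json module, which also turns the \uXXXX escapes into readable text. The sketch below assumes lowercase key names on the wire; the sample array is closed by hand because the listing is truncated after the first entry, and the helper name parse_doc_list is hypothetical.

```python
import json

# Sample modeled on the listing; closed by hand after the first entry.
sample = (
    '{"result":{"tn":60,"count":10},"content":['
    '{"doc_id":"5314d779168884868762d60b",'
    '"title":"\\u600e\\u6837\\u624d\\u80fd\\u6000\\u5b55",'
    '"size":"44032","download_count":"3012",'
    '"value_count":"1262","value_average":"8","ext_name":".doc"}]}'
)


def parse_doc_list(text: str) -> list[dict]:
    """Return the documents with numeric fields converted from strings."""
    data = json.loads(text)
    docs = []
    for item in data["content"]:
        docs.append({
            "doc_id": item["doc_id"],
            "title": item["title"],  # escapes are already decoded by json
            "size": int(item["size"]),
            "downloads": int(item["download_count"]),
            "ext": item["ext_name"],
        })
    return docs


for doc in parse_doc_list(sample):
    print(doc["title"], doc["size"], doc["downloads"], doc["ext"])
```
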
Operation: Category view
URL: http://wenku.n.shifen.com/?rt=cl&pn=0&rn=0
Returned data:
{"result":{"tn":15,"count":10},"content":
[{"cname":"\u5c0f\u8bf4","cid":"79","ishot":"1"},
{"cname":"\u60c5\u611f","cid":"133","ishot":"0"},
{"cname":"\u52b1\u5fd7\/\u54f2\u7406","cid":"134","ishot":"1"},
{"cname":"\u5065\u5eb7","cid":"128","ishot":"0"},
{"cname":"\u6563\u6587\u968f\u7b14","cid":"93","ishot":"0"},
{"cname":"\u6c42\u804c\/\u804c\u573a","cid":"127","ishot":"1"},
{"cname":"\u5e7d\u9ed8","cid":"140","ishot":"0"},
{"cname":"\u79d1\u666e","cid":"135","ishot":"0"},
{"cname":"\u7f8e\u5bb9\/\u5851\u8eab","cid":"130","ishot":"1"},
{"cname":"\u8bd7\u8bcd","cid":"141","ishot":"0"}]}
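
As an illustration, the category list can be reduced to a mapping from category id to name. The sketch assumes lowercase key names on the wire and assumes that ishot equal to "1" marks a featured ("hot") category, which the source does not state. The sample uses only the first three entries, and hot_categories is a hypothetical helper name.

```python
import json

# First three entries of the category listing.
sample = (
    '{"result":{"tn":15,"count":10},"content":['
    '{"cname":"\\u5c0f\\u8bf4","cid":"79","ishot":"1"},'
    '{"cname":"\\u60c5\\u611f","cid":"133","ishot":"0"},'
    '{"cname":"\\u52b1\\u5fd7\\/\\u54f2\\u7406","cid":"134","ishot":"1"}]}'
)


def hot_categories(text: str) -> dict[int, str]:
    """Map cid -> cname for entries whose ishot flag is "1" (assumed meaning)."""
    data = json.loads(text)
    return {int(c["cid"]): c["cname"] for c in data["content"] if c["ishot"] == "1"}


for cid, name in hot_categories(sample).items():
    print(cid, name)
```

The cid values returned here are what the classified-books operation takes as its cid parameter.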

Operation: Classified books
URL: http://wenku.n.shifen.com/?rt=cv&cid=79&od=0&pn=0&rn=10
Returned data:
{"result":{"tn":59966,"count":15},"content":
[{"doc_id":"9b6ff4fc04a1b0717fd5dd83",
"title":"\u7ecf\u5178\u7b11\u8bdd\u80fd\u5fcd\u52305\u4e2a\...",
"value_count":"1404","value_average":"8",
"download_count":"768","size":"34816","ext_name":".doc"},
......

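Since every operation differs only in its query string, the URLs can be generated by one small helper. This is a sketch under assumptions: build_url and category_pages are hypothetical names, and pn is treated as a zero-based page index with rn as the page size, which the source does not confirm.

```python
from urllib.parse import urlencode

BASE_URL = "http://wenku.n.shifen.com/"


def build_url(rt: str, **params) -> str:
    """Build an operation URL; rt selects the operation (dl, cl, cv, dp, dc)."""
    query = {"rt": rt}
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


def category_pages(cid: int, pages: int, per_page: int = 10) -> list[str]:
    """URLs for the first `pages` pages of one category (pn assumed to be a page index)."""
    return [build_url("cv", cid=cid, od=0, pn=pn, rn=per_page) for pn in range(pages)]


for url in category_pages(79, 2):
    print(url)
```
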
Operation: Book details
URL: http://wenku.n.shifen.com/?rt=dp&doc_id=abc6e50016fc700abb68fcfa
Returned data:
{"content":{"tag_str":"",
"summary":"\u6c5f\u5927\u5927\u7684\u597d\u4e66",
"price":"0","cid":"538","ext_name":".txt"}}

Operation: Online reading
URL: http://wenku.n.shifen.com/?rt=dc&doc_id=abc6e50016fc700abb68fcfa&pn=0&rn=5&pw=1000&dt=1
Note: Returns plain text. Both DOC and PPS documents are converted to TXT, so all images and formatting are lost.
Returned data:
{"txt_size":3778,"content":["\u6c5f\u5357\u8bf4\u4ed6\u7684\u300a\u7f25[...]\u300b\r\n[...]\u6c99\u6f20\r\n0[...]\u7c89\u4e1d\r\n1\u697c\r\n\u6211\......

Operation: Download books
URL: http://wenku.n.shifen.com/?rt=dc&doc_id=abc6e50016fc700abb68fcfa&pn=0&rn=0&pw=1000&dt=1
Note: bduss must not be empty. The dt parameter indicates the file type: 0 is the original type, 1 is TXT.
Returned data: the binary stream of the book.
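
The two constraints on the download operation (a non-empty bduss and a dt of 0 or 1) can be checked before any request is made. The following is a minimal sketch; build_download and the constant names are hypothetical, and "TOKEN" stands in for a real bduss value.

```python
from urllib.parse import urlencode

BASE_URL = "http://wenku.n.shifen.com/"
DT_ORIGINAL = 0  # original file type
DT_TXT = 1       # converted to TXT


def build_download(doc_id: str, bduss: str, dt: int = DT_TXT):
    """Return (url, payload) for a download; refuses an empty bduss."""
    if not bduss:
        raise ValueError("bduss must not be empty when downloading")
    if dt not in (DT_ORIGINAL, DT_TXT):
        raise ValueError("dt must be 0 (original) or 1 (TXT)")
    query = {"rt": "dc", "doc_id": doc_id, "pn": 0, "rn": 0, "pw": 1000, "dt": dt}
    payload = {"bdi_bear": "UMTS", "bduss": bduss}
    return BASE_URL + "?" + urlencode(query), payload


url, payload = build_download("abc6e50016fc700abb68fcfa", "TOKEN", DT_ORIGINAL)
print(url)
```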

 

Downloading a book requires login, because downloads cost credits.

The login authentication process is completed on another server.

The authentication API address is http://220.181.112.194/passport/? and the Host header of the login HTTP request is wappass.baidu.com. This request also uses POST. The POST data is:

tpl=wkc&cip=127.0.0.1&login_username=xxx&login_loginpass=OOO&phoneid=000000000000000&login_verifycode=&login_bdverify=&login_bdstoken=

In this process the username and password are sent in plain text. (The web version is encrypted with TLS.)
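
The login body is an ordinary URL-encoded form, so it can be produced with urlencode. A minimal sketch follows; build_login_body is a hypothetical helper, and the placeholder credentials are the ones used in the text.

```python
from urllib.parse import urlencode


def build_login_body(username: str, password: str,
                     phoneid: str = "000000000000000",
                     verifycode: str = "", bdverify: str = "",
                     bdstoken: str = "") -> str:
    """Return the form-encoded login body, fields in the order shown in the text."""
    fields = [
        ("tpl", "wkc"),
        ("cip", "127.0.0.1"),
        ("login_username", username),
        ("login_loginpass", password),  # sent in plain text, as noted above
        ("phoneid", phoneid),
        ("login_verifycode", verifycode),
        ("login_bdverify", bdverify),
        ("login_bdstoken", bdstoken),
    ]
    return urlencode(fields)


body = build_login_body("xxx", "OOO")
print(body)
```

Using urlencode rather than string concatenation means that passwords containing characters such as & or = are escaped correctly.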

A successful login returns:

<?xml version="1.0" encoding="UTF-8"?>
<login_succ>
    <param key="ssid" value="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" />
    <bduss>......</bduss>
</login_succ>

The bduss here can then be used to download books. I don't know what the SSID is for; judging by its length it may be a GUID, possibly bound to the user, so I masked it with x's. That said, if Baidu's library platform really were this simple, we would be underestimating Baidu. Let's keep looking.
A failed login returns:

<login_fail>
    <reason id="4"><![CDATA[incorrect logon password, please log on again]]></reason>
    <bdneedverify />
    <bdvcodestring />
    <bdbdstoken />
    <bdtime />
</login_fail>

id=2 indicates that the user does not exist. id=6 indicates that the verification code is incorrect.
Sometimes the server requires the user to enter a verification code. The exact logic is controlled by the server, and this can happen even when the password is correct:

<?xml version="1.0" encoding="UTF-8"?>
<login_fail>
    <reason id="204"><![CDATA[enter the verification code]]></reason>
    <bdneedverify>1</bdneedverify>
    <bdvcodestring>......</bdvcodestring>
    <bdbdstoken>2c5aa8bdf8b3016474d2be6a97969645</bdbdstoken>
    <bdtime>1307756216</bdtime>
</login_fail>

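The three kinds of login response can be told apart with Python's xml.etree.ElementTree. This is a sketch that assumes lowercase tag and attribute names; parse_login_response is a hypothetical helper, and the bdvcodestring value "00" in the sample is a placeholder, not real data.

```python
import xml.etree.ElementTree as ET

# Reason ids named in the text; any other id is reported as "unknown".
REASONS = {
    "2": "user does not exist",
    "4": "incorrect password",
    "6": "incorrect verification code",
    "204": "verification code required",
}


def parse_login_response(xml_text: str) -> dict:
    """Parse a login response into a flat dict (tag names assumed lowercase)."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    if root.tag == "login_succ":
        return {"ok": True, "bduss": root.findtext("bduss", default="")}
    reason = root.find("reason")
    reason_id = reason.get("id") if reason is not None else None
    return {
        "ok": False,
        "reason_id": reason_id,
        "reason": REASONS.get(reason_id, "unknown"),
        "need_verify": root.findtext("bdneedverify", default="") == "1",
        "bdvcodestring": root.findtext("bdvcodestring", default=""),
        "bdbdstoken": root.findtext("bdbdstoken", default=""),
        "bdtime": root.findtext("bdtime", default=""),
    }


challenge = parse_login_response(
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<login_fail><reason id="204"><![CDATA[enter the verification code]]></reason>'
    '<bdneedverify>1</bdneedverify><bdvcodestring>00</bdvcodestring>'
    '<bdbdstoken>2c5aa8bdf8b3016474d2be6a97969645</bdbdstoken>'
    '<bdtime>1307756216</bdtime></login_fail>'
)
print(challenge["reason"], challenge["need_verify"], challenge["bdbdstoken"])
```
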
bdvcodestring is hexadecimal-encoded data. Decoded, it looks like this: 0010196108554508passport130775621604010103 + GUID

The 1307756216 part (shown in red in the original) should be the timestamp; it matches bdtime. The 04010103 that follows never changes and is the same in every request. It is followed by a GUID. The server checks this GUID, so you cannot make one up yourself, and you cannot reuse the GUID from another bdvcodestring. The leading 0010196108554508 can be swapped for the one from another bdvcodestring, but the generated image is then a completely new one.
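
The layout described above can be checked mechanically. The sketch below assumes that "hexadecimal encoded" means each ASCII character is written as two hex digits, and it infers the field widths (16, 10 and 8 digits) from the single sample in the text. The GUID in the usage example is made up, and split_vcodestring is a hypothetical helper.

```python
import re
from datetime import datetime, timezone

# Field widths inferred from the one decoded sample in the text (assumption).
PATTERN = re.compile(
    r"^(?P<prefix>\d{16})passport(?P<ts>\d{10})(?P<const>\d{8})(?P<guid>.*)$"
)


def split_vcodestring(hex_text: str) -> dict:
    """Hex-decode a bdvcodestring and split it into its apparent fields."""
    decoded = bytes.fromhex(hex_text).decode("ascii")
    match = PATTERN.match(decoded)
    if match is None:
        raise ValueError("unexpected bdvcodestring layout")
    parts = match.groupdict()
    parts["time"] = datetime.fromtimestamp(int(parts["ts"]), tz=timezone.utc)
    return parts


# Hypothetical GUID appended to the decoded prefix quoted in the text.
plain = "0010196108554508passport130775621604010103" + "0123456789abcdef0123456789abcdef"
parts = split_vcodestring(plain.encode("ascii").hex())
print(parts["prefix"], parts["ts"], parts["const"], parts["guid"])
print(parts["time"].isoformat())
```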

The response itself contains no image; the image has to be fetched separately. The POST request address is:

http://passport.n.shifen.com/cgi-bin/genimage?......

The host is passport.baidu.com.

The server returns a 100*40 image containing a four-digit number, for example 1234. The image for a given number is not unique: there may be many different images that show 1234. If the images are generated dynamically, there are at least 100*40*2^32 possible images, so capturing images and building a lookup table to crack the code is not practical.

Then you can send the login request to the login server again:

tpl=wkc&cip=127.0.0.1&login_username=xxx&login_loginpass=OOO&phoneid=000000000000000&login_verifycode=&login_bdverify=&login_bdstoken=

login_verifycode is the verification code, login_bdverify is the long string of digits (presumably the bdvcodestring), and login_bdstoken is the value from bdbdstoken.

A parameter is also added to the URL:

http://220.181.112.194/passport/?login=vc
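
The second login attempt can be sketched as follows. This is an illustration only: build_verified_login is a hypothetical helper, and mapping login_bdverify to the bdvcodestring from the failed response is an assumption, since the text only calls it a long number. The values in the usage example are placeholders apart from the bdbdstoken quoted earlier.

```python
from urllib.parse import urlencode

LOGIN_URL_WITH_CODE = "http://220.181.112.194/passport/?login=vc"


def build_verified_login(username: str, password: str, code: str,
                         bdvcodestring: str, bdbdstoken: str):
    """Return (url, body) for a login retry that carries a verification code."""
    body = urlencode({
        "tpl": "wkc",
        "cip": "127.0.0.1",
        "login_username": username,
        "login_loginpass": password,
        "phoneid": "000000000000000",
        "login_verifycode": code,          # the four digits read from the image
        "login_bdverify": bdvcodestring,   # assumption: the bdvcodestring value
        "login_bdstoken": bdbdstoken,      # value of bdbdstoken from the response
    })
    return LOGIN_URL_WITH_CODE, body


url, body = build_verified_login(
    "xxx", "OOO", "1234", "00", "2c5aa8bdf8b3016474d2be6a97969645"
)
print(url)
print(body)
```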

The purpose of the verification image is to block automated, non-human access. The technique is widely used today.

At present, online reading on the mobile platform is not on par with the web version: only TXT and EPUB can be viewed.
