Analyzing the url= parameter of Baidu search result links (link?url=) (full)

Source: Internet
Author: User
Tags php form

A few days ago I wrote an article about obtaining the real URL behind Baidu's redirect. Other people have also studied Baidu's link?url= parameter.

The following result is obtained:

1. The encryption is based on: a random value + the time spent on input + the snapshot address.
2. The whole code should have three parts: 1. the time of the search; 2. the search keywords; 3. a randomly generated unique identification code.
3. The url= value ends with a similar piece of code in any environment or browser.
Of these findings, "a similar piece of code at the end" is the most usable, so we start from there.
I searched for "enba" and found that the result URLs from my first search all contain the same code, namely
http://www.baidu.com/link?url=............ebac5573358cc3c0659257bfcf54763ec1c5ecff3b3fbd1d4c
The code shared by all search results is ebac5573358cc3c0659257bfcf54 (found after N searches).
The 763ec1c5ecff3b3fbd1d4c at the end looks like the encrypted real URL of the search result. (I have verified that this ciphertext corresponds to the real URL.)
I verified it like this:
1. Search Baidu for www.php100.com.
Link of the first result:
http://www.baidu.com/link?url=............ebac5573358cc3c0659257bfcf546427d417fef6656de2404d6843da27
Look at the part after the common code: 6427d417fef6656de2404d6843da27
2. Search Baidu for www.hao123.com.
Link of the first result:
http://www.baidu.com/link?url=............ebac5573358cc3c0659257bfcf546427d415e6ff7a6de0434d6843da
Look at the part after the common code: 6427d415e6ff7a6de0434d6843da
......
After searching for N more websites, we found that whenever the domain name begins with "www.", the ciphertext begins with 6427d385.
"www." is four characters and the ciphertext 6427d385 is eight characters, so two ciphertext characters correspond to one URL character.
So I wrote a PHP form that queries a link and extracts its ciphertext for later study.
The PHP source code:
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<title>Form to query the real link behind Baidu link?url=</title>
</head>
<body>
<?php
/*
getrealurl: get the URL reached after a 301 or 302 redirect, by enenba.com
@param str $url the URL to query
@return str the real URL that the queried URL redirects to
*/
function getrealurl($url) {
    $header = get_headers($url, true);
    // get_headers() returns false on failure; fall back to the input URL
    if ($header === false) {
        return $url;
    }
    // header names keep the server's casing, so normalize them before the lookup
    $header = array_change_key_case($header, CASE_LOWER);
    $is_redirect = strpos($header[0], '301') !== false || strpos($header[0], '302') !== false;
    if ($is_redirect && isset($header['location'])) {
        if (is_array($header['location'])) {
            // several redirects in a row: the last Location is the final target
            return $header['location'][count($header['location']) - 1];
        } else {
            return $header['location'];
        }
    } else {
        return $url;
    }
}
$input = '<form method="get" action=""><input type="text" name="url" id="url" style="width:800px;" /><input type="submit" value="submit" /></form></body></html>';
$url = isset($_GET['url']) ? $_GET['url'] : '';
if (empty($url)) exit($input);
$urlreal = getrealurl($url);
echo 'actual url: ' . htmlspecialchars($urlreal);
// strip the scheme; ltrim() takes a character mask and could also eat leading letters of the host
$urlreal = preg_replace('#^https?://#i', '', $urlreal);
$search = '/ebac5573358cc3c0659257bfcf54([0-9a-f]+)/i';
preg_match($search, $url, $r);
$url_encode = isset($r[1]) ? $r[1] : ''; unset($r);
echo '<br/> ciphertext: ' . $url_encode . '<br/>';
// one URL character per element, two hex digits per element
$urlreal_arr = str_split($urlreal);
$url_encode_arr = str_split($url_encode, 2);
echo '<br/>';
echo $input;
?>
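
The listing above splits the real URL into characters and the ciphertext into two-digit pieces, but stops there. As a minimal sketch of a possible next step, assuming the two arrays line up position by position (which is what the "www." observation suggests), the pieces could be paired like this. pair_cipher() is a hypothetical helper name, not part of the original script.

```php
<?php
// Hypothetical helper (not in the original script): pair each character of
// the real URL with the two-hex-digit code found at the same position.
function pair_cipher(string $urlreal, string $url_encode): array
{
    $chars = str_split($urlreal);        // one URL character per element
    $codes = str_split($url_encode, 2);  // two hex digits per element
    $pairs = [];
    // Stop at the shorter array so a length mismatch does not raise a warning.
    $n = min(count($chars), count($codes));
    for ($i = 0; $i < $n; $i++) {
        $pairs[] = [$chars[$i], $codes[$i]];
    }
    return $pairs;
}

// The "www." prefix and its ciphertext quoted in the article.
foreach (pair_cipher('www.', '6427d385') as $i => [$char, $code]) {
    echo $i . ': ' . $char . ' => ' . $code . PHP_EOL;
}
```

Printing the pairs side by side makes it easy to see which code a given character receives at each position.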

I will continue the study tomorrow. To be continued....
This site states in advance that the articles on cnbeta were not published by me. My analysis is based only on my own ideas; I just want to work through the process. As for whether there are any results, that will become clear in due course.
Looking at the long code in the Baidu result URL from the previous article, we find that the ciphertext contains only digits and the letters a to f, that is, hexadecimal.
Hexadecimal digits run 0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9 -> a -> b -> c -> d -> e -> f
I collected a series of URLs and tabulated the code of the first character.
ebac5573358cc3c0659257bfcf54XX......
The URL characters corresponding to the XX codes are as follows:
SP  33    0   23    @   13    P   03    `   73    p   63
!   32    1   22    A   12    Q   02    a   72    q   62
"   31    2   21    B   11    R   01    b   71    r   61
#   30    3   20    C   10    S   00    c   70    s   60
$   37    4   27    D   17    T   07    d   77    t   67
%   36    5   26    E   16    U   06    e   76    u   66
&   35    6   25    F   15    V   05    f   75    v   65
'   34    7   24    G   14    W   04    g   74    w   64
(   3b    8   2b    H   1b    X   0b    h   7b    x   6b
)   3a    9   2a    I   1a    Y   0a    i   7a    y   6a
*   39    :   29    J   19    Z   09    j   79    z   69
+   38    ;   28    K   18    [   08    k   78    {   68
,   3f    <   2f    L   1f    \   0f    l   7f    |   6f
-   3e    =   2e    M   1e    ]   0e    m   7e    }   6e
.   3d    >   2d    N   1d    ^   0d    n   7d    ~   6d
/   3c    ?   2c    O   1c    _   0c    o   7c    DEL 6c
(SP is the space character, DEL is 0x7f.)
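
Counting codes by hand gets tedious as the number of samples grows. Here is a minimal sketch of how such a table might be tallied, assuming each sample is a pair of the real URL (scheme removed) and the ciphertext that follows the common code. tally_position() is a hypothetical helper, and the one-character samples are illustrative values taken from the first table, not real Baidu links.

```php
<?php
// Hypothetical helper: for one character position, record which two-digit
// codes were seen for each plaintext character across all samples.
function tally_position(array $samples, int $pos): array
{
    $seen = []; // plaintext character => list of distinct codes seen
    foreach ($samples as [$plain, $cipher]) {
        $code = (string) substr($cipher, $pos * 2, 2);
        if ($pos >= strlen($plain) || strlen($code) < 2) {
            continue; // this sample is too short for the requested position
        }
        $char = $plain[$pos];
        if (!in_array($code, $seen[$char] ?? [], true)) {
            $seen[$char][] = $code;
        }
    }
    // Note: PHP stores digit keys such as '0' as integers; cast them back
    // with (string) when printing the table.
    return $seen;
}

$samples = [
    ['www.', '6427d385'], // prefix and ciphertext quoted in the article
    ['p', '63'],          // illustrative entries from the first table
    ['h', '7b'],
];
foreach (tally_position($samples, 0) as $char => $codes) {
    echo $char . ' => ' . implode(', ', $codes) . PHP_EOL;
}
```

If a character ever collects more than one code at the same position, that would suggest the mapping also depends on something other than the position.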

These should be the characters of the ASCII table, but the order has been scrambled. For a single hexadecimal digit, however, the order is:
3 -> 2 -> 1 -> 0 -> 7 -> 6 -> 5 -> 4 -> b -> a -> 9 -> 8 -> f -> e -> d -> c
That is descending order within each group of four digits, and overall the trend is decreasing.
What is puzzling is that _ and ` are adjacent in ASCII, yet their codes are 0c and 73. I can't see the rule there, so let's look at the code of the second character.
ebac5573358cc3c0659257bfcf54XXYY....
The URL characters corresponding to the YY codes are as follows:
SP  70    0   60    @   50    P   40    `   30    p   20
!   71    1   61    A   51    Q   41    a   31    q   21
"   72    2   62    B   52    R   42    b   32    r   22
#   73    3   63    C   53    S   43    c   33    s   23
$   74    4   64    D   54    T   44    d   34    t   24
%   75    5   65    E   55    U   45    e   35    u   25
&   76    6   66    F   56    V   46    f   36    v   26
'   77    7   67    G   57    W   47    g   37    w   27
(   78    8   68    H   58    X   48    h   38    x   28
)   79    9   69    I   59    Y   49    i   39    y   29
*   7a    :   6a    J   5a    Z   4a    j   3a    z   2a
+   7b    ;   6b    K   5b    [   4b    k   3b    {   2b
,   7c    <   6c    L   5c    \   4c    l   3c    |   2c
-   7d    =   6d    M   5d    ]   4d    m   3d    }   2d
.   7e    >   6e    N   5e    ^   4e    n   3e    ~   2e
/   7f    ?   6f    O   5f    _   4f    o   3f    DEL 2f
(SP is the space character, DEL is 0x7f.)

The code of the second character follows the normal ascending hexadecimal order:
0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9 -> a -> b -> c -> d -> e -> f
Overall the trend is decreasing.
Let's look at the third character.
ebac5573358cc3c0659257bfcf54XXYYZZ....
The URL characters corresponding to the ZZ codes are as follows:
SP  84    0   94    @   a4    P   b4    `   c4    p   d4
!   85    1   95    A   a5    Q   b5    a   c5    q   d5
"   86    2   96    B   a6    R   b6    b   c6    r   d6
#   87    3   97    C   a7    S   b7    c   c7    s   d7
$   80    4   90    D   a0    T   b0    d   c0    t   d0
%   81    5   91    E   a1    U   b1    e   c1    u   d1
&   82    6   92    F   a2    V   b2    f   c2    v   d2
'   83    7   93    G   a3    W   b3    g   c3    w   d3
(   8c    8   9c    H   ac    X   bc    h   cc    x   dc
)   8b    9   9b    I   ab    Y   bb    i   cd    y   dd
*   8e    :   9e    J   ae    Z   be    j   ce    z   de
+   8f    ;   9f    K   af    [   bf    k   cf    {   df
,   88    <   98    L   a8    \   b8    l   c8    |   d8
-   89    =   99    M   a9    ]   b9    m   c9    }   d9
.   8a    >   9a    N   aa    ^   ba    n   ca    ~   da
/   8b    ?   9b    O   ab    _   bb    o   cb    DEL db
(SP is the space character, DEL is 0x7f.)

I cannot explain the order above:
4 -> 5 -> 6 -> 7 -> 0 -> 1 -> 2 -> 3 -> c -> b -> e -> f -> 8 -> 9 -> a -> b
Overall the trend is increasing.
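
One way to probe these digit orders is to test whether each is the normal sequence 0..f with a fixed value XORed in. The sketch below does only that, using the low digits as listed for the three characters. It is a check on the listed digits, not a claim about how Baidu builds the parameter, and it says nothing about the high digits. best_xor_key() is a hypothetical helper name.

```php
<?php
// Low hex digits as listed for plaintext low digits 0..f, one string per
// character position (first, second, third).
$orders = [
    'first'  => '32107654ba98fedc',
    'second' => '0123456789abcdef',
    'third'  => '45670123cbef89ab',
];

// Hypothetical helper: find the XOR key (0..15) that reproduces the most
// entries of one listed order, and report the positions that do not fit.
function best_xor_key(string $order): array
{
    $best = ['key' => 0, 'misses' => range(0, 15)];
    for ($key = 0; $key < 16; $key++) {
        $misses = [];
        for ($i = 0; $i < 16; $i++) {
            if (hexdec($order[$i]) !== ($i ^ $key)) {
                $misses[] = $i;
            }
        }
        if (count($misses) < count($best['misses'])) {
            $best = ['key' => $key, 'misses' => $misses];
        }
    }
    return $best;
}

foreach ($orders as $name => $order) {
    $r = best_xor_key($order);
    echo $name . ': key ' . dechex($r['key']) . ', positions that do not fit: '
        . (count($r['misses']) ? implode(',', $r['misses']) : 'none') . PHP_EOL;
}
```

On the digits as listed, the first order fits key 3 and the second fits key 0 with no exceptions. The third fits key 4 except at position 9, the row where the table itself mixes b and d.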
We haven't looked at the later characters yet, but we can roughly see that the hexadecimal digits are scrambled in groups of four. Whether the order is increasing or decreasing can only be determined with a certain amount of data.
Next time I will collect 1000 URLs to determine this.
To be continued....
