This article describes how to obtain the absolute address of a Youku video. For more information, see
Some time ago I studied a number of video sites while working on KnLiveCommentary. Because KnLiveCommentary needs plenty of video sources for testing, I chose Youku, a relatively large video website, as the test target.
In fact, I first looked into resolving absolute addresses in order to study Youku's built-in player and strip out advertisements and the like. Later I "decompiled" the Youku player with ASV6 (ActionScript Viewer 6), and the results were surprisingly good.
Youku videos use encryption plus dynamic acquisition: the video address has to be fetched dynamically from the website, and the returned result still has to be decrypted.
The code is as follows:
$base_url = 'http://v.youku.com/player/getPlayList/VideoIDS/'; // base address for fetching the video information
$_VIDEO_ID = isset($_GET['vid']) ? $_GET['vid'] : ''; // extract the video ID from GET
if ($_VIDEO_ID == '')
    $_VIDEO_ID = 'xmjy0ode1mda0'; // default ID used when testing
$ch = curl_init(); // create a cURL handle
curl_setopt($ch, CURLOPT_URL, $base_url . $_VIDEO_ID); // address of the video information
curl_setopt($ch, CURLOPT_HEADER, 1); // include the response header in the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
curl_setopt($ch, CURLOPT_REFERER, 'http://v.youku.com/v_show/id_' . $_VIDEO_ID); // send a fake Referer
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']); // pass the current browser's User-Agent on to the server
curl_setopt($ch, CURLOPT_NOBODY, 0);
$content = curl_exec($ch); // execute the request
curl_close($ch);
/* parse below */
preg_match('~"seed"\s*:\s*(\d+)\s*,~iUs', $content, $seed);
preg_match('~\{\s*"(flv|mp4)"\s*:\s*"(.*)"\s*\}~iUs', $content, $encoded);
preg_match('~"key1"\s*:\s*"(.*)"\s*,~iUs', $content, $key1);
preg_match('~"key2"\s*:\s*"(.*)"\s*,~iUs', $content, $key2);
// extract the necessary information from the returned JSON string: seed, encoded_url, key1, key2
class decoder {
    var $randomSeed = 0;
    var $cg_str = "";
    function __construct($seed) {
        $this->randomSeed = $seed;
    }
    function ran() {
        $this->randomSeed = (($this->randomSeed * 211) + 30031) % 65536;
        return ($this->randomSeed / 65536); // compute the new seed from the old one and return its proportional position (0 <= value < 1)
    }
    function cg_hun() { // probably means "CG mixing"; in any case this is the name the function had in the ASV decompilation
        $this->cg_str = "";
        $sttext = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\:._-1234567890'; // initial character set (full length)
        $len = strlen($sttext); // get its length
        for ($i = 0; $i < $len; $i++) {
            $cuch = (int)($this->ran() * strlen($sttext)); // index of the character at the seed's proportional position
            $this->cg_str .= $sttext[$cuch]; // read that character
            $sttext = str_replace($sttext[$cuch], '', $sttext); // delete the character just read (stops when none are left)
        }
    }
    function decode($string) {
        $output = "";
        $this->cg_hun();
        $expl = explode('*', $string); // split a string such as 1*23*34*45*56*
        for ($i = 0; $i < count($expl); $i++) {
            if ($expl[$i] === '') continue; // skip the empty piece left by the trailing '*'
            $output .= $this->cg_str[(int)$expl[$i]]; // look up each number in the string shuffled by cg_hun; this completes the decryption
        }
        return $output; // OK
    }
    function decode_key($key1, $key2) {
        $key = hexdec($key1); // both keys are hex
        $key = ($key ^ -1520786011) & 0xFFFFFFFF; // originally an 8-digit hex constant; I converted it with a calculator because that is convenient for PHP bit operations. The mask keeps the result within 32 bits on 64-bit PHP.
        return $key2 . dechex($key); // assemble the final key
    }
} // decryption class, for convenience
$new = new decoder((int)$seed[1]);
$fileid = $new->decode($encoded[2]);
$key = $new->decode_key($key1[1], $key2[1]);
// feed in the data and compute
// assemble the address
$s7 = substr($fileid, 10, strlen($fileid));
$s5 = substr($fileid, 0, 8);
$s6 = substr($fileid, 6, 2);
// split
$s4 = '00'; // note that this is a hex value: 00 is the first video segment, 01 the second, 0f the sixteenth, and so on
$sid = time() . mt_rand(10, 99) . '123' . mt_rand(30, 80) . '00'; // generate a random SID to send to the server (it is not checked)
$d_ADDR = '' . $sid . '_' . $s4 . '/st/' . $encoded[1] . '/fileid/' . $fileid;
echo $d_ADDR . '?K=' . $key;
// finally, output the address
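The decryption works because the seed sent by the server fully determines the shuffled alphabet: the client replays the same generator, rebuilds the same permutation, and then reads the encrypted address as a list of indices into it. The following is a minimal self-contained sketch of that idea. The function names are hypothetical, the seed and the sample string are made up, and encode_with_seed() is not part of Youku's scheme as described here; it exists only so the round trip can be checked. The alphabet and the constants 211, 30031 and 65536 are taken from the listing above.

```php
<?php
// Rebuild the shuffled alphabet for a given seed (hypothetical helper name).
function shuffled_alphabet(int $seed): string {
    $source = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890';
    $result = '';
    while ($source !== '') {
        $seed = ($seed * 211 + 30031) % 65536;             // advance the generator
        $index = (int) ($seed / 65536 * strlen($source));  // scale into the remaining characters
        $result .= $source[$index];                        // take the chosen character
        $source = substr($source, 0, $index) . substr($source, $index + 1); // and remove it
    }
    return $result;
}

// Turn "12*3*45*" back into text by indexing the shuffled alphabet.
function decode_with_seed(int $seed, string $encoded): string {
    $alphabet = shuffled_alphabet($seed);
    $out = '';
    foreach (explode('*', $encoded) as $piece) {
        if ($piece === '') continue; // trailing '*' leaves an empty piece
        $out .= $alphabet[(int) $piece];
    }
    return $out;
}

// Inverse operation, added only for illustration (an assumption, not Youku code).
function encode_with_seed(int $seed, string $plain): string {
    $alphabet = shuffled_alphabet($seed);
    $out = '';
    foreach (str_split($plain) as $ch) {
        $out .= strpos($alphabet, $ch) . '*';
    }
    return $out;
}

$seed = 1234;                 // arbitrary seed for the demonstration
$plain = '0300010100ABCDEF';  // made-up text using only alphabet characters
$encoded = encode_with_seed($seed, $plain);
var_dump(decode_with_seed($seed, $encoded) === $plain); // bool(true)
```

Because each character is removed after it is picked, the result is always a permutation of all 68 characters, so every index in the encrypted string maps to exactly one character.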
Please note that Youku has since changed its algorithm and format, so the method above no longer handles every case. Here is the current process:
1. Access [ID]
2. Fetch the file and parse "streamfileids": {"flv": "encrypted address", "mp4": "encrypted address", "and so on": "encrypted address"}
3. Decrypt the encrypted address with the method above
4. Obtain the number of segments and the K of each segment
{"mp4": [{"no": "0", "size": "18367795", "seconds": "421", "k": "281ff2875db680bb261c02ce"}, {"no": "1", "size": "19045091", "seconds": "421", "k": "45398cdd4aa44968261c02ce"},
......
5. Assemble the address as before, but for each segment use the new K obtained in step 4
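Steps 4 and 5 can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name segment_parts() is hypothetical, the sample JSON is the two-segment excerpt from step 4 closed off so that it parses, and the two-digit lowercase hex form of the segment number follows the '00', '01', '0f' convention used in the earlier code. The base URL is not shown because it is not given in this article.

```php
<?php
// Two-segment sample from step 4, closed off so that it is valid JSON.
$json = '{"mp4": [{"no": "0", "size": "18367795", "seconds": "421", "k": "281ff2875db680bb261c02ce"},'
      . ' {"no": "1", "size": "19045091", "seconds": "421", "k": "45398cdd4aa44968261c02ce"}]}';

// Return, for each segment of the given format, its two-digit hex number and its K.
// (Hypothetical helper; not part of the original code.)
function segment_parts(string $json, string $format): array {
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data[$format])) {
        return []; // unknown format or malformed JSON
    }
    $parts = [];
    foreach ($data[$format] as $segment) {
        $parts[] = [
            'segment' => sprintf('%02x', (int) $segment['no']), // 0 -> "00", 15 -> "0f"
            'k'       => $segment['k'],                         // per-segment K for the final address
        ];
    }
    return $parts;
}

foreach (segment_parts($json, 'mp4') as $part) {
    // Each segment address would end with its own K, as described in step 5.
    echo $part['segment'] . ' ?K=' . $part['k'] . "\n";
}
```

Using json_decode() here instead of regular expressions avoids having to match the nested array of segments by hand.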