Key Technologies of Baidu Library

Source: Internet
Author: User

Baidu Library relies on two key technologies. The first is converting documents to Flash. This step can be done with FlashPaper; the key is splitting the FlashPaper output into small Flash files that download quickly over the Internet, and this step is now basically solved. The second is developing the reader. If you are familiar with the Flash (SWF) structure, you can of course take the Baidu Library reader as a reference and adapt it. Once these two problems are solved, you can build a complete "Baidu Library" of your own. So far the first step is working for me, but I am not very familiar with the Flash structure, so I would like to cooperate with people who are and build something great together.

Recently I keep getting asked about the technology behind Baidu Library, or about FlexPaper. Haha. In fact, it is not as difficult as you might think. Baidu Library itself shows that neither Adobe's FlashPaper nor the open-source flash2print is all that intelligent. Why "intelligent"? Look at the SWF files that Baidu Library leaves in the IE cache. Each cached file is several SWF files spliced together, and each segment that starts with CWS is one SWF file. At the very beginning of the file there is a field like {"totalpage": "210", "frompage": "1", "topage": "6"}. You can treat it as JSON or as an AS3 object; of course, to use it you first have to deserialize it.
totalpage is the total number of pages in the document.
frompage is the first page contained in this file.
topage is the last page contained in this file.
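As a rough illustration of reading that header, here is a minimal sketch in Python (not ActionScript; it is only meant to show the logic). It assumes the header is a flat JSON object that starts at byte 0 and ends at the first closing brace. The function name read_page_header and the sample bytes are made up for this example.

```python
import json


def read_page_header(data: bytes) -> tuple:
    """Return (header fields as a dict, offset where the SWF data starts)."""
    # Assumption: the header is one flat JSON object at offset 0,
    # so it ends at the first closing brace.
    end = data.index(b"}") + 1
    header = json.loads(data[:end].decode("utf-8"))
    return header, end


# Hypothetical sample: the header from the article followed by fake SWF bytes.
sample = b'{"totalpage": "210", "frompage": "1", "topage": "6"}' + b"CWS\x09fake-swf-body"
header, offset = read_page_header(sample)

# The values are JSON strings, so convert them before doing arithmetic.
print(int(header["totalpage"]), int(header["frompage"]), int(header["topage"]))
print(sample[offset:offset + 3])  # the first SWF signature follows the header
```

The returned offset is where the first spliced SWF would begin under that assumption.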
Of course, we are not in a position to develop conversion software like that ourselves, so when using the open-source flash2print we have to think about the user experience: when the server generates the SWF files, it can splice several pages into one response.
For a request for page 1, the server sends pages 1-6. Why 1-6? Because of Internet bandwidth in China. Haha. For a request for page 2, it sends pages 2-7.
The requests therefore map to page ranges as follows:

Requested page    Pages returned

1                 1-6
2                 2-7
..
..
10                6-15
The rule is very simple. Because users may jump to an arbitrary page, the server returns the four pages before and the five pages after the requested page. This gives users a smoother experience and reduces the load on the server.
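The "four before, five after" rule can be sketched in a few lines. This is a minimal illustration in Python rather than ActionScript. The function name page_window and the clamping to the document bounds are my assumptions; the article does not say how the edges are handled. It reproduces the rows for page 1 and page 10 of the table, but gives 1-7 for page 2 where the table lists 2-7, so the real server presumably treats the first few pages a little differently.

```python
def page_window(page: int, total_pages: int, before: int = 4, after: int = 5) -> tuple:
    """Return (first, last) page numbers to send for a requested page."""
    # Assumption: the window is simply clamped to the range 1..total_pages.
    first = max(1, page - before)
    last = min(total_pages, page + after)
    return first, last


for requested in (1, 2, 10, 210):
    print(requested, page_window(requested, 210))
# 1 (1, 6)
# 2 (1, 7)
# 10 (6, 15)
# 210 (206, 210)
```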
With the back end sending data like this, the front end can request it, and you can write a player. Haha. The first problem to consider is loading. The conventional Loader from the flash.display package cannot be used to fetch the file directly, because what comes back is not a normal SWF: a custom object is prepended to the binary data, and it has to be parsed first.
You may ask: if Loader cannot be used, then what? URLStream? URLLoader? Whichever you like. URLLoader is the simplest. Haha. Loading the binary data yourself essentially means refining the data first and then handing it to the AVM in a form it recognizes, so that it can be played. I remember being rather silly when I first did this. Haha. I wrote it down as soon as I half understood it, and as a result it is very inefficient. The source code is below. If you don't mind an inefficient program, you can write it this way too.
package com.loader
{
    /**
     * Class for loading SWF files.
     * Provides methods that cut the spliced SWF stream into ByteArray segments
     * and distribute them to ActionScript display objects.
     *
     * DisplayLoader, ItemLoadEvent, SwfInfoComplelateEvent and PdfVO are the
     * author's own classes; their source is not included in this article.
     */

    import com.display.DisplayLoader;
    import com.events.ItemLoadEvent;
    import com.events.SwfInfoComplelateEvent;
    import com.vow.vo; // path exactly as it appears in the source; PdfVO presumably comes from here
    import flash.events.Event;
    import flash.events.EventDispatcher;
    import flash.net.URLRequest;
    import flash.net.URLStream;
    import flash.utils.ByteArray;

    public class SwfLoader extends EventDispatcher
    {
        public var vo:PdfVO;

        private var path:URLRequest;
        private var stream:URLStream;
        private var boxes:Array = [];
        private var swfInfoBox:Array = [];

        public function SwfLoader()
        {
            addEventListener(SwfInfoComplelateEvent.LOAD_COMPLELATE_DATES, onCompleteHandler);
        }

        /**
         * Splitting completed.
         * Builds the VO object from the raw bytes, then wraps every
         * SWF segment in a DisplayLoader.
         */
        private function onCompleteHandler(e:SwfInfoComplelateEvent):void
        {
            removeEventListener(SwfInfoComplelateEvent.LOAD_COMPLELATE_DATES, onCompleteHandler);
            vo = new PdfVO(e.obj.parent);
            var positions:Array = e.obj.position;
            for (var i:int = 0; i < positions.length; i++)
            {
                // Cut segment i out of the full byte stream.
                var bytes:ByteArray = getStreamPostionDatas(e.obj.parent, positions[i].start, positions[i].end);
                var ds:DisplayLoader = new DisplayLoader(bytes);
                boxes.push(ds);
            }
            if (boxes[0] != null) dispatchEvent(new ItemLoadEvent(ItemLoadEvent.ITEM_OK, boxes));
        }

        /**
         * Loading method: starts downloading the spliced SWF file.
         */
        public function load(source:String):void
        {
            if (stream) close();
            stream = new URLStream();
            path = new URLRequest(source);
            // Register the listener before calling load() so the event cannot be missed.
            stream.addEventListener(Event.COMPLETE, completeHandler);
            try
            {
                stream.load(path);
            }
            catch (e:Error)
            {
                trace("error");
            }
        }

        /**
         * Obtain the DisplayLoader object at the index position.
         */
        public function getDisplayyItemAt(index:int):DisplayLoader
        {
            if (boxes[index] == undefined) return null;
            return boxes[index];
        }

        /**
         * Stream loading completed: read all bytes and locate the SWF segments.
         */
        private function completeHandler(e:Event):void
        {
            stream.removeEventListener(Event.COMPLETE, completeHandler);
            var arr:ByteArray = new ByteArray();
            stream.readBytes(arr, 0, stream.bytesAvailable);
            var arrSpeed:ByteArray = arr; // same ByteArray instance, not a copy
            swfInfoBox = splitSwfInfo(arr);
            var obj:Object = {
                stream: stream,
                parent: arr,
                alsoByte: arrSpeed,
                position: swfInfoBox
            };
            // Dispatch with the same event type the constructor listens for.
            var evt:SwfInfoComplelateEvent = new SwfInfoComplelateEvent(SwfInfoComplelateEvent.LOAD_COMPLELATE_DATES, obj);
            dispatchEvent(evt);
        }

        /**
         * Obtain a piece of data: copies the bytes from index to end
         * (inclusive), one byte at a time, into a new ByteArray.
         */
        public function getStreamPostionDatas(source:ByteArray, index:Number, end:Number):ByteArray
        {
            var temp:ByteArray = new ByteArray();
            var j:Number = index;
            while (j <= end)
            {
                temp[j - index] = source[j];
                j++;
            }
            return temp;
        }

        /**
         * Obtain the start and end positions of every SWF segment.
         */
        public function getSwfPostionCollection(parent:ByteArray):Array
        {
            return splitSwfInfo(parent);
        }

        /**
         * Truncate SWF information (data).
         * Returns an array of {start, end} objects, one per SWF segment.
         */
        public function splitSwfInfo(parent:ByteArray):Array
        {
            var leng:Number = parent.length;
            var startIndex:Array = [];
            // Record every offset where the "CWS" signature (0x43 0x57 0x53) begins.
            for (var i:Number = 0; i + 2 < leng; i++)
            {
                if (parent[i] == 0x43 && parent[i + 1] == 0x57 && parent[i + 2] == 0x53)
                {
                    startIndex.push(i);
                }
            }
            var swfBox:Array = [];
            var len:Number = startIndex.length;
            for (var j:int = 0; j < len; j++)
            {
                // A segment ends one byte before the next signature;
                // the last segment ends with the data.
                var end:Number = (j == len - 1) ? leng - 1 : startIndex[j + 1] - 1;
                swfBox[j] = {start: startIndex[j], end: end};
            }
            return swfBox;
        }

        /**
         * Close method
         */
        public function close():void
        {
            stream.close();
            stream = null;
        }
    }
}

In fact, there was no need to write all that. I spent half a day just on loading the data. I probably wrote it in the Flash IDE at the time, which is why it is so messy, and looking at it now it is not much help. Using URLLoader is different and takes much less effort, but be sure to set the data format: loader.dataFormat = URLLoaderDataFormat.BINARY; Then, once the data has been read, find the CWS signatures in the file and split the SWF data at them. Finally, put the split pieces into a collection or one large array, and use a Loader to display each of them.
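The splitting step just described might look roughly like this. It is a minimal sketch in Python rather than ActionScript, and the function name split_swf_segments is made up for the example. Like the class above, it assumes the bytes CWS only ever occur at the start of a segment; compressed data could contain the same three bytes by chance, so treat this as an illustration, not a robust parser.

```python
def split_swf_segments(data: bytes, signature: bytes = b"CWS") -> list:
    """Split a spliced file into segments, each starting at a signature."""
    # Collect the offset of every occurrence of the signature.
    starts = []
    pos = data.find(signature)
    while pos != -1:
        starts.append(pos)
        pos = data.find(signature, pos + 1)

    # Each segment runs up to the next signature; the last one runs to the end.
    ends = starts[1:] + [len(data)]
    return [data[start:end] for start, end in zip(starts, ends)]


# Hypothetical sample: a header followed by two fake "SWF" segments.
sample = b'{"totalpage": "2", "frompage": "1", "topage": "2"}' + b"CWS\x09aaaa" + b"CWS\x09bb"
segments = split_swf_segments(sample)
print([len(segment) for segment in segments])  # [8, 6]
```

Anything before the first signature (the page header) is skipped. In the Flash player, each resulting segment would then be handed to a Loader for display.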

 

This article is from the CSDN blog. If you reproduce it, please credit the source: http://blog.csdn.net/wkyb608/archive/2010/11/23/6028968.aspx

There is also a friend's answer on the Internet:

Looking at how the Baidu reader loads SWF documents, I found that Baidu prints every page of a document to its own SWF, then combines every 5 or 10 pages into one file and adds some required parameters in a header. The reader uses the CWS or FWS signature and the version number to cut each page's stream out of the data, then loads and displays it. The reader itself is actually very simple: it generates blank pages according to the page count, and each blank page is just a container into which a page SWF is loaded and shown.
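Checking the version byte as well as the signature cuts down on false matches inside the data. Here is a minimal sketch of that idea in Python rather than ActionScript. The name find_swf_starts and the accepted version range of 1 to 50 are my assumptions, chosen only to make the example concrete.

```python
import re

# "CWS" (compressed) or "FWS" (uncompressed), followed by the version byte.
# Assumption: a plausible SWF version lies between 1 and 50 (0x01-0x32).
SWF_START = re.compile(rb"[CF]WS[\x01-\x32]")


def find_swf_starts(data: bytes) -> list:
    """Return the offsets where a page SWF appears to begin."""
    return [match.start() for match in SWF_START.finditer(data)]


# Hypothetical sample: a 9-byte header, a real-looking CWS segment,
# a stray "CWS" followed by an implausible version byte, and an FWS segment.
sample = b'{"a":"1"}' + b"CWS\x09xx" + b"CWSzzz" + b"FWS\x0ayy"
print(find_swf_starts(sample))  # [9, 21]
```

The stray CWS at offset 15 is ignored because the byte after it is not a plausible version number.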

 

For document conversion, the application is written in C#. Office, web and TXT files are converted via PDF to SWF. Pages that consist mostly of images are extracted and printed as images for the web page, which solves the problem of garbled characters in PDF conversion. The text is also extracted from each document and output for search engines, to improve search engine indexing.

The web front end is done. For the client reader I use Baidu's; I did not develop my own. Loading works in groups of 5 or 10 pages, as requested by the client.
