Author: Ma Jian
Email: stronghorse@tom.com
Home: http://stronghorse.yeah.net
Version: 1.01
Initial Release Date: 2005.08.29
Last Updated: 2005.09.28
Directory
1. Preface
2. General steps for getting files from an ebook or webpage
3. Obtain the linked css file from an ebook or webpage
4. Obtain the linked js file from an ebook or webpage
5. Obtain Flash files from an ebook or webpage
6. Obtain background music files from an ebook or webpage
7. Obtain image files from ebooks
8. Go to the frame page
9. Other problems
1. Preface
Recently I have heard people complain that decompiling an e-book with miniKillEBook yields only the HTML pages, not the js, css, Flash, background music, and other files inside the e-book. In fact, as long as you know a little JavaScript, it is not very difficult to get these files out of an e-book, even with nothing more than the publicly released CtrlN.
A few notes up front:
1. All the methods below are based on JavaScript, so they may feel roundabout, and the results cannot compare with IECracker and KillEBook, which call IE's non-public interfaces directly. This is a deliberate balance, though: for eager learners who want to study other people's book-making techniques through decompilation, using JavaScript is itself good practice, while the method is hard to use for batch decompilation, so e-book authors need not worry too much. I am not in the business of killing the hen to get the eggs. Haha...
2. For ease of use, the JavaScript provided below is written to be foolproof: all URL analysis is left to the code, and you only need to press Ctrl+C and Ctrl+V. Automation has its limits, however. The code works for most web pages, but if you run into an unusual page you still have to analyze the HTML by hand. If the page turns out to be encrypted, you can use CtrlN's "HTML segment" function to decode the encrypted HTML, and then use the search function to quickly locate links in the source.
3. E-books based on the IE kernel are currently almost all implemented through custom protocol plug-ins, and these plug-ins support JavaScript to varying degrees. So do not be surprised if the code fails on some e-books.
4. Besides decompiling e-books, the code is also useful when browsing ordinary web pages, for example to capture Flash files from a page.
5. All code was tested on Windows XP SP2. I have not tried other environments, but I estimate that IE must be version 6.0 or later.
6. All the code is my own original work and is free for personal use. For reposting on websites or for commercial use, please obtain my authorization first.
2. General steps for getting files from an ebook or webpage
The steps for getting various files from an e-book or an ordinary webpage are basically the same, but the JavaScript code to be entered is different:
1. Start CtrlN. This guards against the e-book or web page disabling shortcut keys. If you are sure the shortcut keys are not disabled, skip this step and simply press Ctrl+N in step 3.
2. Open the e-book or, in IE, the web page that references the css, js, Flash, or other files you want to capture. Note that this must be the real page, not a frame. How to recognize a frame and how to enter the page inside it are covered later.
3. Set CtrlN's "shortcut behavior" to "pop up a new window", click the page to be captured, and press Ctrl+N. A new IE window pops up whose address bar and content match the page to be captured.
4. In the pop-up IE window, copy and paste the corresponding JavaScript code (given below) into the address bar and press Enter.
In IE 6, the first time JavaScript code is run, a yellow bar may pop up below the address bar saying that content has been blocked. Click the yellow bar, choose to allow the blocked content, and then repeat steps 3 and 4 to see the result.
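Most of the one-line snippets in the following sections share the same shape: walk a collection on the page, skip entries that have no URL, and write out a page of links whose base href is the current page URL. The sketch below shows that shape in readable form. The function name buildLinkPage and the sample URLs are hypothetical, and the function takes plain arrays rather than IE objects so that it can be tried outside an e-book.

```javascript
// Hypothetical helper: builds the same kind of link page that the
// address-bar snippets produce with document.write().
// pageUrl plays the role of document.URL; urls is a plain array of strings.
function buildLinkPage(pageUrl, urls) {
  var str = '<HTML><HEAD><base href="' + pageUrl + '"></HEAD><BODY><br>\n';
  for (var i = 0; i < urls.length; i++) {
    var u = urls[i];
    if (!u) continue; // skip entries without a URL (e.g. inline scripts)
    str += '<a href="' + u + '">' + u + '</a><br>\n';
  }
  str += '</BODY></HTML>';
  return str;
}

// Example with made-up values; the empty entry is skipped.
var page = buildLinkPage('http://ebook/index.htm', ['a.js', '', 'b.js']);
```

The base element matters because the links are often relative: with it, the browser resolves them against the original page rather than against the generated one.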
3. Obtain the linked css file from an ebook or webpage
JavaScript itself provides an interface for obtaining the content of an external css file. Therefore, in step 4 of the preceding general steps, copy and paste the following content to the IE Address Bar, and press enter to view the content:
javascript:str='';c=document.styleSheets;for(i=0;i<c.length;i++){o=c[i];if(o.href=='')continue;str+='==========';str+=o.href;str+='<br><xmp>\n';str+=o.cssText;str+='</xmp><br>\n';};document.write(str);
If the current HTML page does not link to any external css file, step 4 produces no response or an empty page. Check the page's HTML source to confirm. If the page links to several css files, the contents of all of them are displayed. Because IE reformats the text, the layout may differ from the original css code, but the effect is exactly the same. If only the name of a css file is displayed with no content below it, the e-book did not pack that css file.
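For readers who want to adapt the stylesheet dump, here is the same logic spread over several lines. It is only a sketch: the function name dumpStyleSheets is hypothetical, and `sheets` is a plain array standing in for IE's document.styleSheets, with each entry assumed to expose href and cssText.

```javascript
// Readable form of the stylesheet dump. `sheets` stands in for
// document.styleSheets; each entry is assumed to have href and cssText.
function dumpStyleSheets(sheets) {
  var str = '';
  for (var i = 0; i < sheets.length; i++) {
    var o = sheets[i];
    if (!o.href) continue;            // embedded <style> blocks have no href
    str += '==========' + o.href;     // separator followed by the file name
    str += '<br><xmp>\n' + o.cssText + '</xmp><br>\n'; // <xmp> shows text verbatim
  }
  return str;
}

// Made-up data: one embedded style block and one linked file.
var dump = dumpStyleSheets([
  { href: '', cssText: 'p { margin: 0 }' },
  { href: 'main.css', cssText: 'BODY { color: red }' }
]);
```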
For some e-books, you can also try the following code:
javascript:str='<HTML><HEAD><base href="';str+=document.URL;str+='"></HEAD><BODY><br>\n';c=document.styleSheets;for(i=0;i<c.length;i++){o=c[i];if(o.href=='')continue;str+='<a href="';str+=o.href;str+='">';str+=o.href;str+='</a><br>\n';};str+='</BODY></HTML>';document.write(str);
This code checks the page automatically. If the page links to css files, a download link for each css file is displayed; otherwise you get an empty page or no response. Right-click a link and choose "Save Target As" to save the file to the hard disk. If the file cannot be saved, copy the URL of the css file into the address bar and press Enter. However, if the registry key HKEY_CLASSES_ROOT\CSSfile\shell contains subkeys such as open or edit, the css code will be opened directly in the program specified by that subkey instead of prompting you to save it. This method is far less widely applicable than the one shown first, and it does not work on every e-book, but wherever it does work you get the original css code.
4. Obtain the linked js file from an ebook or webpage
JavaScript does not provide an interface for reading the contents of a js file, so the registry has to be modified first: run regedit, locate HKEY_CLASSES_ROOT\.js, and add two string values under it:
Content Type = application/x-javascript
PerceivedType = text
If you feel uneasy about the change, compare it with the default settings of HKEY_CLASSES_ROOT\.css, which differ only in the Content Type value. The registry change is a one-time job; you do not need to make it again.
Once the registry has been changed, capturing js files with CtrlN follows the general steps above. In step 4, copy and paste the following into the address bar and press Enter:
javascript:str='<HTML><HEAD><base href="';str+=document.URL;str+='"></HEAD><BODY><br>\n';c=document.scripts;for(i=0;i<c.length;i++){o=c[i];if(o.src=='')continue;str+='<a href="';str+=o.src;str+='">';str+=o.src;str+='</a><br>\n';};str+='</BODY></HTML>';document.write(str);
This code checks the page automatically. If the page links to js files, a download link for each js file is displayed; otherwise you get an empty page or no response. Right-click a link and choose "Save Target As", or click the link directly, to save the file to the hard disk. If the file cannot be saved, first check that the registry has been set up as described above. If it still cannot be saved, copy the URL of the js file into the address bar and press Enter.
Oddly, for e-books made with eBook Workshop (page URLs start with ada99:), entering the URL of a js file in the address bar and pressing Enter displays the result of executing the js file rather than its content. You have to choose "View -> Source" to get the original js code. Then again, such books are usually decompiled with unEbookWorkshop anyway, aren't they?
5. Obtain Flash files from an ebook or webpage
You can directly download an embedded object such as Flash. Therefore, in step 4 above, copy and paste the following content to the address bar, and press the Enter key to view the content:
javascript:str='<HTML><HEAD><base href="';str+=document.URL;str+='"></HEAD><BODY><br>\n';c=document.all;for(i=0;i<c.length;i++){o=c[i];if(o.tagName!="OBJECT")continue;sih=o.innerHTML;nd=document.createDocumentFragment();nd.appendChild(document.createElement('<body></body>'));nd.firstChild.outerHTML=sih;no=document.createElement(nd.firstChild.outerHTML);document.body.appendChild(no);str+='<a href="';str+=no.src;str+='">';str+=no.src;str+='</a><br>\n';};str+='</BODY></HTML>';document.write(str);
This code checks the page automatically. If Flash objects are embedded in the page, a download link for each swf file is displayed; otherwise you get an empty page or no response. Right-click a link and choose "Save Target As" to save the file to the hard disk. Clicking the link directly plays the Flash instead.
I often see people ask, "How do I capture that beautiful Flash on a web page?" The answer really is this simple, and I often use this code to capture Flash from the Internet myself. Note, however, that if the page is embedded in a frame, you have to get past the frame and into the real page before using the code. Also, the code uses the createDocumentFragment method, so it runs only on IE 6.
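When the IE-specific trick is not available, the swf URL can usually be read straight out of the markup inside the OBJECT element. The sketch below is a different, regex-based technique, not the method used by the one-liner above. The function name findFlashUrls is hypothetical, and it assumes the common Flash embedding markup: a param element named "movie" and/or an embed element with a src attribute, both with quoted attribute values.

```javascript
// Hypothetical helper: pulls candidate swf URLs out of the markup found
// inside an <object> element (for example its innerHTML).
function findFlashUrls(objectInnerHtml) {
  var urls = [];
  var m;
  // <param name="movie" value="..."> (assumes name comes before value)
  var paramRe = /<param[^>]*name\s*=\s*["']?movie["']?[^>]*value\s*=\s*["']([^"']+)["']/gi;
  while ((m = paramRe.exec(objectInnerHtml)) !== null) urls.push(m[1]);
  // <embed src="...">
  var embedRe = /<embed[^>]*\ssrc\s*=\s*["']([^"']+)["']/gi;
  while ((m = embedRe.exec(objectInnerHtml)) !== null) urls.push(m[1]);
  return urls;
}

// Made-up markup in the usual double-tag style.
var sample = '<param name="movie" value="intro.swf">' +
             '<embed src="intro.swf" width="400" height="300">';
var found = findFlashUrls(sample);
```

The same file is typically named twice in such markup, so the result may contain duplicates.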
There is also an extreme kind of e-book: the whole book consists of a single web page with one embedded Flash file acting as the table of contents. Clicking a link inside that Flash jumps to other Flash files, so the real content is hidden in a pile of Flash files. For such a book, you first have to find the file names of the linked Flash files by analyzing the directory Flash (for example with flasm), then convert each file name to an absolute URL and generate a download link for it. For example, if the absolute URL of a Flash file is known to be http://ebook/pic.swf, the following code can be used to download that file on its own:
javascript:document.write('<a href="http://ebook/pic.swf">right-click and save as</a>');
With this method the URL has to be changed every time, so it is of course more trouble than the method above, but sometimes it is the only way. By the way, flasm really is a fine tool: some Flash files contain script that lets them play only from the network, not from the local hard disk, and flasm can be used to remove that restriction as well.
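Converting a file name to an absolute URL follows the ordinary rules for relative URLs: the name is resolved against the URL of the page that refers to it. A minimal sketch of that rule is given below, using the standard URL class. IE 6 has no such class, so this is meant for checking a result by hand in a modern environment, not for pasting into the e-book window. The function names and the page URL http://ebook/index.htm are assumptions for illustration.

```javascript
// Resolve a file name against the URL of the page that references it.
// Uses the standard URL class (not available in IE 6).
function toAbsoluteUrl(fileName, pageUrl) {
  return new URL(fileName, pageUrl).href;
}

// Wrap an absolute URL in the same kind of link the one-liner writes.
function downloadLink(absUrl) {
  return '<a href="' + absUrl + '">right-click and save as</a>';
}

var abs = toAbsoluteUrl('pic.swf', 'http://ebook/index.htm');
var link = downloadLink(abs);
```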
6. obtain background music files from an ebook or webpage
The background music file can be downloaded directly like Flash. Therefore, in step 4 of the preceding general steps, copy and paste the following content to the address bar, and press the Enter key to view the content:
javascript:str='<HTML><HEAD><base href="';str+=document.URL;str+='"></HEAD><BODY><br>\n';c=document.all;for(i=0;i<c.length;i++){o=c[i];if(o.tagName!="BGSOUND")continue;str+='<a href="';str+=o.src;str+='">';str+=o.src;str+='</a><br>\n';};str+='</BODY></HTML>';document.write(str);
This code checks the page automatically. If background music is embedded in the page, a download link for it is displayed; otherwise you get an empty page or no response. Right-click the link and choose "Save Target As" to save the file to the hard disk.
Note that background music is usually hidden in a frame (otherwise the music would be interrupted whenever the page changes). If the pop-up page contains frames rather than being the page that actually holds the background music link, nothing will be captured. In that case you need to enter the page inside the frame, following the steps described later.
In addition, to avoid monotony some e-books pack several midi files and pick one at random as the background music each time they run. For such an e-book, the code above captures only the current background music. To capture all of it, you have to analyze the page source, piece together the URLs of all the music files, type JavaScript code that generates download links into the address bar, press Enter, and download the files one at a time. Note that you must right-click the download link and choose "Save Target As"; clicking the link directly will not work. If you cannot analyze the source of a web page, all you can do is run the book several times and capture several times. As the saying goes, "fall behind and you get beaten". Example: if the absolute URL of a music file is known to be http://ebook/1.mid, the code for generating the download link is:
javascript:document.write('<a href="http://ebook/1.mid">right-click and save as</a>');
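When an e-book packs several music files, typing one such line per file gets tedious. As a rough sketch, the address-bar line can be generated for a whole list of URLs at once. The function name makeLinkSnippet and the second URL, http://ebook/2.mid, are made up for illustration.

```javascript
// Hypothetical helper: builds one address-bar snippet that writes a
// download link for every URL in the list.
function makeLinkSnippet(urls) {
  var body = '';
  for (var i = 0; i < urls.length; i++) {
    body += '<a href="' + urls[i] + '">' + urls[i] + '</a><br>';
  }
  // Single quotes delimit the string inside the snippet, so the URLs
  // themselves must not contain single quotes.
  return "javascript:document.write('" + body + "');";
}

var snippet = makeLinkSnippet(['http://ebook/1.mid', 'http://ebook/2.mid']);
```

The returned string can then be pasted into the address bar in step 4 like any of the other snippets.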
7. obtain image files from ebooks
In step 4 of the preceding general steps, copy and paste the following content to the address bar and press enter to view the content:
javascript:z=1;strUrl='';str='';function getImg(){if(strUrl!=''){str+=(z++);str+='.<img src="';str+=strUrl;str+='"><br>\n';};};c=document.images;for(i=0;i<c.length;i++){o=c[i];strUrl=o.src;getImg();};strUrl=document.body.background;getImg();c=document.all;for(i=0;i<c.length;i++){o=c[i];if(o.tagName=='TABLE'||o.tagName=='TD'){strUrl=o.background;getImg();};if(o.tagName=='AREA'){strUrl=o.href;getImg();};};document.write(str);
The above code displays all the images that can be found on the webpage in sequence. If you think there are too many images that are inconvenient to see, or some small images cannot be seen clearly, you can also use the following code to display the image link and click the link to display the image:
javascript:z=1;strUrl='';str='';function getImg(){if(strUrl!=''){str+=(z++);str+='.<a href="';str+=strUrl;str+='">';str+=strUrl;str+='</a><br>\n';};};c=document.images;for(i=0;i<c.length;i++){o=c[i];strUrl=o.src;getImg();};strUrl=document.body.background;getImg();c=document.all;for(i=0;i<c.length;i++){o=c[i];if(o.tagName=='TABLE'||o.tagName=='TD'){strUrl=o.background;getImg();};if(o.tagName=='AREA'){strUrl=o.href;getImg();};};document.write(str);
Because of limitations in the code, images referenced from within js or css code on the page cannot be captured with the two snippets above. In that case you can only analyze the HTML by hand, type the absolute URL of the image into the address bar, and press Enter to display it.
In addition, because of the limited capabilities of the javascript protocol plug-ins, the two snippets above do not remove duplicate links. So if you use them to capture images from a BBS page, do not be surprised to see a pile of identical images.
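On an ordinary web page, where the plug-in limitations do not apply, duplicates can be filtered out before the links are written. The following is only a sketch of one way to do that. The function name uniqueUrls is hypothetical, and whether such code runs inside a particular e-book's protocol plug-in is not something this example can promise.

```javascript
// Hypothetical helper: returns the URLs in their original order with
// empty entries and repeats removed.
function uniqueUrls(urls) {
  var seen = {};  // URLs already emitted, used as a simple set
  var out = [];
  for (var i = 0; i < urls.length; i++) {
    var u = urls[i];
    if (!u) continue;                                           // no URL at all
    if (Object.prototype.hasOwnProperty.call(seen, u)) continue; // repeat
    seen[u] = true;
    out.push(u);
  }
  return out;
}

// Made-up list such as a forum page full of repeated avatars might give.
var unique = uniqueUrls(['a.gif', 'b.gif', 'a.gif', '', 'b.gif']);
```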
Once an image or link is displayed, only a few pictures in an e-book can be saved in their original format; the vast majority can only be saved as decoded bitmaps, and the saved file has to be renamed by hand. If the file named in the URL is not a bmp but a jpg, gif, or png, you also need software such as ACDSee to convert the saved bmp to the required format. JPG is easy enough, but the transparent color of GIF and PNG files has to be restored by hand, and animated GIFs are simply out of the question.
Note: if you only rename the file without converting its format, IE will not display the image.
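Renaming fails because the format is determined by the file's contents, not its extension. As a small illustration, the real format of a saved file can be guessed from its first few bytes, using the well-known signatures of the four formats mentioned here. The function name sniffImageFormat is hypothetical, and the byte arrays are hand-written samples rather than real files.

```javascript
// Guess the real image format from the leading bytes of a file.
// `bytes` is an array (or Buffer) of byte values.
function sniffImageFormat(bytes) {
  function startsWith(sig) {
    if (bytes.length < sig.length) return false;
    for (var i = 0; i < sig.length; i++) {
      if (bytes[i] !== sig[i]) return false;
    }
    return true;
  }
  if (startsWith([0x42, 0x4D])) return 'bmp';             // "BM"
  if (startsWith([0xFF, 0xD8, 0xFF])) return 'jpg';       // JPEG start marker
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return 'gif'; // "GIF8"
  if (startsWith([0x89, 0x50, 0x4E, 0x47])) return 'png'; // "\x89PNG"
  return 'unknown';
}

// A file saved as a decoded bitmap starts with "BM", whatever its name says.
var kind = sniffImageFormat([0x42, 0x4D, 0x36, 0x00]);
```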
When surfing the Internet, you can also use the code above to capture the background image of the page you are browsing. In that case choose "Save Picture As", and the image is generally saved in its original format.
As the description above shows, without IE's internal interfaces, capturing images is probably the most troublesome job and gives the worst results. I remember being so annoyed that I gritted my teeth and started analyzing the IE kernel. Fortunately the effort paid off in the end. I wonder whether some hot-blooded reader will, after reading all this, set off down the same path? Heh heh...
8. Go to the frame page
All the JavaScript code given above works on the current page. In other words, you can capture music files, Flash files, and so on only if the current page itself contains them. If the current page is a frameset, you must enter the page inside the frame before capturing.
To check whether the current page contains frames, follow the general steps above. In step 4, copy and paste the following into the address bar and press Enter:
javascript:str='<HTML><HEAD><base href="';str+=document.URL;str+='"></HEAD><BODY><br>\n';c=document.all;for(i=0;i<c.length;i++){o=c[i];if(o.tagName!='IFRAME'&&o.tagName!='FRAME')continue;str+=o.tagName;str+=': <a href="';str+=o.src;str+='">';if(o.name=='')str+=o.src;else str+=o.name;str+='</a><br>\n';};str+='</BODY></HTML>';document.write(str);
This code checks the page automatically. If frames (including iframes) are embedded, links to the pages inside them are displayed; otherwise you get an empty page or no response. Click a link to go to the corresponding page.
To keep it general, the code above checks only the first level of frames. That is no problem for iframes, since few sane people nest iframes. Ordinary frames, however, are quite likely to be nested, and with the code above you have to click down layer by layer to see the nested frames, which is a bit tedious. The solution: if the code above shows only FRAME entries and no IFRAME, you can use the following code to display all nested frames:
javascript:str='';function getFrame(c,i,j){for(i=0;i<c.length;i++){o=c[i];for(k=0;k<j;k++)str+='&nbsp;&nbsp;';str+='<a href="';str+=o.location;str+='">';if(o.name!='')str+=o.name;else str+=o.location;str+='</a><br>\n';no=o.document.frames;if(no.length>0)getFrame(no,0,j+1)};};getFrame(document.frames,0,0);document.write(str);
This code checks the page automatically and displays the links and nesting relationships of all pages in the nested frames. If there are no frames, you get an empty page or no response. Click a link to go to the corresponding page. Note: if the page contains an iframe, this code may fail, so first use the previous snippet to check for iframes.
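The nested-frame listing is a plain depth-first walk: print each frame indented by its depth, then recurse into its child frames. A readable sketch of that walk follows. The function name listFrames is hypothetical, and the frames are plain objects with name, location, and frames fields standing in for IE's window objects.

```javascript
// `frames` stands in for document.frames: an array of objects with
// name, location, and an optional nested frames array (plain data here).
function listFrames(frames, depth) {
  var str = '';
  for (var i = 0; i < frames.length; i++) {
    var f = frames[i];
    for (var k = 0; k < depth; k++) str += '&nbsp;&nbsp;'; // indent by depth
    var label = f.name ? f.name : f.location;              // prefer the frame name
    str += '<a href="' + f.location + '">' + label + '</a><br>\n';
    if (f.frames && f.frames.length > 0) {
      str += listFrames(f.frames, depth + 1);              // recurse into children
    }
  }
  return str;
}

// Made-up frameset: a menu frame and a main frame holding one nested frame.
var tree = listFrames([
  { name: 'menu', location: 'menu.htm', frames: [] },
  { name: '', location: 'main.htm', frames: [
    { name: 'body', location: 'body.htm' }
  ] }
], 0);
```

Declaring the loop variables with var keeps them local to each call, which is what lets the outer loop carry on correctly after a recursive call returns.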
If a page uses js code to check for its frame, the page will not run outside the frame. To get the content of such a page, use the code above to display the link to the framed page, right-click the link, and choose "Save Target As". Save the HTML code and then edit it by hand or with the help of a tool such as TextForever.
Versions of miniKillEBook before v1.04 had an oversight: I only thought about handling FRAME and forgot IFRAME. As a result, some people began to say that embedding a web page in an IFRAME would prevent decompilation by miniKillEBook. Now that v1.04 is out, that claim is nothing more than a legend.
9. Other problems
Q: What if the IE window that pops up after pressing Ctrl+N has no menu or address bar?
A: Starting with CtrlN ver 1.03, an "Advanced Interface" that can be opened and closed is provided. Its "Script command" function pushes the JavaScript code straight to the IE window for execution, with no need to type it into the address bar.
Q: What if the e-book disables Windows copy and paste once it starts, and the js code above is too long to type in character by character?
A: Use the same "Advanced Interface", available since CtrlN ver 1.03. Its "Script command" function pushes JavaScript code or a URL straight to the IE window for execution, with no need to type it into the address bar. If you have written your own JavaScript code, you can add it to the CtrlN.spt file (a plain text file) so that it can be selected directly in the Script command selection window.
Appendix: Version update record
Version 1.01:
The document was revised according to the new features of CtrlN ver 1.03.