Three major problems solved by a webpage download program (complete download of images, JS, and CSS, plus webpage modification)

Source: Internet
Author: User

I took the time to write this program. Its function is to save an entire webpage completely, including images, JS scripts, and CSS styles, and to modify the webpage source code for "localization" so that the saved page works offline. I was so out of touch that I did not even know browsers come with this function, so I wrote it myself. Although the program is not large, it involves three major problems (explained in detail below).

I did not originally plan to release the source code. Since browsers already have this function, I am releasing it for everyone to learn from. The result this program produces is the same as the browser's: I have compared the JS, CSS, and images that each one retrieves. You can use it as a reference for building a full-site downloader. Of course, you cannot download the site's database.

Instructions for use:

1. Fill in the webpage address and click "go". This activates the "One-click Download" button. Loading the webpage takes time; if you click "One-click Download" before the page has finished loading, a prompt is shown. Try to use the program when the network speed is good.

2. After the download is complete, a folder named after the webpage title is created under the program directory. All required files are stored there. The HTM file named after the webpage title is the saved page. Double-click it when there is no network and it looks the same as it does online.
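The article does not show how the page source is "localized". As a rough illustration only, written in Python rather than the program's VB, the idea might look like the sketch below. The helper names local_name and localize, the flat folder layout, and the plain string replacement are all assumptions made for this example, not the author's implementation.

```python
import posixpath
from urllib.parse import urlsplit


def local_name(url: str) -> str:
    """Return a local file name for a resource URL (hypothetical helper)."""
    # Keep only the last path segment; the host and query string are dropped.
    name = posixpath.basename(urlsplit(url).path)
    return name or "index"


def localize(html: str, resource_paths: list[str]) -> str:
    """Point each referenced resource at a file saved next to the page."""
    # Assumption: every downloaded file sits in the same folder as the page.
    # Longer paths are replaced first so a short path cannot clobber a longer one.
    for path in sorted(resource_paths, key=len, reverse=True):
        html = html.replace(path, local_name(path))
    return html


page = '<script src="http://example.com/js/app.js?v=2"></script><img src="img/logo.png">'
print(localize(page, ["http://example.com/js/app.js?v=2", "img/logo.png"]))
```

A real downloader would also have to handle two resources that share the same file name; this sketch ignores that case.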

Program:

Problems solved:

1. Determining that the webpage has finished loading. When the target webpage is known in advance, you can check for a known marker on the page. In this program, however, nothing about the page is known, so a different method is needed. Here the onload event of the HTML window object is used together with the WebBrowser control to determine when loading is complete. Of the methods I have tried, this is the safest, most accurate, and most reliable, and it does not depend on which page is being loaded.

Determining when a webpage has finished loading has always been a headache, at least in VB. Most of the methods suggested online do not work well, and the better ones work only some of the time. I am posting the code here to settle the problem.

'Reference "Microsoft HTML Object Library"
Dim WithEvents page As HTMLWindow2 'Must be declared at module level (global)

Private Sub WebBrowser1_NavigateComplete2(ByVal pDisp As Object, URL As Variant)
    Set page = Me.WebBrowser1.Document.parentWindow
End Sub

Private Sub page_onload()
    Debug.Print "loaded"
End Sub

2. Getting the JS and CSS files. I have seen many people posting online to ask for a program that gets all the JS and CSS on a webpage. In fact, this is not difficult. A quick search on Baidu shows that JavaScript provides an interface for it. The following shows how to use this interface.

First, use the WebBrowser control to load the webpage you want to extract from.

Get JS:

strBasicHtm = WebBrowser1.Document.documentElement.outerHTML
WebBrowser1.Navigate "javascript: Str = '<HTML>

Result chart:

After the code above runs, the WebBrowser control displays all JS paths on the webpage. Each path is shown exactly as it is written in the webpage source code, without any modification. That is to say, if the source code is written with an absolute path, such as http:/example.

Finally, use a loop to read the JS paths displayed in the WebBrowser control (in effect, reading the text of each hyperlink).
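For comparison, the same collection step can be sketched outside the WebBrowser control. The Python below is an illustration under assumptions, not the article's code: it parses saved HTML with the standard library, collects each script src exactly as written, and resolves it against the page address with urljoin. The names ScriptCollector and script_urls and the sample URLs are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ScriptCollector(HTMLParser):
    """Collect the src attribute of every external <script> tag."""

    def __init__(self):
        super().__init__()
        self.paths = []  # paths exactly as written in the page source

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:  # inline scripts have no src and are skipped
                self.paths.append(src)


def script_urls(html, page_url):
    """Return the downloadable URL of each external script on the page."""
    collector = ScriptCollector()
    collector.feed(html)
    # urljoin leaves absolute URLs alone and resolves relative ones.
    return [urljoin(page_url, path) for path in collector.paths]


html = """<html><head>
<script src="http://cdn.example.com/lib.js"></script>
<script src="js/app.js"></script>
<script>var inline = 1;</script>
</head></html>"""
for url in script_urls(html, "http://example.com/news/index.htm"):
    print(url)
```

Keeping the path as written separate from the resolved URL matters later: the resolved URL is what gets downloaded, while the written path is what has to be found in the page source.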

Note that the strBasicHtm variable in the VB code above stores the page HTML from before the JavaScript statement is executed. Comparing it with the HTML in the WebBrowser control afterwards shows whether execution has finished, which makes the program more robust.
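The before/after comparison amounts to polling until a value changes. A minimal Python sketch of that pattern follows. The function wait_for_change, its timeout, and the read callback are hypothetical names for this example; a VB program would read the control's HTML where the callback is used.

```python
import time


def wait_for_change(read, before, timeout=5.0, interval=0.05):
    """Poll read() until it returns something other than `before`."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = read()
        if current != before:
            return current  # the content changed, so the script has run
        time.sleep(interval)
    raise TimeoutError("content did not change within the timeout")


# Simulated control: the third read returns the new content.
snapshots = iter(["<html>old</html>", "<html>old</html>", "<html>links</html>"])
print(wait_for_change(lambda: next(snapshots), "<html>old</html>"))
```

The timeout is an addition of this sketch; without one, a page whose script never runs would leave the loop waiting forever.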

Get CSS:

The process of getting the CSS is exactly the same as that of getting the JS.

 
Webbrowser1.navigate "javascript: Str = '<HTML>  

Changed:

Webbrowser1.navigate "javascript: Str = '<HTML>  
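Stylesheet paths can be collected from saved HTML in much the same way as script paths. The sketch below is a Python illustration under assumptions, not the author's statement: it keeps only link tags whose rel attribute includes "stylesheet". The names StylesheetCollector and stylesheet_paths and the sample markup are invented for the example.

```python
from html.parser import HTMLParser


class StylesheetCollector(HTMLParser):
    """Collect the href of every <link> tag that refers to a stylesheet."""

    def __init__(self):
        super().__init__()
        self.paths = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        # rel may hold several space-separated keywords in any letter case.
        rel = (attrs.get("rel") or "").lower().split()
        href = attrs.get("href")
        if "stylesheet" in rel and href:
            self.paths.append(href)


def stylesheet_paths(html):
    collector = StylesheetCollector()
    collector.feed(html)
    return collector.paths


sample = (
    '<link rel="stylesheet" href="css/main.css">'
    '<link rel="icon" href="favicon.ico">'
    '<link REL="Alternate StyleSheet" href="/print.css" />'
)
print(stylesheet_paths(sample))
```

This sketch ignores stylesheets pulled in through @import rules and styles written inline in the page.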

A final note:

JavaScript does not directly provide an interface for obtaining the content of a JS file. Therefore, you must first modify the registry: run regedit and locate HKEY_CLASSES_ROOT\.js. Add two string values under it:

Content Type = application/x-javascript

PerceivedType = text

If you are uneasy about the modification, refer to the default settings of HKEY_CLASSES_ROOT\.css, which differ only in the Content Type value. The registry modification is a one-time task; you do not need to change the registry again afterwards.

The same modification made in code:

Dim lhwy
Set lhwy = CreateObject("WScript.Shell")
lhwy.RegWrite "HKEY_CLASSES_ROOT\.js\Content Type", "application/x-javascript"
lhwy.RegWrite "HKEY_CLASSES_ROOT\.js\PerceivedType", "text"

3. Determining the webpage encoding. First save the page in GB encoding, and then read it back. If what is read back differs from what was saved, the page is UTF-8 encoded, so save it again in UTF-8 encoding.

Save in GB encoding:

Open App.Path & "\XX.htm" For Output As #1
Print #1, strHtm
Close #1

Save with UTF-8 encoding:

'Reference "Microsoft ActiveX Data Objects 2.8 Library"
Dim objStream As New ADODB.Stream
Dim str As String
With objStream
    .Type = 2 'adTypeText
    .Mode = 3 'adModeReadWrite
    .Open
    .Charset = "UTF-8"
    .WriteText strHtm, adWriteLine
    .SaveToFile App.Path & "\XX.htm", adSaveCreateOverWrite
    .Close
End With
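The encoding check in step 3 can also be sketched in memory. The Python below is an illustration only, and it assumes that the GB save loses any character GB cannot represent. It encodes the text as GBK, decodes it again, and falls back to UTF-8 when the round trip changes the text. The function names choose_encoding and save_page and the sample strings are hypothetical.

```python
import os
import tempfile


def choose_encoding(text):
    """Return "gbk" if the text survives a GBK round trip, else "utf-8"."""
    # Characters GBK cannot represent become "?" when encoded,
    # so the decoded text no longer matches the original.
    round_tripped = text.encode("gbk", errors="replace").decode("gbk")
    return "gbk" if round_tripped == text else "utf-8"


def save_page(path, text):
    """Write the page in the chosen encoding and report which one was used."""
    encoding = choose_encoding(text)
    with open(path, "w", encoding=encoding, newline="") as handle:
        handle.write(text)
    return encoding


folder = tempfile.mkdtemp()
print(save_page(os.path.join(folder, "gb.htm"), "<p>中文网页</p>"))
print(save_page(os.path.join(folder, "utf8.htm"), "<p>中文 😀</p>"))
```

Doing the comparison in memory avoids writing the file twice, but the decision rule is the same as in the article: a lossy GB save means the page needs UTF-8.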

Program source code
