A Simple Method for Downloading Webpages

Source: Internet
Author: User

 

The following command can be used: wget -r -p -k -np http://hi.baidu.com/phps

-r enables recursive download: wget follows the links on each page and downloads what they point to.
However, do not use this option on its own, because a recursive crawl follows links well beyond the pages you actually want and can grow very large.
Therefore the -np (--no-parent) option is added, which stops wget from ascending above the starting directory. (Wget stays on the starting host during recursion unless -H is given.)
-k converts the links in the downloaded pages to local links.
-p fetches the elements required to display the page, such as images.
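
The short options above also have long names, which can make a saved script easier to read. The following is a minimal sketch that only assembles and prints the command (a dry run) instead of contacting the site; the variable names url and cmd are illustrative, and the URL is the one from the example above.

```shell
#!/bin/sh
# Dry run: assemble the mirroring command with long option names and print it.
# url and cmd are illustrative names, not part of wget.
url="http://hi.baidu.com/phps"

# --recursive        same as -r   follow links
# --page-requisites  same as -p   fetch images, CSS, etc. needed to display pages
# --convert-links    same as -k   rewrite links to point at the local copies
# --no-parent        same as -np  never ascend above the starting directory
cmd="wget --recursive --page-requisites --convert-links --no-parent $url"

# Print the command; paste it into a terminal to actually run it.
echo "$cmd"
```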

 

Other parameters can be used:

 

-c resumes a partially completed download.

-t 100 sets 100 retries, and -t 0 means infinite retries.

In addition, you can write the URLs to be downloaded to a file, one URL per line, and run the command wget -i download.txt.
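
As a rough sketch of the URL-list approach, the script below writes a list file with one URL per line, counts the entries, and prints the wget command it would run. The file is created with mktemp rather than being named download.txt, and list_file and url_count are illustrative names; the URLs are simply the ones used in this article.

```shell
#!/bin/sh
# Write a URL list, one URL per line, to a temporary file.
# list_file and url_count are illustrative names.
list_file="$(mktemp)"
cat > "$list_file" <<'EOF'
http://hi.baidu.com/phps
http://www.w3schools.com/html/default.asp
EOF

# Count the non-empty lines, i.e. the URLs that wget would read.
url_count=$(grep -c . "$list_file")
echo "$url_count URLs queued in $list_file"

# Dry run: print the command instead of executing it.
# -c resumes partial files, -t 0 retries indefinitely, -i reads URLs from the file.
echo "wget -c -t 0 -i $list_file"
```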

 

--reject=avi,rmvb means that files ending in avi or rmvb are not downloaded. --accept=jpg,jpeg means that only files ending in jpg or jpeg are downloaded.
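
To make the effect of a comma-separated suffix list concrete, here is a small illustration that checks file names against such a list. It is only an approximation written for this article, not wget's own code; the function name matches_suffix_list and the sample file names are made up.

```shell
#!/bin/sh
# Illustration only: approximate how a suffix list such as "jpg,jpeg"
# selects file names by their ending. This is not wget's implementation.
matches_suffix_list() {
    name=$1
    list=$2
    old_ifs=$IFS
    IFS=,                       # split the list on commas
    for suffix in $list; do
        case $name in
            *"$suffix") IFS=$old_ifs; return 0 ;;   # name ends with this suffix
        esac
    done
    IFS=$old_ifs
    return 1
}

# Hypothetical file names, checked against the accept list from the text.
for f in photo.jpg clip.avi scan.jpeg movie.rmvb; do
    if matches_suffix_list "$f" "jpg,jpeg"; then
        echo "accept: $f"
    else
        echo "skip:   $f"
    fi
done
```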

You can create a .wgetrc file (Windows does not seem to allow creating such a file directly, because it treats a name that starts with a dot as having no file name) containing the line http_proxy = 123.456.78.9:80, and then add the --proxy=on option. If the proxy requires a password, also add the options --proxy-user=username and --proxy-passwd=password (recent wget releases spell the latter --proxy-password).
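
A minimal sketch of such a settings file, written from a shell script into a scratch directory so that an existing ~/.wgetrc is left untouched. The proxy address proxy.example.com is a placeholder, and rc_dir and rc_file are illustrative names; http_proxy and use_proxy are the wgetrc settings for the proxy address and for switching proxy use on.

```shell
#!/bin/sh
# Sketch: write proxy settings to a wgetrc-style file in a scratch directory.
# The proxy address below is a placeholder; substitute your own host and port.
rc_dir="$(mktemp -d)"
rc_file="$rc_dir/.wgetrc"

cat > "$rc_file" <<'EOF'
# Proxy used for HTTP requests (placeholder address)
http_proxy = http://proxy.example.com:80/
# Turn proxy support on
use_proxy = on
EOF

# Show what was written.
cat "$rc_file"

# To try the file without replacing ~/.wgetrc, point wget at it via WGETRC:
#   WGETRC="$rc_file" wget http://hi.baidu.com/phps
```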

Many websites have become smarter. For example, http://www.w3schools.com/html/default.asp can no longer be downloaded with the command that many people commonly use to fetch an entire website:

wget -r -p -np -k -l inf

One important reason is the User-Agent that wget sends. For example, my wget 1.10.2 sends:

HTTP_USER_AGENT = Wget/1.10.2

 

The number after the "/" varies with the wget version.
Many websites filter out wget requests.

This problem is easy to deal with: just set a browser User-Agent with the -U option. For example, my default K-Meleon User-Agent:

 

Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.7.13) Gecko/20050610 K-Meleon/0.9

Or IE6 in XP:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

Or Opera:

Opera/7.54 (Windows NT 5.1; U) [en]

 

In this way, you can download:

wget -r -p -np -k -l inf -U "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" http://www.w3schools.com/html/default.asp

You can also adjust the options as follows:

wget -N -r -l inf --no-remove-listing -k -p -np -U "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" http://www.w3schools.com/html/default.asp

Or, abbreviated with -m (mirror):

wget -m -k -p -np -U "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" http://www.w3schools.com/html/default.asp
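
Because the User-Agent string contains spaces and parentheses, it has to reach wget as a single argument. The sketch below shows one way to keep it intact in a script: it stores the command in the positional parameters and prints one argument per line as a dry run. The variable names ua and url are illustrative; the values are the ones from the commands above.

```shell
#!/bin/sh
# Dry run: build the mirror command with a browser-style User-Agent.
ua="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
url="http://www.w3schools.com/html/default.asp"

# "set --" stores each argument separately, so the spaces inside $ua survive.
set -- wget -m -k -p -np -U "$ua" "$url"

# Print one argument per line to show how wget would receive them.
printf '%s\n' "$@"

# To run the download for real, replace the printf line with:  "$@"
```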

 
