Wget usage 2

Wget is a frequently used command-line download tool, and most Linux distributions include it by default. If it is not installed, you can download the latest version from http://www.gnu.org/software/wget/wget.html and compile and install it with:

# tar zxvf wget-1.9.1.tar.gz
# cd wget-1.9.1
# ./configure
# make
# make install

Its usage is very simple.
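Once installed, you can confirm that the binary is available by asking for its version:

$ wget --version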

(1) Support for resuming interrupted downloads. This used to be the biggest selling point of NetAnts and FlashGet; now wget offers the same capability, so users with unreliable network connections can rest assured.
(2) Support for both FTP and HTTP downloads. Although most files can be fetched over HTTP, FTP downloads are still required in some cases.
(3) Support for proxy servers. Systems with high security requirements are generally not exposed directly to the Internet, so proxy support is a must-have feature in a download tool.
(4) Easy configuration. Users accustomed to graphical interfaces may not be comfortable with the command line, but the command line actually has an advantage in configuration: it saves many mouse clicks, and you need not worry about misclicks.
(5) Small and completely free. Small size hardly matters now that hard disks are so large, but being completely free is worth considering; even though there is plenty of so-called free software on the network, the advertisements bundled with it are not something we enjoy.

Although wget is powerful, it is relatively simple to use. The basic syntax is: wget [parameter list] URL. The following uses specific examples to illustrate how to use wget.
1. Download the entire HTTP or FTP site.
wget http://place.your.url/here
This command downloads the http://place.your.url/here home page. Adding -x forces wget to recreate the server's directory structure locally, while the -nd parameter makes wget put everything it downloads into one local directory without creating subdirectories.
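As a minimal sketch of the two behaviors, using the placeholder URL above:

wget -x http://place.your.url/here
wget -nd http://place.your.url/here

The first call stores the page under a local place.your.url/ directory tree; the second (most useful together with -r) drops every file straight into the current directory.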

wget -r http://place.your.url/here
This command downloads all directories and files on the server recursively; in essence, it downloads the entire website. It must be used with caution, because the download follows every address the retrieved pages point to, and with host spanning enabled (-H) it would even pull in other websites that this site references. For this reason, the bare parameter is not commonly used on its own; you can use the -l number parameter to limit the recursion depth. For example, to download only two levels, use -l 2.
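For example, a depth-limited recursive download of the same placeholder site:

wget -r -l 2 http://place.your.url/here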

If you want to mirror a site, you can use the -m parameter, for example: wget -m http://place.your.url/here
In this case, wget automatically chooses the parameters appropriate for mirroring a site. wget also reads the server's robots.txt and acts according to its rules.
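According to the wget documentation, -m (--mirror) simply turns on recursion and time-stamping with unlimited depth while keeping FTP directory listings, so it is roughly equivalent to:

wget -r -N -l inf -nr http://place.your.url/here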

2. Resuming interrupted downloads.
When a file is very large or the network is very slow, the connection is often cut off before the download completes; in that case you need to resume the transfer. Resuming is automatic in wget: you only need the -c parameter, for example:
wget -c http://the.url.of/incomplete/file
Resuming requires the server to support it. The -t parameter sets the number of retries; for example, to retry 100 times, write -t 100, and -t 0 means retrying indefinitely until the connection succeeds. The -T parameter sets the timeout; for example, -T 120 treats 120 seconds without a response as a failed connection.
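Putting these parameters together, a download of a large file over an unreliable link might look like:

wget -c -t 100 -T 120 http://the.url.of/incomplete/file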

3. Batch download.
To download multiple files, you can create a text file with the URL of each file on its own line, for example a file named download.txt, and then run the command wget -i download.txt.
This downloads every URL listed in download.txt. (If a line is a file, the file is downloaded; if a line is a website, the home page is downloaded.)
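For example, if download.txt contains the following two URLs (both invented here purely for illustration):

http://place.your.url/file1.tar.gz
ftp://ftp.your.url/pub/file2.iso

then wget -i download.txt fetches each of them in turn.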

4. Selective download.
You can tell wget to download only certain types of files, or to skip certain types. For example:
wget -m --reject=gif http://target.web.site/subdirectory
downloads http://target.web.site/subdirectory but skips GIF files. --accept=LIST specifies the accepted file types, and --reject=LIST specifies the rejected file types.
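Conversely, to keep only a few file types, list them with --accept (the pdf,ps list below is just an illustration):

wget -m --accept=pdf,ps http://target.web.site/subdirectory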

5. Password and authentication.
wget can handle websites restricted by a username/password using two parameters:
--http-user=USER sets the HTTP user
--http-passwd=PASS sets the HTTP password
For websites that require certificate-based authentication, you have to turn to other download tools, such as curl.
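A sketch of the two parameters in use, with USER and PASS standing in for real credentials:

wget --http-user=USER --http-passwd=PASS http://place.your.url/here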

6. Downloading through a proxy server.
If your network access goes through a proxy server, you can have wget download files through it. Create a .wgetrc file in the current user's home directory and set the proxy servers in it:

http_proxy = 111.111.111.111:8080
ftp_proxy = 111.111.111.111:8080

These lines specify the HTTP proxy server and the FTP proxy server respectively. If the proxy server requires a password, use these two parameters:
--proxy-user=USER sets the proxy user
--proxy-passwd=PASS sets the proxy password
The --proxy=on/off parameter turns use of the proxy on or off.
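Combining the .wgetrc settings above with these parameters, a proxied download could look like this (credentials are placeholders):

wget --proxy=on --proxy-user=USER --proxy-passwd=PASS http://place.your.url/here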
wget has many other useful features waiting for users to explore.

Example:
You can select the following parameters as needed:
$ wget -c -r -nd -np -k -L -p -A c,h www.xxx.org/pub/path/

-c resume a partially finished download
-r recursive download: download all files under the specified page's directory, including subdirectories
-nd do not recreate the directory hierarchy when downloading recursively; place all files in the current directory
-np do not ascend to parent directories during a recursive download; for example, with wget -c -r www.xxx.org/pub/path/,
omitting the -np parameter may pull in files from other directories under pub as well
-k convert absolute links into relative links; after downloading an entire site, it is best to add this parameter so the pages can be browsed offline
-L follow relative links only; for example, with wget -c -r www.xxx.org/, if a page carries an absolute link such as
www.yyy.org, -L keeps the recursion from following it (by default wget also refuses to cross to other hosts unless -H is given)
-p download all the files needed to display the page properly, such as images
-A specify the list of file types (suffixes) to download, with multiple entries separated by commas (,)
-i read the URLs to download from the file that follows

Appendix:

Command Format:
wget [parameter list] [target URL]

-V version information

-h help information

-b run wget in the background

-o filename write log messages to the file filename

-a filename append log messages to the file filename

-d display debugging information

-q quiet mode (no output)

-v verbose output (the default)

-nv non-verbose, simplified output

-i inputfile read the list of URLs from the text file inputfile

-F treat the input file (given with -i) as HTML

-t number number of download retries (0 means unlimited)

-O outputfile write the downloaded document to outputfile

-nc do not overwrite existing files

-c resume an interrupted download

-N timestamping: only download a file when the remote copy is newer than the local one; a local file with the same modification time and size is not downloaded again

-S display the server response

-T timeout timeout setting (in seconds)

-w time wait between retrievals (in seconds)

-Y proxy=on/off whether to enable the proxy

-Q quota download quota (limits the total amount of data retrieved)

Directory options:

-nd, --no-directories do not create directories.

-x, --force-directories force directory creation.

-nH, --no-host-directories do not create host directories.

-P, --directory-prefix=PREFIX save files to PREFIX/...

--cut-dirs=NUMBER ignore NUMBER remote directory components.

HTTP options:

--http-user=USER set the HTTP user to USER.

--http-passwd=PASS set the HTTP password to PASS.

-C, --cache=on/off allow/disallow server-cached data (allowed by default).

--ignore-length ignore the 'Content-Length' header field.

--proxy-user=USER set USER as the proxy username.

--proxy-passwd=PASS set PASS as the proxy password.

-s, --save-headers save the HTTP headers into the file.

-U, --user-agent=AGENT identify as AGENT instead of Wget/VERSION.

FTP options:

--retr-symlinks retrieve FTP symbolic links.

-g, --glob=on/off turn file name globbing on or off.

--passive-ftp use the "passive" transfer mode.

Recursive retrieval:

-r, --recursive recursive download -- use with care!

-l, --level=NUMBER maximum recursion depth (0 means unlimited).

--delete-after delete the downloaded files after retrieval.

-k, --convert-links convert absolute links into relative links.

-m, --mirror turn on the mirroring options.

-nr, --dont-remove-listing do not remove the '.listing' files.

Recursive accept/reject options:

-A, --accept=LIST list of accepted file extensions.

-R, --reject=LIST list of rejected file extensions.

-D, --domains=LIST list of accepted domains.

--exclude-domains=LIST list of rejected domains (comma-separated).

-L, --relative follow relative links only.

--follow-ftp follow FTP links found in HTML documents.

-H, --span-hosts allow the recursion to cross to foreign hosts.

-I, --include-directories=LIST list of allowed directories.

-X, --exclude-directories=LIST list of excluded directories.

-nh, --no-host-lookup do not look up hosts via DNS.

-np, --no-parent do not ascend to the parent directory.
