Wget command details: resumable data transfer

Source: Internet
Author: User

(1) Supports resumable downloads.
(2) Supports both FTP and HTTP downloads.
(3) Supports proxy servers.
(4) Easy to configure.
(5) A small program, completely free.


Although wget is powerful, it is relatively simple to use. The basic syntax is: wget [parameter list] URL. The following examples show how to use wget.
1. Download an entire HTTP or FTP site.

wget http://place.your.url/here
This command downloads the home page of http://place.your.url/here. Using -x forces wget to create the same directory structure locally as on the server. If -nd is used, everything downloaded from the server is placed directly in the current local directory.

wget -r http://place.your.url/here
This command downloads all directories and files on the server recursively; in effect, it downloads the entire website. Use it with caution, because wget also downloads every address that the downloaded pages link to, so a heavily linked site can produce a very large download. If -H is also given, links to other websites are followed too, and the referenced websites are downloaded as well! For this reason, -r is rarely used on its own. You can use -l NUMBER to limit the download depth. For example, to download only two levels, use -l 2.
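
As a rough sketch of the depth-limited recursive download described above, the call can be wrapped in a small shell function. The function name fetch_levels is hypothetical (it is not part of wget), and the URL is the placeholder used in the text.

```shell
#!/bin/sh
# Sketch: recursive download limited to a given depth.
# fetch_levels is a hypothetical helper name, not part of wget.
fetch_levels() {
    url=$1
    depth=${2:-2}   # default: follow links two levels deep
    # -r: recurse into linked pages; -l N: stop after N levels
    wget -r -l "$depth" "$url"
}

# Example call (placeholder URL from the text):
#   fetch_levels http://place.your.url/here 2
```

Keeping the depth as a parameter makes it easy to start with a shallow crawl and increase it only if needed.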

If you want to make a mirror site, use the -m parameter, for example: wget -m http://place.your.url/here
wget then automatically chooses the appropriate parameters for creating a mirror. It also connects to the server, reads robots.txt, and follows the rules it finds there.

2. Resumable downloads.
When a file is very large or the network is very slow, the connection is often cut off before the download finishes. In this case, the download needs to be resumed. wget resumes automatically; you only need to add the -c parameter, for example:
wget -c http://the.url.of/incomplete/file
Resuming requires the server to support it as well. The -t parameter sets the number of retries. For example, to retry 100 times, write -t 100. Setting -t 0 means retrying indefinitely until the connection succeeds. The -T parameter sets the timeout; for example, -T 120 means the connection is treated as timed out if there is no response within 120 seconds.
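
The three parameters above are often combined. The following is a minimal sketch under that assumption; resume_download is a hypothetical helper name, and the values 0 and 120 are simply the examples given in the text.

```shell
#!/bin/sh
# Sketch: resume a partial download, retrying until it succeeds.
# resume_download is a hypothetical helper name, not part of wget.
resume_download() {
    # -c     continue a partially downloaded file
    # -t 0   retry without limit
    # -T 120 treat 120 seconds without a response as a timeout
    wget -c -t 0 -T 120 "$1"
}

# Example call (placeholder URL from the text):
#   resume_download http://the.url.of/incomplete/file
```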

3. Batch download.
To download multiple files, create a file with the URL of each file on its own line, for example download.txt, and then run wget -i download.txt.
This downloads every URL listed in download.txt. (If an entry is a file, the file is downloaded. If an entry is a website, its home page is downloaded.)
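
A small sketch of the batch workflow: write the list file, then hand it to wget with -i. The location of the list file and the helper name batch_download are assumptions for illustration; the URLs are the placeholders used elsewhere in the text.

```shell
#!/bin/sh
# Sketch: build a URL list (one URL per line) and download it in one run.
# The path below is an assumption; any writable location works.
list_file=${TMPDIR:-/tmp}/download.txt

cat > "$list_file" <<'EOF'
http://place.your.url/here
http://the.url.of/incomplete/file
EOF

# batch_download is a hypothetical helper name, not part of wget.
batch_download() {
    # -i FILE: read the URLs to fetch from FILE
    wget -i "$1"
}

# Example call:
#   batch_download "$list_file"
```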

4. Selective download.
You can tell wget to download only certain types of files, or to skip certain types. For example:
wget -m --reject=gif http://target.web.site/subdirectory
This downloads http://target.web.site/subdirectory but skips GIF files. --accept=LIST gives the file types to accept; --reject=LIST gives the file types to reject.
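
To illustrate both directions of filtering, here is a hedged sketch with two hypothetical helpers: one mirrors a site while skipping GIF files, as in the example above, and the other recursively fetches only the listed suffixes. The suffix list jpg,png in the usage comment is an assumption, not taken from the text.

```shell
#!/bin/sh
# Sketch: filter a recursive download by file type.
# Both function names are hypothetical helpers, not part of wget.

# Mirror a site but skip GIF files.
mirror_without_gif() {
    wget -m --reject=gif "$1"
}

# Recursively download only files whose suffix is in the list.
# $1 = comma-separated suffix list, $2 = URL
fetch_only_types() {
    wget -r --accept="$1" "$2"
}

# Example calls (placeholder URL from the text):
#   mirror_without_gif http://target.web.site/subdirectory
#   fetch_only_types jpg,png http://target.web.site/subdirectory
```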

5. Password and authentication.
wget can only handle websites that are protected by a user name and password. Two parameters are available:
--http-user=USER: set the HTTP user
--http-passwd=PASS: set the HTTP password (newer wget releases spell this --http-password)
For websites that require certificate authentication, use another download tool, such as curl.
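
A minimal sketch of a password-protected download follows. It assumes a current wget release, where the password option is spelled --http-password; on very old releases the --http-passwd spelling from the text may be the one that is accepted. The helper name fetch_with_login and the credentials in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch: download from a site protected by HTTP user name/password.
# fetch_with_login is a hypothetical helper name, not part of wget.
# Assumption: a wget release that accepts --http-password.
fetch_with_login() {
    user=$1
    pass=$2
    url=$3
    # Note: a password given on the command line is visible to other
    # users of the machine in the process list.
    wget --http-user="$user" --http-password="$pass" "$url"
}

# Example call (made-up credentials, placeholder URL):
#   fetch_with_login alice secret http://place.your.url/here
```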

6. Download through a proxy server.
If your network goes through a proxy server, you can have wget download files through it. Create a .wgetrc file in the current user's home directory and set the proxy server there:
http_proxy = http://111.111.111.111:8080/
ftp_proxy = http://111.111.111.111:8080/
These lines set the HTTP proxy server and the FTP proxy server respectively. If the proxy server requires a password, use these two parameters:
--proxy-user=USER: set the proxy user
--proxy-passwd=PASS: set the proxy password
Use the --proxy=on/off parameter to enable or disable the proxy.
wget also has many other useful features that are left for the user to explore.
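
The proxy setup above can be sketched as a script that writes a configuration file and points wget at it. To avoid overwriting a real ~/.wgetrc, this sketch writes to a temporary path (an assumption) and selects it through the WGETRC environment variable; the proxy address is the example value from the text, and fetch_via_proxy is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: write a wgetrc file with proxy settings and use it for one download.
# The path below is an assumption chosen so that ~/.wgetrc is left untouched.
rc_file=${TMPDIR:-/tmp}/wgetrc.example

cat > "$rc_file" <<'EOF'
# Proxy used for HTTP and FTP URLs (example address from the text)
http_proxy = http://111.111.111.111:8080/
ftp_proxy = http://111.111.111.111:8080/
use_proxy = on
EOF

# fetch_via_proxy is a hypothetical helper name, not part of wget.
fetch_via_proxy() {
    # WGETRC tells wget which configuration file to load
    WGETRC=$rc_file wget "$1"
}

# Example call (placeholder URL from the text):
#   fetch_via_proxy http://place.your.url/here
```

Copying the same three settings into ~/.wgetrc makes them apply to every wget run by that user.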

Appendix:

Command format:
wget [parameter list] [target software or web page URL]

-V, --version: display the software version number and exit
-h, --help: display the software help information
-e, --execute=COMMAND: execute a ".wgetrc" command

-o, --output-file=FILE: save the software's output messages to FILE
-a, --append-output=FILE: append the software's output messages to FILE
-d, --debug: display debugging output
-q, --quiet: do not display output messages
-i, --input-file=FILE: read URLs from FILE

-t, --tries=NUMBER: number of download attempts (0 means unlimited)
-O, --output-document=FILE: save the downloaded file under a different file name
-nc, --no-clobber: do not overwrite existing files
-N, --timestamping: download only files that are newer than the local copies
-T, --timeout=SECONDS: set the timeout
-Y, --proxy=on/off: turn the proxy on or off

-nd, --no-directories: do not create directories
-x, --force-directories: force directory creation

--http-user=USER: set the HTTP user
--http-passwd=PASS: set the HTTP password
--proxy-user=USER: set the proxy user
--proxy-passwd=PASS: set the proxy password

-r, --recursive: download an entire website or directory tree (use with caution)
-l, --level=NUMBER: download depth

-A, --accept=LIST: accepted file types
-R, --reject=LIST: rejected file types
-D, --domains=LIST: accepted domain names
--exclude-domains=LIST: rejected domain names
-L, --relative: follow relative links only
--follow-ftp: follow FTP links found in HTML pages
-H, --span-hosts: allow downloading from other hosts
-I, --include-directories=LIST: allowed directories
-X, --exclude-directories=LIST: rejected directories

wget is a useful command for downloading online resources on Linux.

wget is used as follows:
wget [parameter list] URL
First, the main parameters of wget:
· -b: run wget in the background. The log is written to the "wget-log" file in the current directory;
· -t [number of times]: number of attempts, that is, how many times wget tries again when it cannot establish a connection with the server. For example, "-t 120" means 120 attempts. Setting it to "0" means trying indefinitely until the connection succeeds. This setting is very useful: when the remote server is suddenly shut down or the network is suddenly interrupted, wget can continue downloading the unfinished files once things return to normal;
· -c: resume an interrupted download, also a very useful setting. If the connection is accidentally interrupted while downloading a large file, the download resumes from where it stopped instead of starting from scratch. The remote server must also support resuming. Generally, Unix/Linux-based Web/FTP servers support it;
· -T [number of seconds]: timeout. It specifies how long wget waits without a response from the remote server before dropping the connection and starting the next attempt. For example, "-T 120" means that if the remote server sends no data for 120 seconds, wget tries again. If the network is fast, you can set a shorter time; if it is slow, set a longer one. The value is generally no more than 900 and usually no less than 60; around 120 is suitable in most cases;
· -w [number of seconds]: the number of seconds to wait between two retrievals. For example, "-w 100" means waiting 100 seconds between them;
· -Y on/off: connect through/without the proxy server;
· -Q [bytes]: limit the total size of the downloaded files. For example, "-Q2k" means no more than 2 KB and "-Q3m" means at most 3 MB. A number without a suffix is in bytes; for example, "-Q200" means at most 200 bytes. The quota applies to recursive and multi-URL downloads; it does not cut off a single file;
· -nd: do not recreate the directory structure. All files downloaded from the specified directories on the server are piled into the current directory;
· -x: the opposite of "-nd"; create the complete directory structure. For example, "wget -x http://www.gnu.org" creates a "www.gnu.org" subdirectory under the current directory and then builds the directories level by level, following the actual directory structure on the server, until all files are downloaded;
· -nH: do not create a directory named after the target host's domain name; store the target host's directory structure directly in the current directory;
· --http-user=username
· --http-passwd=password: if the Web server requires a user name and password, use these two parameters;
· --proxy-user=username
· --proxy-passwd=password: if the proxy server requires a user name and password, use these two options;
· -r: download recursively, recreating the server's directory structure on the local machine;
· -l [depth]: the depth to which the remote server's directory structure is downloaded. For example, "-l 5" downloads only directories and files at depth 5 or less;
· -m: the site mirroring option. If you want to mirror a site, use this option; it automatically sets the other options appropriate for mirroring;
· -np: download only the contents of the specified directory of the target site and its subdirectories. This is also a very useful option. Suppose someone's personal home page links to another person's home page on the same site, and we only want to download the first person's page. Without this option, wget might end up fetching the entire site, which is usually not what we want;
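
Several of the options in this list are typically used together when fetching one directory of a site. The following is a sketch under that assumption; fetch_subtree is a hypothetical helper name, and the depth of 5 is the example value used in the list.

```shell
#!/bin/sh
# Sketch: download one directory of a site without wandering elsewhere.
# fetch_subtree is a hypothetical helper name, not part of wget.
fetch_subtree() {
    # -r     recurse into linked pages
    # -l 5   go at most five levels deep
    # -np    never ascend to the parent directory
    # -nH    do not create a top-level directory named after the host
    wget -r -l 5 -np -nH "$1"
}

# Example call (placeholder URL from the text):
#   fetch_subtree http://place.your.url/here/
```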

How to set the proxy server used by wget
wget can read many settings from the user configuration file ".wgetrc". Here we mainly use this file to set the proxy server. The ".wgetrc" file in the home directory of the logged-in user is the one that takes effect. For example, if the "root" user wants to use ".wgetrc" to set the proxy server, the file is "/root/.wgetrc". The following shows the content of a ".wgetrc" file; you can refer to this example when writing your own ".wgetrc" file:
http_proxy = http://111.111.111.111:8080/
ftp_proxy = http://111.111.111.111:8080/
These two lines mean that the proxy server's IP address is 111.111.111.111 and its port number is 8080. The first line specifies the proxy server used for the HTTP protocol; the second line specifies the proxy server used for the FTP protocol.
