I. Main features of wget:
Supports resumable downloads.
Supports both FTP and HTTP downloads.
Supports proxy servers.
Easy to configure.
Small program, completely free.
II. Usage
Basic syntax:
wget [options] URL
Example:
1. Download a site's homepage (to the current directory, as ./index.html). The -x option forces wget to recreate the server's directory structure locally (./www.baidu.com/index.html).
wget http://www.baidu.com
2. Recursively download all directories and files on the server. Use this command with caution, because wget will download everything the fetched pages link to (it can even end up in an endless loop)! You can limit the recursion depth with the -l number option. For example, to download only the pages one level below the given directory:
wget -r -l 1 http://www.kernel.org/doc/man-pages/online/pages/man3/
3. Create a mirror of a site (HTML pages, images, Flash, and other elements are all crawled) with the -m option.
wget -m http://www.guimp.com/
wget then automatically chooses the options appropriate for mirroring. It also fetches robots.txt from the server and obeys its rules.
4. Resumable downloads.
When a file is very large or the network is very slow, the connection is often cut before the download finishes; in that case you need to resume it. Resuming with wget is automatic: just add the -c option, for example:
wget -c "http://www.ubuntu.com/start-download?distro=desktop&bits=32&release=latest"
Resuming requires server support. The -t option sets the number of retries; for example, -t 100 retries 100 times, and -t 0 retries indefinitely until the connection succeeds. The -T option sets the timeout; for example, -T 120 gives up on a connection after 120 seconds.
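Putting these options together, a resumable download with 100 retries and a 120-second timeout could be invoked as sketched below; the URL is a hypothetical placeholder, and the actual network call is left commented out so the snippet runs offline:

```shell
# Resume a partial download (-c), retry up to 100 times (-t 100),
# and time out a stalled connection after 120 seconds (-T 120).
# The URL is a hypothetical placeholder.
URL="http://example.com/big-file.iso"
CMD="wget -c -t 100 -T 120 $URL"
# Uncomment to perform the real download:
# $CMD
echo "$CMD"
```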
5. Batch downloads.
To download multiple files, write each file's URL on its own line in a file, e.g. url.txt, then run:
wget -i url.txt
This downloads every URL listed in url.txt. (If an entry points to a file, that file is downloaded; if it points to a website, the homepage is downloaded.)
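As a sketch, the list file can be generated from the shell; the addresses below are placeholders, and the wget call itself is commented out so the snippet runs without network access:

```shell
# Write one URL per line into url.txt (hypothetical addresses).
cat > url.txt <<'EOF'
http://example.com/a.tar.gz
http://example.com/b.tar.gz
http://example.org/
EOF
# Then hand the list to wget:
# wget -i url.txt
grep -c '^http' url.txt   # prints 3
```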
6. Selective download.
You can tell wget to download only certain file types, or to skip certain file types. For example:
wget -m --reject=gif http://www.baidu.com
This skips GIF files. --accept=list specifies the file types to accept; --reject=list specifies the file types to reject.
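The accept and reject lists can also be combined in one command. The sketch below only assembles the command line (hypothetical URL) rather than running it, so it works without a network connection:

```shell
# Recursive download that keeps .pdf files and skips .gif files
# (hypothetical URL; the real call is commented out).
URL="http://example.com/docs/"
CMD="wget -r -l 1 --accept=pdf --reject=gif $URL"
# $CMD
echo "$CMD"
```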
7. Passwords and authentication.
wget can handle sites restricted by username/password using two options:
--http-user=user: set the HTTP user
--http-passwd=pass: set the HTTP password
For sites that require certificate authentication, you have to use another download tool, such as curl.
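A password-protected download might then look like this sketch; the host, user name, and password are all hypothetical, and the real call is commented out:

```shell
# Fetch a file from a site guarded by HTTP user/password authentication
# (hypothetical host and credentials).
CMD="wget --http-user=alice --http-passwd=secret http://example.com/private/report.pdf"
# $CMD
echo "$CMD"
```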
8. Downloading through a proxy server.
If your network traffic must go through a proxy server, you can have wget download files through it. Create a .wgetrc file in the current user's home directory and set the proxy servers there:
http_proxy = 111.111.111.111:8080
ftp_proxy = 111.111.111.111:8080
If the proxy server requires a password, use:
--proxy-user=user: set the proxy user
--proxy-passwd=pass: set the proxy password
Use the --proxy=on/off option to enable or disable the proxy.
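A minimal ~/.wgetrc combining these settings might look like the fragment below; the address and port are placeholders, and wgetrc variable names are matched case-insensitively:

```
# ~/.wgetrc -- hypothetical proxy configuration
http_proxy = 111.111.111.111:8080
ftp_proxy = 111.111.111.111:8080
use_proxy = on
```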
III. Selected options (see man wget for the full list)
-V, --version: print the version number and exit
-h, --help: print help information
-e, --execute=command: execute a .wgetrc-style command
-o, --output-file=file: log messages to file
-a, --append-output=file: append messages to file
-d, --debug: print debug output
-q, --quiet: print no output
-i, --input-file=file: read URLs from file
-t, --tries=number: number of retries (0 means infinite)
-O, --output-document=file: save the download under a different file name
-nc, --no-clobber: do not overwrite existing files
-N, --timestamping: only download files newer than the local copies
-T, --timeout=seconds: set the timeout
-Y, --proxy=on/off: enable or disable the proxy
-nd, --no-directories: do not create directories
-x, --force-directories: force directory creation
--http-user=user: set the HTTP user
--http-passwd=pass: set the HTTP password
--proxy-user=user: set the proxy user
--proxy-passwd=pass: set the proxy password
-r, --recursive: download recursively (use with caution)
-l, --level=number: maximum recursion depth
-A, --accept=list: file types to accept
-R, --reject=list: file types to reject
-D, --domains=list: domains to accept
--exclude-domains=list: domains to reject
-L, --relative: follow relative links only
--follow-ftp: follow FTP links from HTML documents
-H, --span-hosts: allow recursion onto foreign hosts
-I, --include-directories=list: directories to include
-X, --exclude-directories=list: directories to exclude