wget Introduction:
On Linux, wget is a command-line tool for downloading files. It is an essential tool for Linux users: we often need to download software or restore a backup from a remote server to a local machine. wget supports the HTTP, HTTPS, and FTP protocols and can work through HTTP proxies. It can also download unattended: you can log in to a system, start a wget download task, and then log out, and wget will keep running in the background until the task completes.
One. Command format:
wget [options] [URL]
Two. Features:
(1) Supports resuming interrupted downloads;
(2) Supports both FTP and HTTP downloads;
(3) Supports proxy servers. Networks with strict security requirements generally do not expose their systems directly to the Internet, so proxy support is essential for downloading software there;
(4) Easy to configure;
(5) Small footprint and completely free.
Three. Common syntax:
1. Downloading an entire HTTP or FTP site.
wget http://place.your.url/here
This command downloads the home page at http://place.your.url/here. The -x option forces wget to recreate the server's directory structure locally; with the -nd option, everything downloaded is placed directly in the local current directory.
wget -r http://place.your.url/here
This command recursively downloads all directories and files on the server, essentially downloading the entire site. It must be used with caution: every address the downloaded site points to is also downloaded, so if the site references other sites, those sites will be downloaded as well! For this reason, the option is rarely used on its own. You can use -l <depth> to limit the recursion depth; for example, to go only two levels deep, use -l 2.
If you want to create a mirror of a site, use the -m option, for example: wget -m http://place.your.url/here
wget will then choose the appropriate options to mirror the site: it logs on to the server, reads robots.txt, and follows its rules.
2. Resuming an interrupted download.
When a file is very large or the network is very slow, the connection is often cut off before the download finishes; at that point the download needs to be resumed. Resuming with wget is automatic and only requires the -c option, for example:
wget -c http://the.url.of/incomplete/file
Resuming also requires the server to support it. The -t option sets the number of retries: to retry 100 times, write -t 100; setting -t 0 means retry indefinitely until the connection succeeds. The -T option sets the timeout; for example, -T 120 means give up if no connection is made within 120 seconds.
3. Batch downloads.
If you have several files to download, put their URLs in a file, one per line; for example, create download.txt and then run: wget -i download.txt
This downloads each URL listed in download.txt. (If a line points to a file, that file is downloaded; if it points to a site, the site's home page is downloaded.)
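For instance, a download list can be generated straight from the shell; the URLs below are placeholders for illustration:

```shell
# Build a download list, one URL per line (placeholder URLs for illustration)
printf '%s\n' \
  "http://example.com/file1.zip" \
  "http://example.com/file2.zip" \
  "http://example.com/file3.zip" > download.txt

# Then hand the list to wget (this step needs network access):
# wget -i download.txt
```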
4. Selective downloads.
You can tell wget to download only certain file types, or to skip certain types. For example:
wget -m --reject=gif http://target.web.site/subdirectory
downloads http://target.web.site/subdirectory but skips GIF files. --accept=LIST specifies the file types to accept, and --reject=LIST specifies the file types to reject.
5. Passwords and authentication.
wget can handle websites that restrict access with a username and password, using two options:
--http-user=USER    sets the HTTP user
--http-passwd=PASS  sets the HTTP password
For sites that require certificate-based authentication, you have to use another download tool, such as curl.
6. Downloading through a proxy server.
If your network traffic must go through a proxy server, wget can download through it. Create a .wgetrc file in the current user's home directory and set the proxy servers in it:
http_proxy = 111.111.111.111:8080
ftp_proxy = 111.111.111.111:8080
These specify the proxy servers for HTTP and FTP, respectively. If the proxy server requires a password, use these two options:
--proxy-user=USER    sets the proxy user
--proxy-passwd=PASS  sets the proxy password
Use --proxy=on/off to enable or disable the proxy.
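The proxy settings above can be written out from the shell. The sketch below creates the file as ./wgetrc-example rather than ~/.wgetrc so it does not touch a real configuration; the proxy address and credentials are placeholders:

```shell
# Write an example wgetrc (placeholder proxy address and credentials).
# In real use this content would go in ~/.wgetrc.
cat > wgetrc-example <<'EOF'
use_proxy = on
http_proxy = 111.111.111.111:8080
ftp_proxy = 111.111.111.111:8080
proxy_user = user
proxy_passwd = pass
EOF

# Point wget at it explicitly via the WGETRC variable (needs network access):
# WGETRC=./wgetrc-example wget http://www.minjieren.com/wordpress-3.1-zh_CN.zip
```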
Four. Examples:
Example 1: Downloading a single file using wget
Command: wget http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Example 2: Using wget -O to download and save under a different name
Command: wget -O wordpress.zip http://www.minjieren.com/download.aspx?id=1080
Wrong: by default, wget names the downloaded file after the last part of the URL. For a dynamic link this gives a meaningless name; the following command saves the file as download.aspx?id=1080:
wget http://www.minjieren.com/download.aspx?id=1080
Even though the downloaded file is a ZIP archive, it still ends up named download.aspx?id=1080.
Correct: to solve this problem, use the -O option (capital O) to specify a file name:
wget -O wordpress.zip http://www.minjieren.com/download.aspx?id=1080
Example 3: Using wget --limit-rate to limit the download speed
Command: wget --limit-rate=300k http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Description: By default, wget uses all the bandwidth it can get. When you are about to download a large file but still need the connection for other downloads, it is worth limiting the speed.
Example 4: Using wget -c to resume a download
Command: wget -c http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Description: wget -c restarts the download of an interrupted file. This is very helpful when a large download is cut off by a network interruption: we can resume it instead of downloading the whole file again. Use the -c option whenever you need to continue an interrupted download.
Example 5: Using wget -b to download in the background
Command: wget -b http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Continuing in background, pid 1840.
Output will be written to 'wget-log'.
Description: For very large downloads, use -b to run wget in the background. To check on the progress, run: tail -f wget-log
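Since a real transfer needs network access, the sketch below fabricates a few wget-log lines (illustrative content only, not real wget output for this site) to show how a finished download can be detected from the log:

```shell
# Fabricated wget-log content for illustration; a real log is written by wget -b.
cat > wget-log <<'EOF'
Continuing in background, pid 1840.
HTTP request sent, awaiting response... 200 OK
2024-01-01 10:00:00 (1.2 MB/s) - 'wordpress-3.1-zh_CN.zip' saved [4735344/4735344]
EOF

# Watch progress while the transfer runs:  tail -f wget-log
# Check whether the download has finished by looking for the "saved" line:
if grep -q "saved" wget-log; then
  echo "download complete"
fi
```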
Example 6: Masquerading the user agent for a download
Command: wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Description: Some websites reject download requests whose user-agent string does not look like a browser. You can disguise wget with the --user-agent option.
Example 7: Using wget --spider to test a download link
Command: wget --spider URL
Description: Before scheduling an unattended download, test that the link is still valid. Adding the --spider option performs the check without downloading anything.
If the download link is correct, it will show:
wget --spider URL
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.
This ensures the download will work at the scheduled time. When you give a wrong link, the following error is shown instead:
wget --spider URL
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!
You can use the --spider option in the following situations:
checking a link before a scheduled download
periodically testing whether a site is available
checking a site's pages for dead links
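The checks above can be wrapped in a small shell function. wget --spider exits with status 0 when the remote file exists and non-zero otherwise, so the function only inspects the exit status (the URLs you feed it are up to you):

```shell
# Report whether each URL is reachable, using wget's exit status.
# wget --spider -q exits 0 if the remote file exists, non-zero otherwise.
check_link() {
  if wget --spider -q "$1"; then
    echo "OK: $1"
  else
    echo "BROKEN: $1"
  fi
}

# Example (needs network access):
# check_link http://www.minjieren.com/wordpress-3.1-zh_CN.zip
```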
Example 8: Using wget --tries to increase the number of retries
Command: wget --tries=40 URL
Description: A download can still fail if the network has problems or the file is very large. By default, wget retries a failed download up to 20 times. If necessary, use --tries to increase that number.
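wget performs these retries internally, but the behaviour can be sketched as a plain shell loop; try_download below is a stand-in for one download attempt, not a real wget call:

```shell
# A retry loop mimicking what wget --tries=N does internally.
# try_download is a placeholder for one download attempt; here it
# "succeeds" on the 3rd attempt so the loop terminates.
attempts=0
try_download() {
  attempts=$((attempts + 1))
  [ "$attempts" -ge 3 ]
}

tries=0
max_tries=40
until try_download; do
  tries=$((tries + 1))
  if [ "$tries" -ge "$max_tries" ]; then
    echo "giving up after $max_tries tries"
    break
  fi
done
echo "finished after $attempts attempts"
```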
Example 9: Using wget -i to download multiple files
Command: wget -i filelist.txt
Description: First, save the download links to a file:
cat > filelist.txt
url1
url2
url3
url4
(press Ctrl-D to finish)
Then download using this file with the -i option.
Example 10: Using wget --mirror to mirror a website
Command: wget --mirror -p --convert-links -P ./LOCAL URL
Description: Downloads the entire website to the local machine.
--mirror: turn on mirroring
-p: download all the files needed for the HTML pages to display properly
--convert-links: after downloading, convert the links in the pages to point to the local files
-P ./LOCAL: save all files and directories under the specified local directory
Example 11: Using wget --reject to exclude a file format from a download
Command: wget --reject=gif URL
Description: If you want to download a website but skip its images, use the command above.
Example 12: Using wget -o to save download messages to a log file
Command: wget -o download.log URL
Description: Use this when you do not want the download messages shown directly in the terminal but written to a log file instead.
Example 13: Using wget -Q to limit the total download size
Command: wget -Q5m -i filelist.txt
Description: Use this when you want wget to quit once the downloaded files exceed 5 MB. Note: this option has no effect on a single-file download; it only applies to recursive downloads or downloads from a URL list.
Example 14: Using wget -r -A to download files of a specified format
Command: wget -r -A.pdf URL
Description: You can use this feature in the following situations:
downloading all the images on a website
downloading all the videos on a website
downloading all the PDF files on a website
Example 15: Using wget for FTP downloads
Command: wget FTP-URL
wget --ftp-user=USERNAME --ftp-password=PASSWORD URL
Description: wget can also download from FTP links.
Anonymous FTP download with wget:
wget FTP-URL
FTP download with username and password authentication:
wget --ftp-user=USERNAME --ftp-password=PASSWORD URL
Remarks: Compiling and installing
Compile and install wget with the following commands:
# tar zxvf wget-1.9.1.tar.gz
# cd wget-1.9.1
# ./configure
# make
# make install