Transferred from: http://www.cnblogs.com/peida/archive/2013/03/18/2965369.html
wget is a command-line tool for downloading files on Linux. It is an essential tool for Linux users, who often need to download software or restore a backup from a remote server to a local one. wget supports the HTTP, HTTPS and FTP protocols and can work through HTTP proxies. "Automatic download" means that wget can keep running in the background after the user has logged out: you can log in, start a wget download task, and log out again, and wget will run in the background until the task is finished. Most browsers, by contrast, need the user to stay involved while large amounts of data are downloaded, so this saves a lot of trouble.
wget can follow the links in HTML pages and download them in turn to create a local copy of the remote server, fully recreating the directory structure of the original site. This is often called "recursive downloading". When downloading recursively, wget obeys the Robot Exclusion Standard (/robots.txt). While downloading, wget can also convert links so that they point to the local files for offline browsing.
wget is very stable and copes well with narrow bandwidth and unstable networks. If a download fails because of a network problem, wget keeps retrying until the whole file has been downloaded. If the server interrupts the download, wget reconnects to the server and continues from where it stopped. This is useful for downloading large files from servers that limit connection time.
1. Command format:
wget [parameters] [URL address]
2. Command function:
Used to download resources from the network. If no directory is specified, the downloaded resources are saved to the current directory by default. wget is powerful yet relatively simple to use:
1) It supports resuming interrupted downloads. This used to be the biggest selling point of NetAnts and FlashGet; wget offers it too, so users with poor network connections can rest easy;
2) It supports both FTP and HTTP downloads. Most software can now be downloaded over HTTP, but in some cases FTP is still needed;
3) It supports proxy servers. Systems with strict security requirements are generally not exposed directly to the Internet, so proxy support is a must-have feature for a download tool;
4) It is easy to configure. Users accustomed to a GUI may not be comfortable with the command line, but the command line actually has advantages for configuration: at the very least it takes far fewer mouse clicks, and there is no need to worry about clicking the wrong thing;
5) It is small and completely free. The small size hardly matters now that hard disks are so large, but being completely free does matter: a lot of so-called free software comes with advertisements that nobody likes.
3. Command parameters:
Startup parameters:
-V, --version  Display the wget version and exit
-h, --help  Print syntax help
-b, --background  Go to the background after startup
-e, --execute=COMMAND  Execute a command in '.wgetrc' format; for the wgetrc format see /etc/wgetrc or ~/.wgetrc
Logging and input file parameters:
-o, --output-file=FILE  Write log messages to FILE
-a, --append-output=FILE  Append log messages to FILE
-d, --debug  Print debug output
-q, --quiet  Quiet mode (no output)
-v, --verbose  Verbose mode (this is the default)
-nv, --non-verbose  Turn off verbose mode without switching to quiet mode
-i, --input-file=FILE  Download the URLs found in FILE
-F, --force-html  Treat the input file as HTML
-B, --base=URL  Use URL as the prefix for relative links in the file given by -F -i
--sslcertfile=FILE  Optional client certificate
--sslcertkey=KEYFILE  Optional key file for the client certificate
--egd-file=FILE  File name of the EGD socket
Download parameters:
--bind-address=ADDRESS  Local address to use (host name or IP; useful when the local machine has several IPs or names)
-t, --tries=NUMBER  Set the maximum number of connection attempts (0 means unlimited)
-O, --output-document=FILE  Write documents to FILE
-nc, --no-clobber  Do not overwrite existing files or use .# suffixes
-c, --continue  Continue downloading a partially downloaded file
--progress=TYPE  Select the progress indicator type
-N, --timestamping  Do not download files again unless they are newer than the local copies
-S, --server-response  Print the server response
--spider  Do not download anything
-T, --timeout=SECONDS  Set the response timeout to SECONDS seconds
-w, --wait=SECONDS  Wait SECONDS seconds between retrievals
--waitretry=SECONDS  Wait 1...SECONDS seconds between retries
--random-wait  Wait 0...2*WAIT seconds between downloads
-Y, --proxy=on/off  Turn the proxy on or off
-Q, --quota=NUMBER  Set the download quota to NUMBER
--limit-rate=RATE  Limit the download rate to RATE
Directory Parameters:
-nd, --no-directories  Do not create directories
-x, --force-directories  Force the creation of directories
-nH, --no-host-directories  Do not create host directories
-P, --directory-prefix=PREFIX  Save files to PREFIX/...
--cut-dirs=NUMBER  Ignore NUMBER levels of remote directories
HTTP option parameters:
--http-user=USER  Set the HTTP user name to USER
--http-passwd=PASS  Set the HTTP password to PASS
-C, --cache=on/off  Allow/disallow server-side data caching (normally allowed)
-E, --html-extension  Save all text/html documents with a .html extension
--ignore-length  Ignore the 'Content-Length' header field
--header=STRING  Insert STRING among the headers
--proxy-user=USER  Set the proxy user name to USER
--proxy-passwd=PASS  Set the proxy password to PASS
--referer=URL  Include a 'Referer: URL' header in the HTTP request
-s, --save-headers  Save the HTTP headers to the file
-U, --user-agent=AGENT  Identify as AGENT instead of Wget/VERSION
--no-http-keep-alive  Disable HTTP keep-alive (persistent connections)
--cookies=off  Do not use cookies
--load-cookies=FILE  Load cookies from FILE before the session starts
--save-cookies=FILE  Save cookies to FILE after the session ends
FTP option parameters:
-nr, --dont-remove-listing  Do not remove '.listing' files
-g, --glob=on/off  Turn file name globbing on or off
--passive-ftp  Use passive transfer mode (the default)
--active-ftp  Use active transfer mode
--retr-symlinks  When recursing, retrieve the files that symbolic links point to (not directories)
Recursive download parameters:
-r, --recursive  Recursive download -- use with caution!
-l, --level=NUMBER  Maximum recursion depth (inf or 0 for infinite)
--delete-after  Delete files locally after downloading them
-k, --convert-links  Convert non-relative links to relative links
-K, --backup-converted  Before converting file X, back it up as X.orig
-m, --mirror  Equivalent to -r -N -l inf -nr
-p, --page-requisites  Download all the images needed to display the HTML page
Accept/reject parameters for recursive downloads:
-A, --accept=LIST  Comma-separated list of accepted extensions
-R, --reject=LIST  Comma-separated list of rejected extensions
-D, --domains=LIST  Comma-separated list of accepted domains
--exclude-domains=LIST  Comma-separated list of rejected domains
--follow-ftp  Follow FTP links in HTML documents
--follow-tags=LIST  Comma-separated list of HTML tags to follow
-G, --ignore-tags=LIST  Comma-separated list of HTML tags to ignore
-H, --span-hosts  Go to external hosts when recursing
-L, --relative  Follow relative links only
-I, --include-directories=LIST  List of allowed directories
-X, --exclude-directories=LIST  List of excluded directories
-np, --no-parent  Do not ascend to the parent directory
wget -S --spider URL  Do not download anything; only display the process
4. Usage examples:
Example 1: Downloading a single file using wget
Command:
wget http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Description
This example downloads a file from the network and saves it in the current directory. During the download a progress bar is displayed, showing the percentage completed, the bytes downloaded so far, the current download speed, and the remaining download time.
Example 2: Use wget -O to download and save under a different file name
Command:
wget -O wordpress.zip http://www.minjieren.com/download.aspx?id=1080
Description
By default wget names the saved file after whatever follows the last "/" in the URL, so the file name for dynamically generated download links is usually wrong.
Wrong: the following example downloads a file and saves it under the name download.aspx?id=1080:
wget http://www.minjieren.com/download.aspx?id=1080
Even if the downloaded file is in zip format, it is still named download.aspx?id=1080.
Correct: to solve this problem, use the -O option to specify a file name:
wget -O wordpress.zip http://www.minjieren.com/download.aspx?id=1080
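The naming rule described in this example can be imitated with shell parameter expansion. This is only a sketch of the rule as stated here (keep everything after the last "/"); default_name is a hypothetical helper and not part of wget, and the real program may treat some URLs differently.

```shell
# Hypothetical helper (not part of wget): imitate the default naming rule
# by keeping everything after the last "/" in the URL.
default_name() {
    printf '%s\n' "${1##*/}"
}

default_name 'http://www.minjieren.com/wordpress-3.1-zh_CN.zip'
default_name 'http://www.minjieren.com/download.aspx?id=1080'
```

The first call prints wordpress-3.1-zh_CN.zip and the second prints download.aspx?id=1080, which is why the dynamic link needs -O to get a sensible name.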
Example 3: Use wget --limit-rate to limit the download speed
Command:
wget --limit-rate=300k http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Description
By default, wget uses all the available bandwidth for a download. When you are about to download a large file and still need to download other files at the same time, it is necessary to limit the speed.
Example 4: Use wget -c to resume an interrupted download
Command:
wget -c http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Description
Restarting an interrupted download with wget -c is very helpful when the download of a large file is cut off, for example by a network failure: we can continue from where it stopped instead of downloading the whole file again. Use the -c option whenever you need to resume an interrupted download.
Example 5: Use wget -b to download in the background
Command:
wget -b http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Description
For very large files, we can use the -b option to download in the background.
wget -b http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Continuing in background, pid 1840.
Output will be written to 'wget-log'.
You can use the following command to check the download progress:
tail -f wget-log
Example 6: Download with a disguised user agent name
Command:
wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Description
Some websites reject a download request when the user agent name shows that the client is not a browser. You can disguise the name with the --user-agent option.
Example 7: Use wget --spider to test a download link
Command:
wget --spider URL
Description
When you plan a scheduled download, you should test beforehand whether the download link will be valid at the scheduled time. We can add the --spider option to check.
wget --spider URL
If the download link is correct, it will show:
wget --spider URL
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.
This ensures that the download can take place at the scheduled time. If you give a wrong link, however, the following error is displayed:
wget --spider URL
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!
You can use the --spider option in the following situations:
Checking a link before a scheduled download
Checking at intervals whether a website is available
Checking the pages of a site for dead links
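For scripted checks such as these, the exit status of wget is easier to test than its messages: the status is 0 when the check succeeds and non-zero otherwise. A minimal sketch, assuming a hypothetical helper named check_link (the name and the OK/BROKEN labels are illustrative, not part of wget):

```shell
# Hypothetical helper: report whether a link is reachable, based on the
# exit status of "wget --spider" (0 means the remote file exists).
check_link() {
    if wget --spider -q "$1"; then
        echo "OK: $1"
    else
        echo "BROKEN: $1"
    fi
}

# Usage (needs network access):
# check_link URL
```

The -q option only silences the normal output; the decision rests entirely on the exit status, so the helper can be called from a scheduled job before the real download starts.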
Example 8: Use wget --tries to increase the number of retries
Command:
wget --tries=40 URL
Description
A download may also fail if the network has problems or the file is very large. By default wget retries the download 20 times. If necessary, you can use --tries to increase the number of retries.
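Retries combine naturally with resuming, so that each new attempt continues from the partial file. A minimal sketch, assuming a hypothetical wrapper named fetch_with_retries and an arbitrary 10-second upper bound for --waitretry; neither the wrapper nor the value comes from the article:

```shell
# Hypothetical wrapper: resume partial files (-c), retry up to $2 times
# (falling back to 20, the default mentioned above) and wait up to
# 10 seconds between retries (an arbitrary value chosen for this sketch).
fetch_with_retries() {
    wget -c --tries="${2:-20}" --waitretry=10 "$1"
}

# Usage (needs network access):
# fetch_with_retries URL 40
```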
Example 9: Use wget -i to download multiple files
Command:
wget -i filelist.txt
Description
First, save the download links to a file:
cat > filelist.txt
url1
url2
url3
url4
Then download using this file with the -i option.
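When the list is created from a script rather than typed interactively, a here-document does the same job as cat with keyboard input. A minimal sketch; the example.com URLs are placeholders invented for illustration and should be replaced with real links:

```shell
# Write the download links to filelist.txt, one URL per line.
# The example.com URLs are placeholders, not links from the article.
cat > filelist.txt <<'EOF'
http://example.com/file1.zip
http://example.com/file2.zip
http://example.com/file3.zip
http://example.com/file4.zip
EOF

# Count the non-empty lines, i.e. the links that wget -i would process.
grep -c . filelist.txt

# Then start the download (needs network access):
# wget -i filelist.txt
```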
Example 10: Use wget --mirror to mirror a website
Command:
wget --mirror -p --convert-links -P ./LOCAL URL
Description
Download the entire website to the local machine.
--mirror: enable mirror downloading
-p: download all the files needed for the HTML pages to display properly
--convert-links: after the download, convert the links to local links
-P ./LOCAL: save all files and directories to the specified local directory
Example 11: Use wget --reject to exclude a file format from the download
Command:
wget --reject=gif URL
Description
If you want to download a website but do not want to download its images, you can use the command above.
Example 12: Use wget -o to save download information to a log file
Command:
wget -o download.log URL
Description
Use this when you do not want the download information displayed in the terminal but written to a log file instead.
Example 13: Use wget -Q to limit the total download size
Command:
wget -Q5m -i filelist.txt
Description
Use this when you want the download to stop once the downloaded files exceed 5 MB in total. Note: this option has no effect when downloading a single file; it only takes effect for recursive downloads or downloads from an input file.
Example 14: Use wget -r -A to download files of a specified format
Command:
wget -r -A.pdf URL
Description
You can use this feature in the following situations:
Download all pictures of a website
Download all videos of a website
Download all PDF files for a website
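Several formats can be accepted at once, because wget takes the list after -A as comma-separated values. A minimal sketch, assuming a hypothetical helper named download_by_ext that joins its extension arguments with commas before calling wget:

```shell
# Hypothetical helper: the first argument is the URL, the remaining
# arguments are extensions that get joined into one list for -A.
download_by_ext() {
    url=$1
    shift
    # Join the remaining arguments with commas, e.g. "jpg,png,gif".
    list=$(IFS=,; printf '%s' "$*")
    wget -r -A "$list" "$url"
}

# Usage (needs network access):
# download_by_ext URL jpg png gif
```

With the arguments shown in the usage comment, the helper runs wget -r -A jpg,png,gif URL.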
Example 15: Download over FTP with wget
Command:
wget ftp-url
wget --ftp-user=USERNAME --ftp-password=PASSWORD URL
Description
You can use wget to download from FTP links.
Anonymous FTP download with wget:
wget ftp-url
FTP download with user name and password authentication:
wget --ftp-user=USERNAME --ftp-password=PASSWORD URL
Remarks: Compiling and installing
Compile and install wget with the following commands:
# tar zxvf wget-1.9.1.tar.gz
# cd wget-1.9.1
# ./configure
# make
# make install