(1) Supports resuming interrupted downloads
(2) Supports both FTP and HTTP downloads
(3) Supports proxy servers
(4) Simple and convenient to configure
(5) Small program, completely free
Although wget is powerful, it is relatively simple to use. The basic syntax is: wget [option list] URL. The following uses concrete examples to illustrate how to use wget.
1. Download an entire HTTP or FTP site.
wget http://place.your.url/here
This command downloads the home page at http://place.your.url/here. The -x option forces creation of a local directory hierarchy identical to the server's; the -nd option instead puts everything downloaded from the server into the current local directory.
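For illustration, a minimal sketch of the two layouts, using the same placeholder URL as above:
wget -x http://place.your.url/here    # recreate the server's directory tree locally
wget -nd http://place.your.url/here   # put everything into the current directory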
wget -r http://place.your.url/here
This command recursively downloads all directories and files on the server, essentially downloading the whole site. Use it with care: everything the downloaded pages point to is also downloaded, so if the site references other sites, those referenced sites are downloaded as well. For this reason the option is rarely used on its own. You can use the -l number option to limit the recursion depth; for example, to download only two levels, use -l 2.
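A depth-limited sketch, again with the placeholder URL; the depth of 2 is just an example value:
wget -r -l 2 http://place.your.url/here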
If you want to mirror a site, use the -m option, for example: wget -m http://place.your.url/here
wget then automatically chooses the options appropriate for mirroring; it will connect to the server, read robots.txt, and follow its rules.
2. Resuming interrupted downloads.
When a file is very large or the network is very slow, the connection is often cut before the file has finished downloading; this is when resuming is needed. wget resumes automatically, you only need the -c option, for example:
wget -c http://the.url.of/incomplete/file
Resuming requires the server to support it as well. The -t option sets the number of retries; to retry 100 times, write -t 100, and setting -t 0 retries indefinitely until the connection succeeds. The -T option sets the timeout; for example, -T 120 means a connection attempt that gets no response within 120 seconds is considered to have timed out.
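A combined sketch, assuming the same placeholder URL and example values for the retry and timeout settings:
wget -c -t 0 -T 120 http://the.url.of/incomplete/file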
3. Batch downloads.
If you have several files to download, generate a file with one URL per line, for example download.txt, and then run: wget -i download.txt
This downloads every URL listed in download.txt (if a line points to a file, that file is downloaded; if it points to a site, the site's home page is downloaded).
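A minimal sketch of the whole workflow; the file paths here are hypothetical, only the host name is the placeholder used above:
echo http://place.your.url/file1 >  download.txt   # one URL per line
echo http://place.your.url/file2 >> download.txt
wget -i download.txt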
4. Selective downloads.
You can tell wget to download only certain kinds of files, or to skip certain kinds of files. For example:
wget -m --reject=gif http://target.web.site/subdirectory
This downloads http://target.web.site/subdirectory but ignores GIF files. --accept=list gives the file types to accept; --reject=list gives the file types to reject.
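The accept form works the same way; a sketch with hypothetical file extensions and the same placeholder URL:
wget -r --accept=pdf,jpg http://target.web.site/subdirectory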
5. Passwords and authentication.
wget can handle only sites that restrict access with a username and password; two options are available for this:
--http-user=user    set the HTTP username
--http-passwd=pass  set the HTTP password
For sites that require certificate-based authentication, you have to use another download tool, such as curl.
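A sketch of password-protected access; the credentials are placeholders and the URL is the one used above (newer wget releases also accept the spelling --http-password):
wget --http-user=myname --http-passwd=mypassword http://place.your.url/here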
6. Downloading through a proxy server.
If your network access goes through a proxy server, you can have wget download files via that proxy. Create a .wgetrc file in the current user's home directory and set the proxy server in it:
http-proxy = 111.111.111.111:8080
ftp-proxy = 111.111.111.111:8080
These set the HTTP proxy server and the FTP proxy server, respectively. If the proxy server requires a username and password, use these two options:
--proxy-user=user    set the proxy username
--proxy-passwd=pass  set the proxy password
Use the --proxy=on/off option to enable or disable use of the proxy.
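A sketch putting the pieces together: with the proxy address already set in .wgetrc as above, a download through an authenticated proxy might look like this. The credentials and URL are placeholders, and newer wget releases spell the second option --proxy-password:
wget --proxy=on --proxy-user=myname --proxy-passwd=mypassword http://place.your.url/here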
wget has many other useful features waiting for users to explore.
Appendix:
Command format:
wget [option list] [URL]
-V, --version             display the version number and exit
-h, --help                display help information
-e, --execute=COMMAND     execute a ".wgetrc"-style command
-o, --output-file=FILE    log output messages to FILE
-a, --append-output=FILE  append output messages to FILE
-d, --debug               print debug output
-q, --quiet               suppress output
-i, --input-file=FILE     read URLs from FILE
-t, --tries=NUMBER        number of retries (0 means unlimited)
-O, --output-document=FILE  save the download under a different file name
-nc, --no-clobber         do not overwrite existing files
-N, --timestamping        only download files newer than the local copies
-T, --timeout=SECONDS     set the timeout
-Y, --proxy=on/off        turn the proxy on or off
-nd, --no-directories     do not create directories
-x, --force-directories   force directory creation
--http-user=USER          set the HTTP username
--http-passwd=PASS        set the HTTP password
--proxy-user=USER         set the proxy username
--proxy-passwd=PASS       set the proxy password
-r, --recursive           download an entire site or directory (use with caution)
-l, --level=NUMBER        recursion depth
-A, --accept=LIST         file types to accept
-R, --reject=LIST         file types to reject
-D, --domains=LIST        domains to accept
--exclude-domains=LIST    domains to reject
-L, --relative            follow relative links only
--follow-ftp              follow FTP links from HTML documents
-H, --span-hosts          allow downloading from other hosts
-I, --include-directories=LIST  directories to allow
-X, --exclude-directories=LIST  directories to exclude
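As a hedged illustration of how a few of these options combine on one command line; the file types are hypothetical and the URL is the placeholder used earlier:
wget -r -l 2 -nc -A jpg,png http://place.your.url/here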
wget is a very useful command for downloading online resources under Linux.
wget is used in the following form:
wget [option list] URL
First, let's introduce wget's main options:
· -b: run wget in the background; output is logged to the file wget-log in the current directory;
· -t [number of times]: the number of times wget attempts to connect when it cannot establish a connection to the server. For example, "-t 120" means try up to 120 times. Setting this to 0 means retry indefinitely until the connection succeeds; this is useful when the remote server suddenly shuts down or the network drops, since the files that were not yet transferred can be downloaded once things return to normal;
· -c: resume an interrupted download. This is also a very useful setting, especially for large files: if the transfer is interrupted unexpectedly, the next run continues from where the previous transfer stopped rather than starting from scratch. Using this requires the remote server to support resuming as well; generally speaking, Unix/Linux-based web/FTP servers all do;
· -T [number of seconds]: the timeout; specifies how long to wait for the remote server to respond before disconnecting and starting the next attempt. For example, "-T 120" means that if the remote server has sent no data after 120 seconds, the connection is retried. If the network is fast this can be set shorter; otherwise set it longer. It generally should not exceed 900 and usually should not be below 60; around 120 is appropriate in most cases;
· -w [number of seconds]: how many seconds to wait between two attempts; for example, "-w 100" waits 100 seconds between attempts;
· -Y on/off: connect, or do not connect, through the proxy server;
· -Q [bytes]: limit the total size of downloaded files; for example, "-Q2k" means no more than 2 KB and "-Q3m" means at most 3 MB; if no unit follows the number the value is in bytes, so "-Q200" means at most 200 bytes;
· -nd: do not recreate the directory structure; all files downloaded from the specified directories on the server are placed together in the current directory;
· -x: the opposite of the -nd setting; create the complete directory structure. For example, "wget -x http://www.gnu.org" creates a www.gnu.org subdirectory in the current directory and then builds the server's actual directory hierarchy level by level until all files have been downloaded;
· -nH: do not create a directory named after the target host's domain name; the target host's directory structure is placed directly under the current directory;
· --http-user=username
· --http-passwd=password: if the web server requires a username and password, set them with these two options;
· --proxy-user=username
· --proxy-passwd=password: if the proxy server requires a username and password, use these two options;
· -r: download recursively, recreating the server-side directory structure on the local machine;
· -l [depth]: the depth to which the remote server's directory structure is downloaded; for example, "-l 5" downloads directories and files at a depth of 5 or less;
· -m: the site-mirroring option; if you want to mirror a site, use this option and wget will automatically set the other options appropriate for mirroring;
· -np: download only the contents of the specified directory and its subdirectories on the target site. This is also a very useful option: suppose someone's personal home page links to other people's home pages, and we only want to download that one person's pages; without this option it is even possible to end up crawling the entire site, which is clearly not what we usually want;
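As a sketch combining several of the options above, resuming a large download in the background with unlimited retries and a 120-second timeout (the URL is the placeholder used earlier, and the values are just examples):
wget -b -c -t 0 -T 120 http://place.your.url/here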
· How to set the proxy server used by wget
wget reads many of its settings from the user configuration file ".wgetrc"; here we are mainly interested in using this file to set up a proxy server. The ".wgetrc" file in the home directory of whichever user is logged in is the one that takes effect. For example, if "root" wants to use ".wgetrc" to configure a proxy server, the file is "/root/.wgetrc". The following gives the contents of a sample ".wgetrc" file; readers can write their own ".wgetrc" using it as a reference:
http-proxy = 111.111.111.111:8080
ftp-proxy = 111.111.111.111:8080
These two lines mean that the proxy server's IP address is 111.111.111.111 and its port is 8080. The first line specifies the proxy server used for the HTTP protocol, and the second line specifies the proxy server used for the FTP protocol.
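As a related sketch (not from the original text): most wget builds also honor the standard proxy environment variables, so the same placeholder proxy address could be set per shell session instead of in ".wgetrc":
export http_proxy=http://111.111.111.111:8080/   # proxy for HTTP downloads
export ftp_proxy=http://111.111.111.111:8080/    # proxy for FTP downloads
wget http://place.your.url/here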