Linux: wget usage

Source: Internet
Author: User
Tags: ftp, site

wget is an open-source program that originated on Linux, written by Hrvoje Niksic, and has since been ported to many platforms, including Windows. It has the following features:

(1) Resumable downloads are supported. This was once the biggest selling point of NetAnts and FlashGet; wget offers it as well, so users with unreliable connections can rest assured.
(2) Both FTP and HTTP downloads are supported. Although most software can be fetched over HTTP, sometimes you still need FTP.
(3) Proxy servers are supported. Systems with high security requirements are usually not exposed directly to the Internet, so proxy support is essential for downloading software.
(4) It is easy to set up. Users accustomed to graphical interfaces may find the command line unfamiliar, but for configuration the command line actually has the advantage: it saves a few mouse clicks, and you do not have to worry about mis-clicks.
(5) The program is small and completely free. Size hardly matters now that disks are so large, but being truly free does: many so-called free programs on the network come bundled with advertisements we would rather avoid.

Although wget is powerful, using it is relatively simple. The basic syntax is: wget [parameter list] URL. The following concrete examples show how to use wget.

1. Download an entire http or ftp site.
wget http://place.your.url/here
This command downloads the home page of http://place.your.url/here. Adding -x forces wget to recreate the server's directory structure locally; with -nd, everything downloaded from the server is placed directly in the local directory instead.
wget -r http://place.your.url/here
This command downloads recursively, fetching every directory and file on the server; in essence it downloads the whole site. Use it with caution: every address the downloaded pages point to is downloaded as well, so if the site references other sites, those are downloaded too. For this reason the option is rarely used on its own. You can limit the recursion depth with the -l NUMBER parameter; for example, to go only two levels deep, use -l 2.
If you want to make a mirror of a site, use the -m parameter, for example:
wget -m http://place.your.url/here
In this case wget automatically chooses the parameters appropriate for mirroring, and it also reads the server's robots.txt and follows its rules.

2. Resumable downloads.
When a file is very large or the network is very slow, the connection is often cut before the download finishes, and the transfer has to be resumed. Resuming is automatic in wget; just add the -c parameter, for example:
wget -c http://the.url.of/incomplete/file
Resuming requires support from the server. The -t parameter sets the number of retries; to retry 100 times, write -t 100, and -t 0 means retry indefinitely until the connection succeeds. The -T parameter sets the timeout; for example, -T 120 means a connection attempt is abandoned after 120 seconds.
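The depth-limit, mirroring, and resume options above can be combined in one command. The following is only a minimal sketch; the URL is the placeholder used throughout this article and the depth of 2 is purely illustrative:

# download two levels of the site, stay below the start directory (-np),
# resume any interrupted files (-c), and rewrite links for local browsing (-k)
wget -r -l 2 -np -c -k http://place.your.url/here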
For example:
wget -t 0 -w 31 -c http://dsec.pku.edu.cn/BBC.avi -o down.log &
-t 0: retry indefinitely
-w 31: wait 31 seconds between attempts
-c: resume the partial download

3. Batch download.
If there are several files to download, you can put them in a file, one URL per line, for example a file named download.txt. Then run:
wget -i download.txt
and wget downloads every URL listed in download.txt. (If a line is a file, the file is downloaded; if a line is a website, the home page is downloaded.)

4. Selective download.
You can tell wget to download only certain types of files, or to skip certain types. For example:
wget -m --reject=gif http://target.web.site/subdirectory
downloads http://target.web.site/subdirectory but skips gif files. --accept=LIST lists the file types to accept, and --reject=LIST lists the file types to reject.

5. Passwords and authentication.
For websites restricted by user name and password, wget provides two parameters:
--http-user=USER sets the HTTP user
--http-passwd=PASS sets the HTTP password
For websites that require certificate authentication, you can only use another download tool, such as curl.
For example:
wget ftp://username:pwd@200.100.0.100 -O $path/list.html -a $logfile
-O $path/list.html: the output file
-a $logfile: the log file

6. Download through a proxy server.
If your network has to go through a proxy server, you can have wget download files through it. Create a .wgetrc file in the current user's home directory and set the proxy servers in it:
http-proxy = 111.111.111.111:8080
ftp-proxy = 111.111.111.111:8080
These specify the HTTP proxy and the FTP proxy respectively. If the proxy server requires a password, use:
--proxy-user=USER to set the proxy user
--proxy-passwd=PASS to set the proxy password
Use the --proxy=on/off parameter to enable or disable the proxy.

wget has many more useful features waiting to be discovered.
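As a worked example of points 3 and 4 above, the commands below combine batch and selective downloading. This is only a sketch; download.txt, batch.log, and the URLs are hypothetical names, not taken from the article:

# download.txt contains one URL per line, e.g.:
#   http://place.your.url/files/a.zip
#   http://place.your.url/files/b.zip
# fetch every URL in the list, resuming interrupted files and appending a log
wget -c -i download.txt -a batch.log &
# mirror a directory while skipping gif images
wget -m --reject=gif http://place.your.url/pictures/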
==========================================================

wget usage: wget [OPTION]... [URL]...

* Use wget to mirror a site:
wget -r -p -np -k http://dsec.pku.edu.cn/~usr_name/
or
wget -m http://dsec.pku.edu.cn/~usr_name/

* Download a partially downloaded file on an unstable network, or download during idle hours:
wget -t 0 -w 31 -c http://dsec.pku.edu.cn/BBC.avi -o down.log &
or read the list of files to download from filelist.txt:
wget -t 0 -w 31 -c -B ftp://dsec.pku.edu.cn/linuxsoft -i filelist.txt -o down.log &
The commands above are also useful when the network is relatively idle. My own habit is to copy URLs that are inconvenient to download at the moment from Mozilla, paste them into filelist.txt, and run the second command before leaving the machine for the night.

* Download through a proxy:
wget -Y on -p -k URL
# set the proxy in an environment variable (wget reads http_proxy)
export http_proxy=http://211.90.168.94:8080/
# or set the proxy in ~/.wgetrc
http_proxy = http://proxy.yoyodyne.com:18023/
ftp_proxy = http://proxy.yoyodyne.com:18023/

wget options by category:

* Startup
-V, --version           display the version of wget and exit
-h, --help              print this help
-b, --background        go to background after startup
-e, --execute=COMMAND   execute a `.wgetrc'-style command; for the wgetrc format see /etc/wgetrc or ~/.wgetrc

* Logging and input file
-o, --output-file=FILE      write log messages to FILE
-a, --append-output=FILE    append log messages to FILE
-d, --debug                 print debug output
-q, --quiet                 quiet mode (no output)
-v, --verbose               verbose mode (this is the default)
-nv, --non-verbose          turn off verbose output, without being completely quiet
-i, --input-file=FILE       download the URLs listed in FILE
-F, --force-html            treat the input file as HTML
-B, --base=URL              use URL as the prefix for relative links in the file given by -F -i
--sslcertfile=FILE          optional client certificate
--sslcertkey=KEYFILE        optional key file for the client certificate
--egd-file=FILE             file name of the EGD socket

* Download
--bind-address=ADDRESS      bind to ADDRESS (host name or IP) on the local host; useful when the machine has several local addresses or names
-t, --tries=NUMBER          set the maximum number of retries to NUMBER (0 means unlimited)
-O, --output-document=FILE  write the documents to FILE
-nc, --no-clobber           do not overwrite existing files or use .# prefixes
-c, --continue              resume getting a partially downloaded file
--progress=TYPE             select the progress indicator type
-N, --timestamping          do not re-retrieve files unless they are newer than the local copy
-S, --server-response       print the server response
--spider                    do not download anything
-T, --timeout=SECONDS       set the response timeout to SECONDS
-w, --wait=SECONDS          wait SECONDS between retrievals
--waitretry=SECONDS         wait 1...SECONDS between retries of a retrieval
--random-wait               wait 0...2*WAIT seconds between retrievals
-Y, --proxy=on/off          turn the proxy on or off
-Q, --quota=NUMBER          set the download quota to NUMBER
--limit-rate=RATE           limit the download rate to RATE

* Directories
-nd, --no-directories           do not create directories
-x, --force-directories         force creation of directories
-nH, --no-host-directories      do not create host directories
-P, --directory-prefix=PREFIX   save files to PREFIX/...
--cut-dirs=NUMBER               ignore NUMBER remote directory components

* HTTP options
--http-user=USER            set the HTTP user to USER
--http-passwd=PASS          set the HTTP password to PASS
-C, --cache=on/off          allow or disallow server-side data caching (normally allowed)
-E, --html-extension        save all text/html documents with the .html extension
--ignore-length             ignore the `Content-Length' header field
--header=STRING             insert STRING among the request headers
--proxy-user=USER           set the proxy user name to USER
--proxy-passwd=PASS         set the proxy password to PASS
--referer=URL               include a `Referer: URL' header in the HTTP request
-s, --save-headers          save the HTTP headers to the file
-U, --user-agent=AGENT      identify as AGENT instead of Wget/VERSION
--no-http-keep-alive        disable HTTP keep-alive (persistent connections)
--cookies=off               do not use cookies
--load-cookies=FILE         load cookies from FILE before the session starts
--save-cookies=FILE         save cookies to FILE after the session ends

* FTP options
-nr, --dont-remove-listing  do not remove `.listing' files
-g, --glob=on/off           turn file-name globbing on or off
--passive-ftp               use the "passive" transfer mode (the default)
--active-ftp                use the "active" transfer mode
--retr-symlinks             when recursing, retrieve the files symbolic links point to (not directories)

* Recursive download
-r, --recursive             recursive download -- use with care!
-l, --level=NUMBER          maximum recursion depth (inf or 0 means unlimited)
--delete-after              delete downloaded files locally after retrieval
-k, --convert-links         convert non-relative links to relative ones
-K, --backup-converted      before converting file X, back it up as X.orig
-m, --mirror                equivalent to -r -N -l inf -nr
-p, --page-requisites       download all images and other files needed to display the HTML page

* Accept/reject in recursive download
-A, --accept=LIST               comma-separated list of accepted extensions
-R, --reject=LIST               comma-separated list of rejected extensions
-D, --domains=LIST              comma-separated list of accepted domains
--exclude-domains=LIST          comma-separated list of rejected domains
--follow-ftp                    follow FTP links from HTML documents
--follow-tags=LIST              comma-separated list of HTML tags to follow
-G, --ignore-tags=LIST          comma-separated list of HTML tags to ignore
-H, --span-hosts                go to foreign hosts when recursing
-L, --relative                  follow relative links only
-I, --include-directories=LIST  list of allowed directories
-X, --exclude-directories=LIST  list of excluded directories
-np, --no-parent                do not ascend to the parent directory

wget -S --spider url only shows the retrieval process without downloading anything.

==========================================================

Usage examples:
wget -N http://XXX.com/data/ABC.zip -O ABC.zip -a dd.log &
wget -c ftp://username:password@22.11.33.195/ABC.rar
-N: do not re-download a file that already exists locally and is up to date
http://XXX.com/data/ABC.zip: the URL of the file to download
-O ABC.zip: save the download under this name
-a dd.log: append log output to dd.log
-c: continue downloading a file that was not fully downloaded
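Two of the options listed above, --spider and --limit-rate, are handy when scheduling large downloads. The following sketch is only an illustration; the URL and the 200k rate cap are hypothetical values:

# check that the file is reachable without downloading anything
wget -S --spider http://place.your.url/big/file.iso
# then fetch it with resume enabled, the rate capped at 200 KB/s, and the log written to down.log
wget -c --limit-rate=200k -o down.log http://place.your.url/big/file.iso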
