Reprinted: How to Use wget

Source: Internet
Author: User
Tags: mirror, website, ftp, protocol
The following is a detailed description of how to use wget. The basic syntax is: wget [parameter list] URL
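For example, in its simplest form wget takes just a URL and saves the page into the current directory; a minimal sketch using a site mentioned later in this article:
wget http://www.gnu.org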
First, we will introduce the main parameters of wget:
·-b: run wget in the background. The log is written to the "wget-log" file in the current directory;
·-t [number of attempts]: number of retries, that is, how many times wget tries when it cannot establish a connection with the server. For example, "-t 120" means try 120 times. Setting this parameter to "0" means retry indefinitely until the connection succeeds, which is very useful when the remote server is suddenly shut down or the network is suddenly interrupted: once things return to normal, the unfinished download can continue;
·-c: resume an interrupted download, which is also a very useful setting. When downloading a large file, if the connection is accidentally interrupted, the download resumes from where it stopped instead of starting over from scratch. This requires the remote server to support resumable transfers as well; generally, UNIX/Linux-based Web/FTP servers do (a combined example of these options follows at the end of this list);
·-T [number of seconds]: time-out period. It specifies how long to wait when the remote server does not respond before the connection is dropped and the next attempt is started. For example, "-T 120" means that if the remote server has sent no data after 120 seconds, wget gives up and tries again. On a fast network you can set a shorter value; on a slow one, a longer value, generally up to 900 and usually not less than 60. A value around 120 is usually appropriate;
·-w [number of seconds]: the number of seconds to wait between two attempts. For example, "-w 100" means wget waits 100 seconds between two attempts;
·-Y on/off: connect through, or without, the proxy server;
·-Q [bytes]: limit the total size of the downloaded files (the retrieval quota). For example, "-Q2k" means no more than 2 KB, and "-Q3m" means no more than 3 MB. If no unit letter is appended, the value is in bytes, so "-Q200" means no more than 200 bytes;
·-nd: do not recreate the directory structure; all files downloaded from any of the specified directories on the server are placed together in the current directory;
·-x: the opposite of "-nd": create the complete directory structure. For example, "wget -x http://www.gnu.org" creates a "www.gnu.org" subdirectory under the current directory and then recreates the server's actual directory structure level by level beneath it until all files have been downloaded;
·-nH: do not create a directory named after the target host's domain name; store the target host's directory structure directly under the current directory;
·--http-user=username
·--http-passwd=password: if the Web server requires a user name and password, use these two parameters;
·--proxy-user=username
·--proxy-passwd=password: if the proxy server requires a user name and password, use these two options;
·-r: download recursively, recreating the server-side directory structure on the local machine;
·-l [depth]: the depth of the remote directory structure to download. For example, "-l 5" downloads directories and files no more than 5 levels deep;
·-m: the site-mirroring option. If you want to mirror a site, use this option; it automatically turns on the other options appropriate for mirroring;
·-np: only download content under the specified directory of the target site and its subdirectories. This is also a very useful option. Suppose a personal home page contains a link to another personal home page on the same site, and we only want to download the first one; without this option, the crawl may even pull in the whole site, which is obviously not what we usually want;
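Putting several of the options above together, here is a rough sketch of a resumable download running in the background with generous retry settings (the URL is only a placeholder, not taken from the original article):
wget -b -c -t 0 -T 120 -w 30 http://www.gnu.org/a-large-file.tar.gz
Because of "-b", progress is written to the "wget-log" file in the current directory rather than to the screen.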
How to set the proxy server used by wget
wget can read many of its settings from the user configuration file ".wgetrc"; here we mainly use this file to set the proxy server. The ".wgetrc" file in the user's home directory takes effect automatically. For example, if the "root" user wants to use ".wgetrc" to set the proxy server, the file is "/root/.wgetrc". The following is a sample ".wgetrc" file; you can refer to this example to write your own ".wgetrc" file:
http_proxy = 111.111.111.111:8080
ftp_proxy = 111.111.111.111:8080
These two lines indicate that the proxy server's IP address is 111.111.111.111 and its port number is 8080. The first line specifies the proxy server used by the HTTP protocol, and the second line specifies the proxy server used by the FTP protocol.
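Conversely, if a proxy is configured in ".wgetrc" but you want to connect directly for a single run, the "-Y off" option described above should do it; a hedged sketch (placeholder URL):
wget -Y off http://www.gnu.org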

Usage: wget [OPTION]... [URL]...
Mandatory arguments to long options are mandatory for short options too.
Startup:
-V, --version display the version of Wget and exit.
-h, --help print this help.
-b, --background go to background after startup.
-e, --execute=COMMAND execute a '.wgetrc' command.
Logging and input files:
-o, --output-file=FILE log messages to FILE.
-a, --append-output=FILE append messages to FILE.
-d, --debug print debugging output.
-q, --quiet quiet (no output).
-v, --verbose be verbose (this is the default).
-nv, --non-verbose turn off verboseness, without being quiet.
-i, --input-file=FILE read URLs from FILE.
-F, --force-html treat the input file as HTML.
Download:
-t, --tries=NUMBER set the number of retries to NUMBER (0 means unlimited).
-O, --output-document=FILE write documents to FILE.
-nc, --no-clobber don't clobber existing files.
-c, --continue resume getting an existing, partially downloaded file.
--dot-style=STYLE set the retrieval display style.
-N, --timestamping don't re-retrieve files unless newer than the local copies.
-S, --server-response print the server response.
--spider don't download anything.
-T, --timeout=SECONDS set the read timeout to SECONDS.
-w, --wait=SECONDS wait SECONDS between retrievals.
-Y, --proxy=on/off turn the proxy on or off.
-Q, --quota=NUMBER set the retrieval quota to NUMBER.
Directories:
-nd, --no-directories don't create directories.
-x, --force-directories force creation of directories.
-nH, --no-host-directories don't create host directories.
-P, --directory-prefix=PREFIX save files to PREFIX/...
--cut-dirs=NUMBER ignore NUMBER remote directory components.
HTTP options:
--http-user=USER set the HTTP user to USER.
--http-passwd=PASS set the HTTP password to PASS.
-C, --cache=on/off allow or disallow server-cached data (normally allowed).
--ignore-length ignore the 'Content-Length' header field.
--proxy-user=USER set USER as the proxy user name.
--proxy-passwd=PASS set PASS as the proxy password.
-s, --save-headers save the HTTP headers to the file.
-U, --user-agent=AGENT identify as AGENT instead of Wget/VERSION.
FTP options:
--retr-symlinks retrieve FTP symbolic links.
-g, --glob=on/off turn file name globbing on or off.
--passive-ftp use the "passive" transfer mode.
Recursive retrieval:
-r, --recursive recursive download -- use with care!
-l, --level=NUMBER maximum recursion depth (0 means unlimited).
--delete-after delete downloaded files after retrieval.
-k, --convert-links convert non-relative links to relative ones.
-m, --mirror turn on options suitable for mirroring.
-nr, --dont-remove-listing don't remove the '.listing' files.
Recursive accept/reject:
-A, --accept=LIST list of accepted extensions.
-R, --reject=LIST list of rejected extensions.
-D, --domains=LIST list of accepted domains.
--exclude-domains=LIST list of rejected domains (comma-separated).
-L, --relative follow relative links only.
--follow-ftp follow FTP links from HTML documents.
-H, --span-hosts go to foreign hosts when recursive.
-I, --include-directories=LIST list of allowed directories.
-X, --exclude-directories=LIST list of excluded directories.
-nh, --no-host-lookup don't look up hosts via DNS.
-np, --no-parent don't ascend to the parent directory.
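As a rough illustration of the recursive accept/reject options above (the URL and extension list are placeholders, not taken from the original article), downloading only ".pdf" and ".ps" files from a site, at most two levels deep, might look like:
wget -r -l 2 -A pdf,ps www.gnu.org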
Example 1: mirror a website:
wget -r www.redhat.com
Example 2: mirror a directory under a website:
wget -r www.redhat.com/mirrors/LDP
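If you want only that directory and nothing above it on the same site, the "-np" option described earlier can be added; a minimal sketch of the same command:
wget -r -np www.redhat.com/mirrors/LDP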



1. Set the proxy server through environment variables:
export http_proxy="166.111.53.167:3128"
export ftp_proxy="166.111.53.167:3128"
2. You can also create a separate .wgetrc for wget:
http_proxy = 166.111.53.167:3128
ftp_proxy = 166.111.53.167:3128
3. Use wget to download an entire site:
# wget -k -m -np -d --proxy-user=usrname --proxy-passwd=passwd http://www.hq.nasa.gov/office/pao/History/SP-468/contents.htm
-k, --convert-links: convert absolute links to relative links.
-m: equivalent to recursive download (-r) + timestamping (-N, do not re-retrieve a file unless the remote copy is newer) + unlimited recursion depth (-l inf) + keeping the '.listing' files (-nr).
-np, --no-parent: do not ascend to the parent directory.
Note that -d only prints debugging output while downloading; change it to -q for quiet mode.
Two other options may also be useful, as in the sketch below:
-b: run wget in the background.
-c: resume an interrupted download.
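Putting these together, a hedged sketch of running the same mirror in the background with resume enabled (the user name, password, and URL are the placeholders from the example above):
# wget -b -c -k -m -np --proxy-user=usrname --proxy-passwd=passwd http://www.hq.nasa.gov/office/pao/History/SP-468/contents.htm
The download log is then written to the "wget-log" file in the current directory.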



