Wget command usage in detail

Source: Internet
Author: User
Tags ftp protocol

Use of wget

wget offers: (1) resumable downloads; (2) both FTP and HTTP downloads; (3) proxy server support; (4) simple configuration; (5) a small program that is completely free of charge.

Command format:

wget [option list] [URL of the target file or website]

1. Startup parameters

These options mainly provide basic information about the program:

-V, --version: display the version number and exit;
-h, --help: display help information;
-e, --execute=COMMAND: execute a ".wgetrc" command.

Each option has a short form and a long form; the two are equivalent and either can be used. Note that -e executes a .wgetrc command. The .wgetrc commands are in effect a list of settings, and they can be written together directly in the .wgetrc file.

2. File processing parameters

These options control where the program's log output goes and where URLs are read from:

-o, --output-file=FILE: save the program's output messages to FILE;
-a, --append-output=FILE: append the output messages to FILE;
-d, --debug: display debugging output;
-q, --quiet: display no output;
-i, --input-file=FILE: read the URLs to download from FILE.

The options above are useful to attackers. Let's look at how they are used:



Example 1: download the home page of 192.168.1.168 and display debugging information
wget -d http://192.168.1.168

Example 2: download the home page of 192.168.1.168 without displaying any information
wget -q http://192.168.1.168

Example 3: download every URL listed in filelist.txt
wget -i filelist.txt


Example 4: mirror a site to a depth of 5 levels without ascending to the parent directory; content on other sites linked from this site is not downloaded
wget -np -m -l5 http://jpstone.bokee.com

3. Download parameters

The download options set the number of download attempts, the name of the saved file, and similar behaviour:

-t, --tries=NUMBER: number of download attempts (0 means unlimited);
-O, --output-document=FILE: save the downloaded document under a different file name;
-nc, --no-clobber: do not overwrite existing files;
-N, --timestamping: download only files newer than the local copy;
-T, --timeout=SECONDS: set the timeout in seconds;
-Y, --proxy=on/off: turn the proxy on or off.

Example: download the home page of 192.168.1.168 and save the messages printed during the download to the file test.htm
wget -o test.htm http://192.168.1.168
(To save the page itself as test.htm instead, use -O test.htm.)

4. Directory parameters

Directory options set how the directory where downloaded files are saved corresponds to the directory of the original files on the server:

-nd, --no-directories: do not create directories;
-x, --force-directories: force directory creation.

The directory options may not be obvious at first, so let's look at an example.

Example: download the home page of 192.168.1.168 and keep the website's directory structure
wget -x http://192.168.1.168
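To make the effect of the directory options concrete, here is a small pure-shell sketch that imitates where wget places a file. It never calls wget; the function name local_path and the sample URL are assumptions for illustration, and the mapping follows the behaviour the wget manual describes for -x, -nd and -nH (the last is listed in the option reference further down).

```shell
# local_path MODE URL: print the local path wget would use for URL.
#   x   -> -x      (force directories):  host/dir/file
#   nd  -> -nd     (no directories):     file
#   nH  -> -x -nH  (no host directory):  dir/file
# Sketch only: assumes the URL has a scheme and at least one path component.
local_path() {
    mode=$1
    rest=${2#*://}      # strip the scheme, e.g. "http://"
    host=${rest%%/*}    # everything before the first slash
    path=${rest#*/}     # everything after the first slash
    case $mode in
        x)  printf '%s/%s\n' "$host" "$path" ;;
        nd) printf '%s\n' "${path##*/}" ;;
        nH) printf '%s\n' "$path" ;;
    esac
}

url=http://192.168.1.168/docs/img/logo.png
local_path x  "$url"   # prints 192.168.1.168/docs/img/logo.png
local_path nd "$url"   # prints logo.png
local_path nH "$url"   # prints docs/img/logo.png
```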




5. HTTP parameters

These options set attributes related to HTTP downloads:

--http-user=USER: set the HTTP user name;
--http-passwd=PASS: set the HTTP password;
--proxy-user=USER: set the proxy user name;
--proxy-passwd=PASS: set the proxy password.

These options set the user name and password for HTTP and for the proxy. Newer wget releases spell the password options --http-password and --proxy-password.

6. Recursive parameters

When downloading a whole website or one directory of a website, we need to control how many levels deep the download goes; these options set that:

-r, --recursive: download an entire website or directory (use with caution);
-l, --level=NUMBER: number of levels to download.

Example: download the entire website (by default wget recurses at most 5 levels; use -l to change this)
wget -r http://192.168.1.168

7. Recursive accept and reject parameters

When downloading a website, you can choose which files to fetch, images and sound files for example, so that the download finishes as quickly as possible; these options set that:

-A, --accept=LIST: file types to accept;
-R, --reject=LIST: file types to reject;
-D, --domains=LIST: domain names to accept;
--exclude-domains=LIST: domain names to reject;
-L, --relative: follow relative links only;
--follow-ftp: follow FTP links found in HTML documents;
-H, --span-hosts: allow downloads from external hosts;
-I, --include-directories=LIST: directories to allow;
-X, --exclude-directories=LIST: directories to reject.
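As a rough sketch of how the accept and reject lists combine with recursion, the two helpers below wrap wget with -A and -R. The helper names, the depth of 2, and the suffix lists are illustrative assumptions; both lists are comma-separated, as wget expects.

```shell
# fetch_images URL: hypothetical helper. Recurse two levels (-r -l 2),
# stay below the starting directory (-np), and keep only files whose
# names end in one of the listed suffixes (-A).
fetch_images() {
    wget -r -l 2 -np -A 'jpg,jpeg,png,gif' "$1"
}

# skip_media URL: the inverse. Take everything except the listed
# audio and video suffixes (-R).
skip_media() {
    wget -r -l 2 -np -R 'avi,mp3,wav' "$1"
}
```

For instance, `fetch_images http://192.168.1.168/` would collect only the images reachable within two levels of the example host's home page.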



How to set the proxy server used by wget

wget reads many settings from the user configuration file ".wgetrc". Here we use this file mainly to set the proxy server. The ".wgetrc" file in the home directory of whichever user is logged in is the one that takes effect. For example, if the "root" user wants to set the proxy server with ".wgetrc", the file is "/root/.wgetrc". The content of a sample ".wgetrc" file is shown below; you can use it as a reference when writing your own:

http_proxy = 111.111.111.111:8080
ftp_proxy = 111.111.111.111:8080

These two lines mean that the proxy server's IP address is 111.111.111.111 and its port number is 8080. The first line specifies the proxy server used for HTTP, and the second line specifies the proxy server used for FTP.
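To try proxy settings without editing the real ~/.wgetrc, one option is to write them to a scratch file and point wget at it through the WGETRC environment variable, which wget consults for the location of its user configuration file. This is a sketch under assumptions: the temporary file and the use_proxy line are additions for illustration, and the proxy address is the sample one from the text.

```shell
# Write a throwaway wgetrc so the real ~/.wgetrc is left alone.
rc=$(mktemp)
cat > "$rc" <<'EOF'
# Route HTTP and FTP requests through the same proxy.
use_proxy = on
http_proxy = http://111.111.111.111:8080/
ftp_proxy = http://111.111.111.111:8080/
EOF

# wget loads the file named by WGETRC instead of ~/.wgetrc.
export WGETRC="$rc"

# Sanity check: count the lines that carry the proxy address.
grep -c ':8080/' "$rc"   # prints 2
```

Any wget command started from the same shell afterwards picks up these settings; `unset WGETRC` restores the normal lookup.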

 



wget user guide

wget is a free tool for automatically downloading files from the network. It supports the HTTP, HTTPS, and FTP protocols and can work through an HTTP proxy.

"Automatic" here means that wget can keep running in the background after the user logs out. You can log in to a system, start a wget download task, and log out; wget continues in the background until the task is complete. Most browsers, by contrast, need the user to stay involved while large amounts of data are downloaded, so this saves a lot of trouble.

wget can follow the links in HTML pages and download them in turn to create a local copy of the remote server, completely recreating the directory structure of the original site. This is often called "recursive download". During a recursive download, wget obeys the Robot Exclusion Standard (/robots.txt). While downloading, wget can also convert links to point to local files, which makes offline browsing easier.

wget is very stable and copes well with narrow bandwidth and unstable networks. If a download fails for network reasons, wget keeps retrying until the whole file has been retrieved. If the server interrupts the download, wget reconnects and continues from where it stopped. This is very useful for downloading large files from servers that limit connection time.

Common wget usage
wget usage format

Usage: wget [OPTION]... [URL]...

Mirror a site with wget:
wget -r -p -np -k http://dsec.pku.edu.cn/~usr_name/
# or
wget -m http://www.tldp.org/LDP/abs/html/
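When mirroring a site you do not run yourself, it may be worth slowing the crawl down. The wrapper below is a minimal sketch: the name polite_mirror, the 2-second wait and the 200k rate cap are illustrative assumptions, not values from the article.

```shell
# polite_mirror URL: hypothetical wrapper around a site mirror.
#   -m    mirror (recursion plus timestamping)
#   -p    also fetch the images and other files each page needs
#   -k    rewrite links so the local copy can be browsed offline
#   -np   never ascend above the starting directory
#   -w 2  pause 2 seconds between retrievals (illustrative value)
#   --limit-rate=200k  cap the download speed (illustrative value)
polite_mirror() {
    wget -m -p -k -np -w 2 --limit-rate=200k "$1"
}
```

It would be called as `polite_mirror http://www.tldp.org/LDP/abs/html/`.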


Resume partially downloaded files over an unstable network, or download during idle hours:
wget -t 0 -w 31 -c http://dsec.pku.edu.cn/BBC.avi -o down.log &
# or read the list of files to download from filelist.txt
wget -t 0 -w 31 -c -B ftp://dsec.pku.edu.cn/linuxsoft -i filelist.txt -o down.log &

The commands above can also be used to download while the network is relatively idle. My own routine: in Mozilla, copy the links that are inconvenient to download right away to the clipboard and paste them into filelist.txt, then run the second command above before logging out at night.
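The overnight routine can be wrapped in a small helper. This sketch uses wget's own -b option to detach instead of the shell's `&`; the helper name nightly_fetch is an assumption, while the retry, wait and resume settings mirror the commands above.

```shell
# nightly_fetch LIST LOG: hypothetical helper for an overnight batch.
#   -t 0   retry without limit
#   -w 31  wait 31 seconds between retrievals
#   -c     resume partially downloaded files
#   -b     detach and keep running in the background
#   -i     read the URLs from LIST
#   -o     write the log to LOG
nightly_fetch() {
    wget -t 0 -w 31 -c -b -i "$1" -o "$2"
}
```

After `nightly_fetch filelist.txt down.log` you can log out; `tail -f down.log` shows the progress the next time you log in.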

Download through a proxy:
wget -Y on -p -k https://sourceforge.net/projects/wvware/

The proxy can be set in an environment variable or in the wgetrc file.

# Set the proxy in an environment variable
export http_proxy=http://211.90.168.94:8080/
# Set the proxy in ~/.wgetrc
http_proxy = http://proxy.yoyodyne.com:18023/
ftp_proxy = http://proxy.yoyodyne.com:18023/

wget options by category
Startup
-V, --version: display the wget version and exit
-h, --help: print syntax help
-b, --background: go to the background after startup
-e, --execute=COMMAND: execute a command in '.wgetrc' format; for the wgetrc format see /etc/wgetrc or ~/.wgetrc

Logging and input file
-o, --output-file=FILE: write the log to FILE
-a, --append-output=FILE: append the log to FILE
-d, --debug: print debugging output
-q, --quiet: quiet mode (no output)
-v, --verbose: verbose mode (this is the default)
-nv, --non-verbose: turn off verbose mode without being completely quiet
-i, --input-file=FILE: download the URLs found in FILE
-F, --force-html: treat the input file as an HTML file
-B, --base=URL: use URL as the prefix for relative links in the file given by -F -i
--sslcertfile=FILE: optional client certificate
--sslcertkey=KEYFILE: optional key file for the client certificate
--egd-file=FILE: file name of the EGD socket

Download
--bind-address=ADDRESS: bind to a local address (host name or IP address; useful when the machine has several local IP addresses or names)
-t, --tries=NUMBER: set the maximum number of attempts (0 means unlimited)
-O, --output-document=FILE: write the document to FILE
-nc, --no-clobber: do not overwrite existing files or use .# suffixes
-c, --continue: resume a partially downloaded file
--progress=TYPE: select the progress indicator type
-N, --timestamping: do not re-retrieve files unless they are newer than the local copy
-S, --server-response: print the server response
--spider: do not download anything
-T, --timeout=SECONDS: set the response timeout in seconds
-w, --wait=SECONDS: wait SECONDS between retrievals
--waitretry=SECONDS: wait 1...SECONDS between retries
--random-wait: wait 0...2*WAIT seconds between retrievals
-Y, --proxy=on/off: turn the proxy on or off
-Q, --quota=NUMBER: set the download quota
--limit-rate=RATE: limit the download rate

Directories
-nd, --no-directories: do not create directories
-x, --force-directories: force directory creation
-nH, --no-host-directories: do not create host directories
-P, --directory-prefix=PREFIX: save files to PREFIX/...
--cut-dirs=NUMBER: ignore NUMBER levels of remote directories

HTTP options
--http-user=USER: set the HTTP user name to USER
--http-passwd=PASS: set the HTTP password to PASS
-C, --cache=on/off: allow or disallow server-side cached data (normally allowed)
-E, --html-extension: save all text/html files with the .html extension
--ignore-length: ignore the 'Content-Length' header field
--header=STRING: insert STRING among the headers
--proxy-user=USER: set the proxy user name to USER
--proxy-passwd=PASS: set the proxy password to PASS
--referer=URL: include a 'Referer: URL' header in the HTTP request
-s, --save-headers: save the HTTP headers to the file
-U, --user-agent=AGENT: identify as AGENT instead of Wget/VERSION
--no-http-keep-alive: disable HTTP keep-alive (persistent connections)
--cookies=off: do not use cookies
--load-cookies=FILE: load cookies from FILE before the session starts
--save-cookies=FILE: save cookies to FILE after the session ends

FTP options
-nr, --dont-remove-listing: do not remove '.listing' files
-g, --glob=on/off: turn file name globbing on or off
--passive-ftp: use passive transfer mode (default)
--active-ftp: use active transfer mode
--retr-symlinks: when recursing, download the files that links point to (not directories)

Recursive download
-r, --recursive: recursive download; use with caution!
-l, --level=NUMBER: maximum recursion depth (inf or 0 means infinite)
--delete-after: delete files locally after downloading them
-k, --convert-links: convert non-relative links to relative links
-K, --backup-converted: before converting file X, back it up as X.orig
-m, --mirror: equivalent to -r -N -l inf -nr
-p, --page-requisites: download all images and other files needed to display the HTML page

Recursive accept/reject
-A, --accept=LIST: comma-separated list of accepted extensions
-R, --reject=LIST: comma-separated list of rejected extensions
-D, --domains=LIST: comma-separated list of accepted domains
--exclude-domains=LIST: comma-separated list of rejected domains
--follow-ftp: follow FTP links in HTML documents
--follow-tags=LIST: comma-separated list of HTML tags to follow
-G, --ignore-tags=LIST: comma-separated list of HTML tags to ignore
-H, --span-hosts: go to external hosts when recursing
-L, --relative: follow relative links only
-I, --include-directories=LIST: list of allowed directories
-X, --exclude-directories=LIST: list of excluded directories
-np, --no-parent: do not ascend to the parent directory
