Linux Command: wget

Source: Internet
Author: User

wget is a Linux tool for downloading files from the command line. It is an essential tool for Linux users, who often need to download software or restore a backup from a remote server to a local one. wget supports the HTTP, HTTPS, and FTP protocols and can work through an HTTP proxy. So-called automatic download means that wget can keep running in the background after the user logs out. You can log in, start a wget download task, and log out; wget continues in the background until the task is completed. By comparison, most browsers require the user to stay involved while a large amount of data is downloaded, so this saves a lot of trouble.

wget can follow the links in HTML pages and download them to create a local copy of the remote server, completely recreating the directory structure of the original site. This is often called "recursive download". During recursive download, wget complies with the Robot Exclusion Standard (/robots.txt). While downloading, wget can also convert links so that they point to local files, which makes offline browsing easier.

wget is very stable and adapts well to unstable networks with narrow bandwidth. If a download fails because of a network problem, wget keeps retrying (up to the limit set with --tries) until the entire file is downloaded. If the server interrupts the download, wget reconnects to the server and continues from where the download stopped. This is very useful for downloading large files from servers that limit connection time.

1. Command Format:

wget [options] [URL]

2. Command functions:

wget downloads resources from the network. If no directory is specified, files are saved to the current directory. Although wget is powerful, it is relatively simple to use:

1) Support for resumable transfers. This used to be the biggest selling point of NetAnts and FlashGet. wget offers it too, so users with a poor network connection can rest assured;

2) Both FTP and HTTP downloads are supported. Although most software can be downloaded over HTTP, FTP is still required in some cases;

3) Support for proxy servers. Systems with high security requirements are generally not exposed directly to the Internet, so proxy support is a required feature for a download tool;

4) Easy to configure. Users who are used to graphical interfaces may not be very familiar with the command line. However, the command line has real advantages for configuration: at the very least it saves many mouse clicks, and there is no need to worry about clicking the wrong thing;

5) The program is small and completely free. The small size hardly matters, because hard disks are so large now. Being completely free is worth considering, though: there is plenty of so-called free software on the Internet, but the advertisements bundled with it are not something we like.

3. Command parameters:

Startup parameters:

-V, --version: display the wget version and exit

-h, --help: print syntax help

-b, --background: go to the background after startup

-e, --execute=COMMAND: execute a command in '.wgetrc' format. For the wgetrc format, see /etc/wgetrc or ~/.wgetrc

Logging and input file parameters:

-o, --output-file=FILE: write log messages to FILE

-a, --append-output=FILE: append log messages to FILE

-d, --debug: print debugging output

-q, --quiet: quiet mode (no output)

-v, --verbose: verbose mode (this is the default)

-nv, --non-verbose: turn off verbose mode without being completely quiet

-i, --input-file=FILE: download the URLs found in FILE

-F, --force-html: treat the input file as an HTML file

-B, --base=URL: use URL as the prefix for relative links in the file specified by -F -i

--sslcertfile=FILE: optional client certificate

--sslcertkey=KEYFILE: optional key file for the client certificate

--egd-file=FILE: file name of the EGD socket

Download parameters:

--bind-address=ADDRESS: bind to the local address ADDRESS (host name or IP address; useful when the machine has several local IP addresses or names)

-t, --tries=NUMBER: set the maximum number of attempts to NUMBER (0 means no limit)

-O, --output-document=FILE: write the downloaded document to FILE

-nc, --no-clobber: do not overwrite existing files or use .# suffixes

-c, --continue: continue downloading a partially downloaded file

--progress=TYPE: select the type of progress indicator

-N, --timestamping: do not re-download files unless they are newer than the local copy

-S, --server-response: print the server response

--spider: do not download anything

-T, --timeout=SECONDS: set the response timeout to SECONDS

-w, --wait=SECONDS: wait SECONDS between retrievals

--waitretry=SECONDS: wait 1...SECONDS between retries of a retrieval

--random-wait: wait 0...2*WAIT seconds between retrievals

-Y, --proxy=on/off: turn the proxy on or off

-Q, --quota=NUMBER: set the download quota to NUMBER

--limit-rate=RATE: limit the download rate to RATE

Directory parameters:

-nd, --no-directories: do not create directories

-x, --force-directories: force creation of directories

-nH, --no-host-directories: do not create host directories

-P, --directory-prefix=PREFIX: save files to PREFIX/...

--cut-dirs=NUMBER: ignore NUMBER levels of remote directories

HTTP option parameters:

--http-user=USER: set the HTTP user name to USER

--http-passwd=PASS: set the HTTP password to PASS

-C, --cache=on/off: allow or disallow server-side cached data (normally allowed)

-E, --html-extension: save all text/html documents with the .html extension

--ignore-length: ignore the 'Content-Length' header field

--header=STRING: insert STRING among the headers

--proxy-user=USER: set the proxy user name to USER

--proxy-passwd=PASS: set the proxy password to PASS

--referer=URL: include a 'Referer: URL' header in the HTTP request

-s, --save-headers: save the HTTP headers to the file

-U, --user-agent=AGENT: identify as AGENT instead of Wget/VERSION

--no-http-keep-alive: disable HTTP keep-alive (persistent connections)

--cookies=off: do not use cookies

--load-cookies=FILE: load cookies from FILE before the session starts

--save-cookies=FILE: save cookies to FILE after the session ends

FTP option parameters:

-nr, --dont-remove-listing: do not remove the '.listing' files

-g, --glob=on/off: turn file name globbing on or off

--passive-ftp: use passive transfer mode (default)

--active-ftp: use active transfer mode

--retr-symlinks: when recursing, download the files that symbolic links point to (not directories)

Recursive download parameters:

-r, --recursive: recursive download (use with caution!)

-l, --level=NUMBER: maximum recursion depth (inf or 0 means infinite)

--delete-after: delete files locally after downloading them

-k, --convert-links: convert non-relative links to relative links

-K, --backup-converted: before converting file X, back it up as X.orig

-m, --mirror: equivalent to -r -N -l inf -nr

-p, --page-requisites: download all files, such as images, needed to display the HTML page

Recursive download accept/reject parameters:

-A, --accept=LIST: comma-separated list of accepted extensions

-R, --reject=LIST: comma-separated list of rejected extensions

-D, --domains=LIST: comma-separated list of accepted domains

--exclude-domains=LIST: comma-separated list of rejected domains

--follow-ftp: follow FTP links in HTML documents

--follow-tags=LIST: comma-separated list of HTML tags to follow

-G, --ignore-tags=LIST: comma-separated list of HTML tags to ignore

-H, --span-hosts: go to external hosts when recursing

-L, --relative: follow relative links only

-I, --include-directories=LIST: list of allowed directories

-X, --exclude-directories=LIST: list of excluded directories

-np, --no-parent: do not ascend to the parent directory

wget -S --spider URL: do not download anything, only show the process

4. Examples:

Example 1: Use wget to download a single file

Command:

wget http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Note:

This example downloads a file from the network and saves it in the current directory. A progress bar is displayed during the download, showing the percentage completed, the bytes downloaded, the current download speed, and the estimated time remaining.

Example 2: Use wget -O to download and save under a different file name

Command:

wget -O wordpress.zip "http://www.minjieren.com/download.aspx?id=1080"

Note:

By default, wget names the file after the part of the URL that follows the last "/". For dynamically generated download links, the resulting file name is usually wrong.

Wrong: the following example downloads a file and saves it under the name download.aspx?id=1080:

wget "http://www.minjieren.com/download.aspx?id=1080"

Even though the downloaded file is a zip archive, it is still named download.aspx?id=1080.

Correct: to solve this problem, use the -O option to specify a file name:

wget -O wordpress.zip "http://www.minjieren.com/download.aspx?id=1080"
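
The naming rule described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: default_name and download_as are hypothetical helper names, not part of wget, and the sketch ignores special cases such as URLs that end in "/".

```shell
# Sketch: predict the file name wget picks by default, i.e. the part
# of the URL after the last "/" (the query string is included).
default_name() {
    printf '%s\n' "${1##*/}"
}

# Hypothetical helper: download URL ($2) and save it as FILE ($1).
download_as() {
    wget -O "$1" "$2"
}

# Prints: download.aspx?id=1080
default_name "http://www.minjieren.com/download.aspx?id=1080"
```

Calling download_as wordpress.zip URL then runs the same wget -O command as in the example. The URL is quoted because "?" is a wildcard character in the shell.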

Example 3: Use wget --limit-rate to limit the download speed

Command:

wget --limit-rate=300k http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Note:

By default, wget uses all the bandwidth available for a download. When you are about to download a large file and need to download other files at the same time, limiting the speed is necessary.

Example 4: Use wget -c to resume an interrupted download

Command:

wget -c http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Note:

Restarting an interrupted download with wget -c is very helpful when the download of a large file suddenly fails because of the network or for some other reason. With the -c option, wget continues from the partially downloaded file instead of downloading it again from the beginning.
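
When wget itself gives up, the command can simply be run again with -c. A minimal sketch of automating that, assuming a hypothetical helper named resume_download (the name and the default of 3 runs are illustrative choices, not wget features):

```shell
# Hypothetical helper: keep calling "wget -c" until it succeeds or
# MAX_RUNS (second argument, default 3) runs have been used up.
resume_download() {
    url=$1
    max_runs=${2:-3}
    run=1
    while [ "$run" -le "$max_runs" ]; do
        # -c continues from the partial file left by an earlier run.
        if wget -c "$url"; then
            return 0
        fi
        run=$((run + 1))
    done
    # Every run failed.
    return 1
}
```

For example, resume_download URL 5 would run wget -c up to five times and stop at the first success.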

Example 5: Use wget -b to download in the background

Command:

wget -b http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Note:

When downloading a very large file, use the -b option to download in the background.

wget -b http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Continuing in background, pid 1840.

Output will be written to 'wget-log'.

You can run the following command to view the download progress:

tail -f wget-log
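
The -b option can be combined with -o (see the logging parameters above) to choose the log file name instead of relying on the default wget-log. A small sketch, where background_get is a hypothetical helper name introduced only for illustration:

```shell
# Hypothetical helper: start a background download of URL ($1) and
# write the log to LOG ($2) instead of the default wget-log.
background_get() {
    wget -b -o "$2" "$1"
}
```

After background_get URL download.log, the progress can be followed with tail -f download.log.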

Example 6: Download with a disguised user agent

Command:

wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Note:

Some websites reject download requests when the user agent does not look like a browser. You can use the --user-agent option to disguise it.

Example 7: Use wget --spider to test a download link

Command:

wget --spider URL

Note:

When you plan a scheduled download, you should first test whether the download link will be valid at the scheduled time. Add the --spider option to perform the check.

If the download link is correct, the output looks like this:

wget --spider URL

Spider mode enabled. Check if remote file exists.

HTTP request sent, awaiting response... 200 OK

Length: unspecified [text/html]

Remote file exists and could contain further links,

but recursion is disabled -- not retrieving.

This ensures that the download can be performed at the scheduled time. If you give a wrong link, however, the following error is displayed:

wget --spider URL

Spider mode enabled. Check if remote file exists.

HTTP request sent, awaiting response... 404 Not Found

Remote file does not exist -- broken link!!!

You can use the --spider option in the following situations:

Checking a link before a scheduled download

Checking website availability at regular intervals

Checking a website's pages for dead links
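
For scripted checks like these, the exit status of wget is easier to use than its messages: wget exits with status 0 on success and a non-zero status on failure. A minimal sketch, assuming hypothetical helper names link_ok and check_links:

```shell
# Hypothetical helper: succeed only if wget --spider can reach the URL.
# -q suppresses the messages; the exit status carries the result.
link_ok() {
    wget -q --spider "$1"
}

# Hypothetical helper: print one OK/BROKEN line for each URL given.
check_links() {
    for url in "$@"; do
        if link_ok "$url"; then
            printf 'OK      %s\n' "$url"
        else
            printf 'BROKEN  %s\n' "$url"
        fi
    done
}
```

A scheduled job could call check_links with the links it depends on and raise an alert if any line starts with BROKEN.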

Example 8: Use wget --tries to increase the number of retries

Command:

wget --tries=40 URL

Note:

A download may fail if the network is unreliable or the file is large. By default, wget retries 20 times. If needed, use --tries to increase the number of retries.

Example 9: Use wget -i to download multiple files

Command:

wget -i filelist.txt

Note:

First, save the download links to a file, one per line:

cat > filelist.txt

url1

url2

url3

url4

Then pass this file to wget with the -i option.
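
In a script, the list can be written with a here-document instead of being typed interactively. A minimal sketch: the three URLs are placeholders (assumptions), and download_list is a hypothetical helper name.

```shell
# Write the download list without typing it interactively.
# The URLs below are placeholders, not real downloads.
cat > filelist.txt <<'EOF'
http://example.invalid/file1.zip
http://example.invalid/file2.zip
http://example.invalid/file3.zip
EOF

# Hypothetical helper: download every URL listed in the given file.
download_list() {
    wget -i "$1"
}
```

Running download_list filelist.txt then fetches the listed files one after another.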

Example 10: Use wget --mirror to mirror a website

Command:

wget --mirror -p --convert-links -P ./LOCAL URL

Note:

Download the entire website to your local machine.

--mirror: turn on mirroring

-p: download all the files needed to display the HTML pages properly

--convert-links: after the download, convert the links to local links

-P ./LOCAL: save all files and directories under the specified local directory
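
When only one section of a site is wanted, the same command can be combined with two options from the parameter list above: --no-parent keeps wget from climbing above the starting directory, and --wait pauses between retrievals to go easy on the server. A sketch under those assumptions; mirror_section is a hypothetical helper name and the one-second wait is an arbitrary choice.

```shell
# Hypothetical helper: mirror the site section at URL ($1) into the
# local directory DIR ($2).
mirror_section() {
    # --no-parent: stay at or below the starting directory.
    # --wait=1:    pause one second between retrievals.
    wget --mirror -p --convert-links --no-parent --wait=1 -P "$2" "$1"
}
```

For example, mirror_section URL ./LOCAL saves the mirrored section under ./LOCAL.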

Example 11: Use wget --reject to exclude a file format from the download

Command:

wget --reject=gif URL

Note:

If you want to download a website but do not want to download its images, you can use this command.

Example 12: Use wget -o to save the download messages to a log file

Command:

wget -o download.log URL

Note:

Use this when you want the download messages written to a log file instead of displayed on the terminal.

Example 13: Use wget -Q to limit the total size of downloaded files

Command:

wget -Q5m -i filelist.txt

Note:

Use this when you want wget to stop once the downloaded files exceed 5 MB in total. Note: the quota has no effect when downloading a single file; it takes effect only for recursive downloads and for downloads from an input file.

Example 14: Use wget -r -A to download files of a specified format

Command:

wget -r -A.pdf URL

Note:

You can use this function in the following situations:

Download all images of a website

Download all videos from a website

Download all PDF files from a website
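
Because -A takes a comma-separated list, several formats can be accepted in one run. A minimal sketch, assuming a hypothetical helper named download_types that joins its extension arguments with commas:

```shell
# Hypothetical helper: recursively download only files whose names end
# in one of the given extensions.
# $1 = URL, remaining arguments = extensions such as pdf jpg png
download_types() {
    url=$1
    shift
    # Join the extensions with commas, the separator -A expects.
    list=$(IFS=,; printf '%s' "$*")
    wget -r -A "$list" "$url"
}
```

For example, download_types URL jpg png gif runs wget -r -A jpg,png,gif URL to fetch all images of those types.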

Example 15: Use wget to download over FTP

Command:

wget ftp-url

wget --ftp-user=USERNAME --ftp-password=PASSWORD url

Note:

You can use wget to download from FTP links.

Anonymous FTP download with wget:

wget ftp-url

FTP download with user name and password authentication:

wget --ftp-user=USERNAME --ftp-password=PASSWORD url

Remarks: Compilation and Installation

Run the following commands to compile and install wget:

# tar zxvf wget-1.9.1.tar.gz

# cd wget-1.9.1

# ./configure

# make

# make install

Original article: http://www.cnblogs.com/peida/archive/2013/03/18/2965369.html
