Linux commands-Tools for downloading files: wget

Source: Internet
Author: User
Tags: save file

wget is a command-line tool for downloading files on Linux. It is an essential tool for Linux users, who often need to download software or restore a backup from a remote server to a local server. wget supports the HTTP, HTTPS and FTP protocols and can use HTTP proxies. wget is also non-interactive, meaning it can keep running in the background after the user has logged out. You can log in to the system, start a wget download task, and then log out; wget continues in the background until the task is completed. This saves a lot of hassle compared with most browsers, which require the user to stay involved while large amounts of data are downloaded.

wget can follow the links in HTML pages and download them to create a local copy of the remote site, fully recreating the directory structure of the original site. This is often referred to as "recursive download". During a recursive download, wget respects the Robot Exclusion Standard (/robots.txt). While downloading, wget can also convert links so that they point to local files for offline browsing.

wget is very stable and copes well with narrow bandwidth and unstable networks. If a download fails because of a network problem, wget keeps retrying until the entire file has been downloaded. If the server interrupts the download, wget reconnects to the server and continues from where it stopped. This is useful for downloading large files from servers that limit connection time.

1. Command format:

wget [parameters] [URL address]

2. Command function:

Downloads resources from the network. If no directory is specified, the files are saved to the current directory. wget is powerful yet relatively simple to use:

1) It supports resuming interrupted downloads. This was the biggest selling point of NetAnts and FlashGet in their day; wget now offers it too, so users with poor network connections can rest assured.

2) It supports both FTP and HTTP downloads. Although most software can now be downloaded over HTTP, in some cases FTP is still needed.

3) It supports proxy servers. Systems with high security requirements are generally not exposed directly to the Internet, so proxy support is a necessary feature for a download tool.

4) It is easy to configure. Users accustomed to a GUI may not be comfortable with the command line at first, but the command line has real advantages for configuration: at the very least, it takes far fewer mouse clicks, and there is no need to worry about clicking the wrong thing.

5) It is small and completely free. The small size hardly matters now that hard disks are so large, but being completely free does matter: plenty of so-called free software comes with advertisements that nobody likes.

3. Command parameters:

Startup parameters:

-V, --version  show the wget version and exit

-h, --help  print syntax help

-b, --background  go to the background after startup

-e, --execute=COMMAND  execute a command in '.wgetrc' format; for the format, see /etc/wgetrc or ~/.wgetrc

Logging and input file parameters:

-o, --output-file=FILE  write log messages to FILE

-a, --append-output=FILE  append log messages to FILE

-d, --debug  print debug output

-q, --quiet  quiet mode (no output)

-v, --verbose  verbose mode (this is the default)

-nv, --non-verbose  turn off verbose mode without being quiet

-i, --input-file=FILE  download the URLs listed in FILE

-F, --force-html  treat the input file as an HTML file

-B, --base=URL  use URL as the prefix for relative links in the file given by -F -i

--sslcertfile=FILE  optional client certificate

--sslcertkey=KEYFILE  optional key file for the client certificate

--egd-file=FILE  file name of the EGD socket

Download parameters:

--bind-address=ADDRESS  bind to the local ADDRESS (host name or IP; useful when the local machine has several IPs or names)

-t, --tries=NUMBER  set the maximum number of connection attempts (0 means unlimited)

-O, --output-document=FILE  write documents to FILE

-nc, --no-clobber  do not overwrite existing files or use .# suffixes

-c, --continue  resume a partially downloaded file

--progress=TYPE  select the type of progress indicator

-N, --timestamping  do not download files again unless they are newer than the local copies

-S, --server-response  print the server response

--spider  do not download anything

-T, --timeout=SECONDS  set the response timeout in seconds

-w, --wait=SECONDS  wait SECONDS seconds between downloads

--waitretry=SECONDS  wait 1...SECONDS seconds between retries

--random-wait  wait 0...2*WAIT seconds between downloads

-Y, --proxy=on/off  turn the proxy on or off

-Q, --quota=NUMBER  set the download quota to NUMBER

--limit-rate=RATE  limit the download rate to RATE

Directory parameters:

-nd, --no-directories  do not create directories

-x, --force-directories  force the creation of directories

-nH, --no-host-directories  do not create host directories

-P, --directory-prefix=PREFIX  save files to PREFIX/...

--cut-dirs=NUMBER  ignore NUMBER levels of remote directories

HTTP option parameters:

--http-user=USER  set the HTTP user name to USER

--http-passwd=PASS  set the HTTP password to PASS

-C, --cache=on/off  allow or disallow server-side cached data (normally allowed)

-E, --html-extension  save all text/html documents with an .html extension

--ignore-length  ignore the 'Content-Length' header field

--header=STRING  insert STRING among the headers

--proxy-user=USER  set the proxy user name to USER

--proxy-passwd=PASS  set the proxy password to PASS

--referer=URL  include a 'Referer: URL' header in the HTTP request

-s, --save-headers  save the HTTP headers to the file

-U, --user-agent=AGENT  identify as AGENT instead of Wget/VERSION

--no-http-keep-alive  disable HTTP keep-alive (persistent connections)

--cookies=off  do not use cookies

--load-cookies=FILE  load cookies from FILE before the session starts

--save-cookies=FILE  save cookies to FILE after the session ends

FTP option parameters:

-nr, --dont-remove-listing  do not remove '.listing' files

-g, --glob=on/off  turn file name globbing on or off

--passive-ftp  use passive transfer mode (the default)

--active-ftp  use active transfer mode

--retr-symlinks  when recursing, download the files that symbolic links point to (not directories)

Recursive download parameters:

-r, --recursive  recursive download (use with caution!)

-l, --level=NUMBER  maximum recursion depth (inf or 0 for infinite)

--delete-after  delete files locally after downloading them

-k, --convert-links  convert non-relative links to relative links

-K, --backup-converted  before converting file X, back it up as X.orig

-m, --mirror  equivalent to -r -N -l inf -nr

-p, --page-requisites  download all images and other files needed to display the HTML page

Accept/reject parameters for recursive downloads:

-A, --accept=LIST  comma-separated list of accepted extensions

-R, --reject=LIST  comma-separated list of rejected extensions

-D, --domains=LIST  comma-separated list of accepted domains

--exclude-domains=LIST  comma-separated list of rejected domains

--follow-ftp  follow FTP links in HTML documents

--follow-tags=LIST  comma-separated list of HTML tags to follow

-G, --ignore-tags=LIST  comma-separated list of HTML tags to ignore

-H, --span-hosts  go to external hosts when recursing

-L, --relative  follow relative links only

-I, --include-directories=LIST  list of allowed directories

-X, --exclude-directories=LIST  list of excluded directories

-np, --no-parent  do not ascend to the parent directory

wget -S --spider URL  does not download anything; it only displays the process

4. Usage examples:

Example 1: Downloading a single file using wget

Command:

wget http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

This example downloads a file from the network and saves it in the current directory. During the download, wget shows a progress bar that includes the percentage completed, the number of bytes downloaded, the current download speed, and the estimated time remaining.

Example 2: Download with wget -O and save under a different file name

Command:

wget -O wordpress.zip "http://www.minjieren.com/download.aspx?id=1080"

Description:

By default, wget names the saved file after the part of the URL that follows the last "/". For dynamically generated download links, that name is usually wrong.

Wrong: the following example downloads a file and saves it under the name download.aspx?id=1080

wget "http://www.minjieren.com/download.aspx?id=1080"

Even though the downloaded file is a zip archive, it is still named download.aspx?id=1080.

Correct: to solve this problem, use the -O option to specify a file name:

wget -O wordpress.zip "http://www.minjieren.com/download.aspx?id=1080"
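The naming rule described above can be sketched as a small shell function. This is only an approximation for illustration: the function name default_name is made up here, and the sketch assumes that the name is everything after the last "/" (query string included) and that a URL ending in "/" is saved as index.html. It ignores edge cases such as URLs with no path at all.

```shell
# Approximate the file name wget picks when -O is not given.
# Assumption: the name is everything after the last "/" in the URL,
# and index.html when the URL ends in "/".
default_name() {
    name=${1##*/}            # strip everything up to the last "/"
    if [ -z "$name" ]; then  # URL ended in "/": nothing is left
        name=index.html
    fi
    printf '%s\n' "$name"
}

# Usage:
#   default_name "http://www.minjieren.com/download.aspx?id=1080"
```

For the dynamic link in this example, the function prints download.aspx?id=1080, which is why -O is needed to get a sensible name.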

Example 3: Limit the download speed with wget --limit-rate

Command:

wget --limit-rate=300k http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

By default, wget uses all the available bandwidth. When you download a large file and still need bandwidth for other downloads, it is necessary to limit the speed.
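To get a feel for what a rate limit costs, the download time can be estimated with a little shell arithmetic. This is a rough sketch: the function name estimate_seconds is made up for illustration, the file size in the usage line is hypothetical, and it assumes that the "k" suffix means 1024 bytes per second.

```shell
# Rough estimate of how long a rate-limited download takes.
# Assumption: the "k" suffix of --limit-rate means 1024 bytes per second.
estimate_seconds() {
    size_bytes=$1   # file size in bytes
    rate_k=$2       # the number given to --limit-rate, e.g. 300 for 300k
    bytes_per_second=$((rate_k * 1024))
    # Integer division rounded up, so a partial second counts as a full one.
    echo $(( (size_bytes + bytes_per_second - 1) / bytes_per_second ))
}

# Usage (hypothetical 15 MiB file at 300k):
#   estimate_seconds 15728640 300
```

Under these assumptions, a 15 MiB file limited to 300k takes about 52 seconds.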

Example 4: Resume an interrupted download with wget -c

Command:

wget -c http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

Restarting an interrupted download with wget -c is very helpful when a large file download is cut off, for example by a network failure: the download continues from where it stopped instead of starting over. Use the -c option whenever you need to continue an interrupted download.
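wget already retries on its own, but if the wget process itself exits (for example after running out of tries), a small wrapper can start it again with -c so that no data is downloaded twice. The following is a minimal sketch, not a recommended replacement for the built-in retry logic. The function name fetch_with_resume, the WGET variable, and the default values are all assumptions made for this illustration.

```shell
WGET=${WGET:-wget}   # command to run; can be overridden for testing

# Run "wget -c URL" up to MAX times, waiting DELAY seconds between runs.
# The body runs in a subshell, so its variables do not leak to the caller.
fetch_with_resume() (
    url=$1
    max=${2:-5}      # maximum number of runs (assumed default)
    delay=${3:-10}   # seconds to wait between runs (assumed default)
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        # -c continues a partial file instead of starting over.
        if "$WGET" -c "$url"; then
            echo "done after $attempt attempt(s)"
            exit 0
        fi
        attempt=$((attempt + 1))
        [ "$attempt" -le "$max" ] && sleep "$delay"
    done
    echo "gave up after $max attempt(s)" >&2
    exit 1
)

# Usage:
#   fetch_with_resume http://www.minjieren.com/wordpress-3.1-zh_CN.zip 3 30
```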

Example 5: Download in the background with wget -b

Command:

wget -b http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

For very large files, use the -b option to download in the background.

wget -b http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Continuing in background, pid 1840.

Output will be written to 'wget-log'.

You can use the following command to view the download progress:

tail -f wget-log

Example 6: Download with a disguised user agent

Command:

wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

Some websites reject a download request when the user agent does not look like a browser. You can disguise wget with the --user-agent option.

Example 7: Test a download link with wget --spider

Command:

wget --spider URL

Description:

When you plan a scheduled download, you should test beforehand whether the download link is valid. Add the --spider option to perform this check.

If the download link is correct, wget shows:

wget --spider URL

Spider mode enabled. Check if remote file exists.

HTTP request sent, awaiting response... 200 OK

Length: unspecified [text/html]

Remote file exists and could contain further links,

but recursion is disabled -- not retrieving.

This ensures that the download can take place at the scheduled time. If you give a wrong link, the following error is displayed:

wget --spider URL

Spider mode enabled. Check if remote file exists.

HTTP request sent, awaiting response... 404 Not Found

Remote file does not exist -- broken link!!!

You can use the --spider option in the following situations:

Check a link before a scheduled download

Check at intervals whether a site is available

Check for dead links on site pages
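For scripted checks such as these, it is usually easier to rely on the exit status of wget than to read its messages. The sketch below assumes that wget --spider exits with status 0 when the remote file exists and with a non-zero status otherwise. The function name check_link and the WGET variable are made up for this illustration.

```shell
WGET=${WGET:-wget}   # command to run; can be overridden for testing

# Print "OK <url>" or "BROKEN <url>" based on the exit status of --spider.
# -q suppresses wget's own messages.
check_link() {
    if "$WGET" --spider -q "$1"; then
        printf 'OK %s\n' "$1"
    else
        printf 'BROKEN %s\n' "$1"
        return 1
    fi
}

# Usage: check every URL listed in a file, one URL per line.
#   while read -r url; do check_link "$url"; done < filelist.txt
```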

Example 8: Increase the number of retries with wget --tries

Command:

wget --tries=NUMBER URL

Description:

A download may fail if the network has problems or the file is very large. By default, wget retries the connection 20 times. If necessary, use --tries to increase the number of retries.

Example 9: Download multiple files with wget -i

Command:

wget -i filelist.txt

Description:

First, save the download links in a file, one per line:

cat > filelist.txt

url1

url2

url3

url4

Then download using this file with the -i option.
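Typing the list into cat requires ending the input by hand with Ctrl-D. In a script, the same file can be written non-interactively. A minimal sketch follows; the function name write_filelist is made up for illustration, and url1 to url4 are the same placeholders as above.

```shell
# Write each argument after the first as one line of a URL list for "wget -i".
write_filelist() {
    list_file=$1
    shift
    printf '%s\n' "$@" > "$list_file"
}

# Usage (placeholder URLs):
#   write_filelist filelist.txt url1 url2 url3 url4
#   wget -i filelist.txt
```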

Example 10: Mirror a website with wget --mirror

Command:

wget --mirror -p --convert-links -P ./local URL

Description:

Downloads the entire site to the local machine.

--mirror: turn on mirroring

-p: download all the files needed to display the HTML pages properly

--convert-links: after the download, convert the links to local links

-P ./local: save all files and directories to the specified local directory

Example 11: Exclude a file format from the download with wget --reject

Command:

wget --reject=gif URL

Description:

If you want to download a website but do not want to download its GIF images, you can use this command.

Example 12: Save the download information to a log file with wget -o

Command:

wget -o download.log URL

Description:

Use this when you do not want the download information displayed in the terminal but written to a log file instead.

Example 13: Limit the total download size with wget -Q

Command:

wget -Q5m -i filelist.txt

Description:

Use this when you want wget to stop downloading once the total size of the downloaded files exceeds 5 MB. Note: this option has no effect on a single-file download; it applies only to recursive downloads and to downloads from an input file.

Example 14: Download files of a specified format with wget -r -A

Command:

wget -r -A.pdf URL

Description:

You can use this feature in the following situations:

Download all the pictures of a website

Download all the videos of a website

Download all the PDF files of a website
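The effect of an accept list can be pictured as a suffix test on each file name. The sketch below is a simplified model, not wget's actual implementation: it assumes that the list contains plain suffixes without wildcard characters. The function name matches_accept_list and the file names in the usage lines are made up for illustration.

```shell
# Succeed if FILE_NAME ends with one of the suffixes in the comma-separated LIST.
# Simplified model of -A: plain suffixes only, no wildcard patterns.
# The body runs in a subshell, so the IFS change does not leak to the caller.
matches_accept_list() (
    list=$1
    file_name=$2
    IFS=,
    for suffix in $list; do
        case $file_name in
            *"$suffix") exit 0 ;;   # name ends with this suffix
        esac
    done
    exit 1
)

# Usage (hypothetical file names):
#   matches_accept_list '.pdf,.ps' manual.pdf && echo "would be kept"
#   matches_accept_list '.pdf,.ps' logo.gif   || echo "would be skipped"
```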

Example 15: Download over FTP with wget

Command:

wget ftp-url

wget --ftp-user=USERNAME --ftp-password=PASSWORD url

Description:

You can use wget to download from FTP links.

Anonymous FTP download with wget:

wget ftp-url

FTP download with user name and password authentication:

wget --ftp-user=USERNAME --ftp-password=PASSWORD url

Remarks: compiling and installing

Compile and install wget with the following commands:

# tar zxvf wget-1.9.1.tar.gz

# cd wget-1.9.1

# ./configure

# make

# make install
