One Linux command per day: the wget command in detail

Source: Internet
Author: User

wget is a command-line tool for downloading files on Linux. It is an essential tool for Linux users, who often need to download software or restore backups from a remote server to a local one. wget supports the HTTP, HTTPS, and FTP protocols and can use HTTP proxies. So-called automatic downloading means that wget can keep running in the background after the user logs out: you can log in, start a wget download task, and log out again, and wget will keep working in the background until the task completes. Unlike most web browsers, which need the user to stay involved throughout, this saves a lot of trouble when transferring large amounts of data.

wget can follow the links in HTML pages and download them in turn to create a local copy of the remote server, fully recreating the original site's directory structure. This is often called "recursive downloading." During a recursive download, wget obeys the Robots Exclusion Standard (/robots.txt). While downloading, wget can also convert links so that they point to the local files, which makes offline browsing easier.

wget is very stable and copes well with narrow bandwidth and unstable networks. If a download fails because of a network problem, wget keeps retrying until the whole file has been retrieved. If the server interrupts the download, wget reconnects and continues from where it stopped. This is useful for downloading large files from servers that limit connection time.

1. Command format:

wget [parameters] [URL address]

2. Command function:

Downloads resources from the network. If no directory is specified, downloaded files are saved to the current directory. wget is very powerful yet simple to use:

1) Resumable downloads. This was once the biggest selling point of NetAnts and FlashGet; wget now offers the same feature, so users on poor connections can rest easy.

2) Support for both FTP and HTTP downloads. Although most software can now be downloaded over HTTP, there are still times when you need FTP.

3) Proxy server support. Systems with strict security requirements are usually not exposed directly to the Internet, so proxy support is a must-have feature for a download tool.

4) Simple, convenient configuration. Users accustomed to graphical interfaces may not be comfortable with the command line, but the command line actually has advantages for configuration: at the very least you click the mouse far less often and need not worry about misclicks.

5) Small and completely free. The small size hardly matters now that hard disks are so large, but being completely free does: plenty of so-called free software on the Internet comes bundled with ads that nobody likes.

3. Command parameters:

Startup parameters:

    • -V, --version: display the wget version and exit
    • -h, --help: print syntax help
    • -b, --background: go to the background after startup
    • -e, --execute=COMMAND: execute a command in .wgetrc format; for the wgetrc format see /etc/wgetrc or ~/.wgetrc

Logging and input file parameters:

    • -o, --output-file=FILE: write log messages to FILE
    • -a, --append-output=FILE: append log messages to FILE
    • -d, --debug: print debug output
    • -q, --quiet: quiet mode (no output)
    • -v, --verbose: verbose mode (this is the default)
    • -nv, --non-verbose: turn off verbose mode without switching to quiet mode
    • -i, --input-file=FILE: download the URLs listed in FILE
    • -F, --force-html: treat the input file as HTML
    • -B, --base=URL: use URL as the prefix for relative links in the file given with -F -i
    • --sslcertfile=FILE: optional client certificate
    • --sslcertkey=KEYFILE: optional key file for the client certificate
    • --egd-file=FILE: file name of the EGD socket

Download parameters:

    • --bind-address=ADDRESS: bind to a local address (host name or IP; useful when the machine has several IPs or names)
    • -t, --tries=NUMBER: set the maximum number of connection attempts (0 means unlimited)
    • -O, --output-document=FILE: write documents to FILE
    • -nc, --no-clobber: do not overwrite existing files or use .# suffixes
    • -c, --continue: resume a partially downloaded file
    • --progress=TYPE: select the progress indicator type
    • -N, --timestamping: do not re-download a file unless it is newer than the local copy
    • -S, --server-response: print the server response
    • --spider: do not download anything
    • -T, --timeout=SECONDS: set the response timeout in seconds
    • -w, --wait=SECONDS: wait SECONDS seconds between retrievals
    • --waitretry=SECONDS: wait 1...SECONDS seconds between retries of a retrieval
    • --random-wait: wait 0...2*WAIT seconds between retrievals
    • -Y, --proxy=on/off: turn the proxy on or off
    • -Q, --quota=NUMBER: set a size quota for downloads
    • --limit-rate=RATE: limit the download rate

Directory parameters:

    • -nd, --no-directories: do not create directories
    • -x, --force-directories: force the creation of directories
    • -nH, --no-host-directories: do not create host directories
    • -P, --directory-prefix=PREFIX: save files to PREFIX/...
    • --cut-dirs=NUMBER: ignore NUMBER levels of remote directories

HTTP option Parameters:

    • --http-user=USER: set the HTTP user name to USER
    • --http-passwd=PASS: set the HTTP password to PASS
    • -C, --cache=on/off: allow or disallow server-side cached data (normally allowed)
    • -E, --html-extension: save all text/html documents with an .html extension
    • --ignore-length: ignore the 'Content-Length' header field
    • --header=STRING: insert STRING among the request headers
    • --proxy-user=USER: set the proxy user name to USER
    • --proxy-passwd=PASS: set the proxy password to PASS
    • --referer=URL: include a 'Referer: URL' header in the HTTP request
    • -s, --save-headers: save the HTTP headers to the file
    • -U, --user-agent=AGENT: identify as AGENT instead of Wget/VERSION
    • --no-http-keep-alive: disable HTTP keep-alive (persistent connections)
    • --cookies=off: do not use cookies
    • --load-cookies=FILE: load cookies from FILE before the session starts
    • --save-cookies=FILE: save cookies to FILE after the session ends

FTP option Parameters:

    • -nr, --dont-remove-listing: do not remove '.listing' files
    • -g, --glob=on/off: turn file name globbing on or off
    • --passive-ftp: use passive transfer mode (the default)
    • --active-ftp: use active transfer mode
    • --retr-symlinks: when recursing, retrieve the files that symbolic links point to (not directories)

Recursive download parameters:

    • -r, --recursive: recursive download; use with caution!
    • -l, --level=NUMBER: maximum recursion depth (inf or 0 means unlimited)
    • --delete-after: delete each file locally after it has been downloaded
    • -k, --convert-links: convert non-relative links to relative links
    • -K, --backup-converted: back up file X as X.orig before converting it
    • -m, --mirror: equivalent to -r -N -l inf -nr
    • -p, --page-requisites: download all images and other files needed to display the HTML page

Accept/reject parameters for recursive downloads:

    • -A, --accept=LIST: comma-separated list of accepted extensions
    • -R, --reject=LIST: comma-separated list of rejected extensions
    • -D, --domains=LIST: comma-separated list of accepted domains
    • --exclude-domains=LIST: comma-separated list of rejected domains
    • --follow-ftp: follow FTP links in HTML documents
    • --follow-tags=LIST: comma-separated list of HTML tags to follow
    • -G, --ignore-tags=LIST: comma-separated list of HTML tags to ignore
    • -H, --span-hosts: go to other hosts when recursing
    • -L, --relative: follow relative links only
    • -I, --include-directories=LIST: list of allowed directories
    • -X, --exclude-directories=LIST: list of excluded directories
    • -np, --no-parent: do not ascend to the parent directory
    • wget -S --spider URL: do not download, only show the process

4. Usage examples:

Example 1: Download a single file using wget

Command:

wget http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

This example downloads a file from the network and saves it in the current directory. A progress bar is displayed during the download, showing the percentage complete, the bytes downloaded so far, the current download speed, and the estimated time remaining.

Example 2: Use wget -O to download and save under a different file name

Command:

wget -O wordpress.zip "http://www.minjieren.com/download.aspx?id=1080"

Description:

By default, wget names the saved file after whatever follows the last "/" in the URL, so downloads from dynamic links usually end up with the wrong file name.

Wrong: the following command downloads a file and saves it under the name download.aspx?id=1080:

wget "http://www.minjieren.com/download.aspx?id=1080"

Even though the downloaded file is a zip archive, it is still saved as download.aspx?id=1080.

Correct: to solve this problem, use the -O parameter to specify a file name:

wget -O wordpress.zip "http://www.minjieren.com/download.aspx?id=1080"
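
The naming rule described above can be approximated in the shell itself. This is only a sketch of the "text after the last slash" rule, not wget's actual implementation, and the variable names are made up for illustration:

```shell
# URL taken from the example above
url='http://www.minjieren.com/download.aspx?id=1080'

# Strip everything up to and including the last "/" (POSIX parameter expansion)
default_name=${url##*/}

echo "$default_name"
```

The printed name still contains the query string, which is why an explicit -O name is the safer choice for dynamic links.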

Example 3: Use wget --limit-rate to throttle the download speed

Command:

wget --limit-rate=300k http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

By default, wget uses all the bandwidth it can get. When you are about to download a large file but also need bandwidth for other downloads, it is worth limiting the rate.

Example 4: Use wget -c to resume an interrupted download

Command:

wget -c http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

Restarting an interrupted download with wget -c is very helpful when a large download is suddenly cut off, for example by a network outage: the download continues from where it stopped instead of starting over. Use the -c parameter whenever you need to resume an interrupted download.

Example 5: Use wget -b to download in the background

Command:

wget -b http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

When downloading very large files, use the -b parameter to download in the background.

wget -b http://www.minjieren.com/wordpress-3.1-zh_CN.zip
Continuing in background, pid 1840.
Output will be written to 'wget-log'.

You can use the following command to view the download progress:

tail -f wget-log

Example 6: Download with a disguised user agent

Command:

wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" http://www.minjieren.com/wordpress-3.1-zh_CN.zip

Description:

Some sites reject a download request when the user agent name does not look like a browser. You can disguise it with the --user-agent parameter.

Example 7: Use wget --spider to test a download link

Command:

wget --spider URL

Description:

When you plan a scheduled download, you should verify beforehand that the download link will be valid at the scheduled time. Add the --spider parameter to check it.

wget --spider URL

If the download link is correct, wget displays:

wget --spider URL
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.

This confirms that the download can proceed at the scheduled time. If you give a wrong link, the following error is displayed:

wget --spider URL
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!

You can use the --spider parameter in the following situations:

    • Checking a link before a scheduled download
    • Checking at intervals whether a website is available
    • Checking a site's pages for dead links
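
For scripted checks such as the ones listed above, the exit status of wget is easier to test than its printed messages. The following is a minimal sketch; the function name check_link is made up for illustration, and it assumes a wget that returns a non-zero exit status when the link is broken:

```shell
# check_link URL: report whether the URL is reachable, without downloading it
check_link() {
    # -q suppresses wget's own output; only the exit status is examined
    if wget -q --spider "$1"; then
        echo "OK: $1"
    else
        echo "BROKEN: $1"
        return 1
    fi
}
```

A scheduled job could call check_link with the download URL first and start the real download only when the function succeeds.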

Example 8: Increase the number of retries with wget --tries

Command:

wget --tries=40 URL

Description:

A download may fail if the network is unreliable or the file is large. By default, wget retries the connection 20 times. If necessary, use --tries to increase the number of retries.
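
Retries are often combined with the resume and wait options from the parameter list. The sketch below wraps that combination in a small function; the function name and the 10-second wait are illustrative choices, not recommendations from the original article:

```shell
# fetch_with_retries URL [TRIES]: retry a flaky download and resume partial files
fetch_with_retries() {
    # $1 = URL, $2 = number of tries (defaults to 40, as in the command above)
    # --waitretry=10 backs off up to 10 seconds between retries
    # --continue is the long form of -c
    wget --tries="${2:-40}" --waitretry=10 --continue "$1"
}
```

For example, fetch_with_retries URL 0 would retry without limit, since 0 means unlimited tries.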

Example 9: Download multiple files with wget -i

Command:

wget -i filelist.txt

Description:

First, save the download links to a file, one per line:

cat > filelist.txt
url1
url2
url3
url4

Then pass this file to wget with the -i parameter.
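
The two steps can also be scripted. In this sketch the helper names make_filelist and download_list are hypothetical; the first writes one URL per line and the second hands the file to wget:

```shell
# make_filelist FILE URL...: write the given URLs to FILE, one per line
make_filelist() {
    file=$1
    shift
    printf '%s\n' "$@" > "$file"
}

# download_list FILE: download every URL listed in FILE
download_list() {
    wget -i "$1"
}
```

A call such as make_filelist filelist.txt url1 url2 url3 url4 followed by download_list filelist.txt reproduces the manual steps.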

Example 10: Mirror a website with wget --mirror

Command:

wget --mirror -p --convert-links -P ./local URL

Description:

Downloads an entire website to the local machine.

    • --mirror: turn on mirroring
    • -p: download all the files needed to display the HTML pages properly
    • --convert-links: after downloading, convert the links to point to the local files
    • -P ./local: save all files and directories under the specified local directory
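
The same mirror command can be spelled with the long option names from the parameter list, which some readers find easier to follow in scripts. This is a sketch only; the function name mirror_site is made up, and ./local is kept as the default target directory:

```shell
# mirror_site URL [DIR]: mirror a site into DIR for offline browsing
mirror_site() {
    # $1 = URL to mirror, $2 = local directory (defaults to ./local)
    # --page-requisites is the long form of -p, --directory-prefix of -P
    wget --mirror --page-requisites --convert-links \
         --directory-prefix="${2:-./local}" "$1"
}
```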

Example 11: Use wget --reject to filter out a file format

Command:

wget --reject=gif URL

Description:

If you want to download a website but skip its images, you can use this command.

Example 12: Use wget -o to write download information to a log file

Command:

wget -o download.log URL

Description:

Use this when you do not want the download information shown on the terminal but written to a log file instead.

Example 13: Use wget -Q to limit the total download size

Command:

wget -Q5m -i filelist.txt

Description:

Use this when you want wget to stop once the downloaded files exceed 5 MB in total. Note: this parameter has no effect on a single file download; it only takes effect for recursive downloads or downloads from an input file.

Example 14: Download files of a specified format with wget -r -A

Command:

wget -r -A.pdf URL

Description:

You can use this feature in the following situations:

    • Downloading all the images from a website
    • Downloading all the videos from a website
    • Downloading all the PDF files from a website
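
Because -A takes a comma-separated list, one recursive run can cover several formats at once. A minimal sketch, with download_by_ext as a hypothetical helper name:

```shell
# download_by_ext EXTENSIONS URL: recursively fetch only the listed file types
download_by_ext() {
    # $1 = comma-separated extensions (for example 'jpg,jpeg,png,gif'), $2 = start URL
    wget -r -A "$1" "$2"
}
```

Calling download_by_ext 'jpg,jpeg,png,gif' URL would correspond to the "all images" case, and download_by_ext pdf URL to the PDF case.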

Example 15: FTP downloads with wget

Command:

wget ftp-url
wget --ftp-user=USERNAME --ftp-password=PASSWORD URL

Description:

You can use wget to download from FTP links.

Anonymous FTP download with wget:

wget ftp-url

FTP download with user name and password authentication:

wget --ftp-user=USERNAME --ftp-password=PASSWORD URL

Remarks: Compiling and installing

Compile and install with the following commands:

# tar zxvf wget-1.9.1.tar.gz
# cd wget-1.9.1
# ./configure
# make
# make install
