wget: download an entire site or a specific directory


To download all the files under a directory, use the following command:

wget -c -r -np -k -L -p www.xxx.org/pub/path/

Sometimes the images or links a page needs live on external domains. If you want those downloaded at the same time, add the -H parameter:

wget -np -nH -r --span-hosts www.xxx.org/pub/path/

-c   resume a partially downloaded file (continue from a break point)
-r   download recursively: fetch all files under the specified page's directory, including subdirectories
-nd  do not recreate the directory hierarchy when downloading recursively; save every file to the current directory
-np  when downloading recursively, do not ascend to the parent directory; for example, with
     wget -c -r www.xxx.org/pub/path/ and no -np, files in the directories above path/ are downloaded as well
-k   convert absolute links to relative links; when downloading a whole site for offline browsing, it is best to use this option
-L   follow relative links only, so the recursion does not wander onto other hosts; for example, wget -c -r www.xxx.org/
     If the site contains a link such as www.yyy.org and -L is omitted, the recursion spreads like wildfire and the www.yyy.org site is downloaded as well
-p   download all the files (images and so on) needed to display the web page
-A   specify a list of file suffixes to download, with multiple suffixes separated by commas
-i   read the URLs to download from the file that follows this option (see the example after this list)
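
A small sketch combining several of the options above; the jpg,pdf suffix list and the file name urls.txt are only illustrative assumptions:

# accept only .jpg and .pdf files below path/, resuming and fixing links for offline use
wget -c -r -np -k -p -A jpg,pdf www.xxx.org/pub/path/
# resume-download every URL listed in urls.txt
wget -c -i urls.txt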

There are other uses as well; I have gathered them from the Internet and written them up here so they are easy to refer to later.

Common uses of wget

The general form of a wget invocation:
Usage: wget [OPTION]... [URL]...

* Mirror a site with wget:
wget -r -p -np -k http://dsec.pku.edu.cn/~usr_name/
# or
wget -m http://www.tldp.org/LDP/abs/html/
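
As the recursive-download option list further down notes, -m (--mirror) is equivalent to -r -N -l inf -nr. A minimal sketch of a mirror that is also convenient to browse offline, adding -k and -p on top of -m (same example URL as above):

wget -m -k -p http://www.tldp.org/LDP/abs/html/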

* Resume a partially downloaded file on an unstable network, or download during idle hours
wget -t 0 -w 31 -c http://dsec.pku.edu.cn/BBC.avi -o down.log &
# or read the list of URLs to download from filelist.txt
wget -t 0 -w 31 -c -B ftp://dsec.pku.edu.cn/linuxsoft -i filelist.txt -o down.log &

The commands above are also handy for downloading while the network is relatively idle. My own workflow: URLs that are inconvenient to download right away in Mozilla get copied and pasted into the file filelist.txt, and the second command above is run in the evening before I leave the system.
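
A small sketch of automating that evening run, assuming the standard at command is available; the 01:00 start time is an arbitrary example:

# collect URLs during the day
echo "http://dsec.pku.edu.cn/BBC.avi" >> filelist.txt
# queue the batch download for an idle hour
echo "wget -t 0 -w 31 -c -i filelist.txt -o down.log" | at 01:00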

* Download through a proxy
wget -Y on -p -k https://sourceforge.net/projects/wvware/

The proxy can be set in an environment variable or in the wgetrc file:

# set the proxy in an environment variable
export http_proxy=http://211.90.168.94:8080/
# set the proxy in ~/.wgetrc
http_proxy = http://proxy.yoyodyne.com:18023/
ftp_proxy = http://proxy.yoyodyne.com:18023/
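
If you would rather not touch the environment or ~/.wgetrc at all, the -e option listed below accepts the same wgetrc-style settings on the command line; a sketch reusing the example proxy address:

wget -e "use_proxy=on" -e "http_proxy=http://proxy.yoyodyne.com:18023/" -p -k https://sourceforge.net/projects/wvware/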

Categorized list of wget options

* Start

-V, --version   show the version of wget and exit
-h, --help   print this help
-b, --background   go to the background after startup
-e, --execute=COMMAND   execute a command in '.wgetrc' format; for the wgetrc format see /etc/wgetrc or ~/.wgetrc

* Logging and input files

-o, --output-file=FILE   write log messages to FILE
-a, --append-output=FILE   append log messages to FILE
-d, --debug   print debug output
-q, --quiet   quiet mode (no output)
-v, --verbose   verbose mode (this is the default)
-nv, --non-verbose   turn off verbose mode, but not quiet mode
-i, --input-file=FILE   download the URLs listed in FILE (see the example after this list)
-F, --force-html   treat the input file as HTML
-B, --base=URL   use URL as the prefix for relative links in the file given with -F -i
--sslcertfile=FILE   optional client certificate
--sslcertkey=KEYFILE   optional key file for the client certificate
--egd-file=FILE   file name of the EGD socket
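
A quick sketch tying these options to -b from the Start section; the names down.log and filelist.txt are only illustrative:

wget -b -o down.log -i filelist.txt

wget goes to the background immediately, writes its log to down.log, and works through the URLs listed in filelist.txt.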

* Download

--bind-address=ADDRESS   bind to ADDRESS (host name or IP) on the local machine (useful when the local machine has several IPs or names)
-t, --tries=NUMBER   set the maximum number of retries (0 means unlimited)
-O, --output-document=FILE   write the documents to FILE
-nc, --no-clobber   do not overwrite existing files or use .# prefixes
-c, --continue   resume getting partially downloaded files
--progress=TYPE   select the progress bar style
-N, --timestamping   do not re-download files unless they are newer than the local copy
-S, --server-response   print the server responses
--spider   do not download anything
-T, --timeout=SECONDS   set the response timeout in seconds
-w, --wait=SECONDS   wait SECONDS between retrievals
--waitretry=SECONDS   wait 1...SECONDS between retries of a retrieval
--random-wait   wait 0...2*WAIT seconds between retrievals
-Y, --proxy=on/off   turn the proxy on or off
-Q, --quota=NUMBER   set the download quota (see the example after this list)
--limit-rate=RATE   limit the download rate
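
A polite-downloader sketch built from these options; the retry count, wait time, rate limit and quota values are arbitrary examples, and filelist.txt is the hypothetical URL list from earlier:

wget -t 3 -w 5 --random-wait --limit-rate=200k -Q 500m -c -i filelist.txt -o down.log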

* Directories

-nd, --no-directories   do not create directories
-x, --force-directories   force the creation of directories
-nH, --no-host-directories   do not create host directories
-P, --directory-prefix=PREFIX   save files to PREFIX/... (see the example after this list)
--cut-dirs=NUMBER   ignore NUMBER components of the remote directory path
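
A sketch of how these combine when mirroring a subdirectory; the prefix mirror/ is an assumption, and the URL reuses the example from the start of the article:

wget -r -np -nH --cut-dirs=2 -P mirror/ www.xxx.org/pub/path/

-nH drops the www.xxx.org host directory, --cut-dirs=2 drops pub/path/, and the files land directly under mirror/.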

* HTTP Options

--http-user=USER   set the HTTP user name to USER
--http-passwd=PASS   set the HTTP password to PASS
-C, --cache=on/off   allow/disallow server-side data caching (normally allowed)
-E, --html-extension   save all text/html documents with an .html extension
--ignore-length   ignore the 'Content-Length' header field
--header=STRING   insert STRING into the request headers
--proxy-user=USER   set the proxy user name to USER
--proxy-passwd=PASS   set the proxy password to PASS
--referer=URL   include a 'Referer: URL' header in the HTTP request
-s, --save-headers   save the HTTP headers to the file
-U, --user-agent=AGENT   identify as AGENT instead of Wget/VERSION (see the example after this list)
--no-http-keep-alive   disable HTTP keep-alive (persistent connections)
--cookies=off   do not use cookies
--load-cookies=FILE   load cookies from FILE before the session starts
--save-cookies=FILE   save cookies to FILE after the session ends
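
A sketch of a request that presents itself as a regular browser and carries a previously saved login; the user-agent string and the file cookies.txt are illustrative assumptions:

wget -U "Mozilla/5.0" --header="Accept-Language: en" --load-cookies=cookies.txt -p -k http://www.xxx.org/pub/path/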

* FTP Options

-nr, --dont-remove-listing   do not remove the '.listing' files
-g, --glob=on/off   turn file name globbing on or off (see the example after this list)
--passive-ftp   use passive transfer mode (the default)
--active-ftp   use active transfer mode
--retr-symlinks   when recursing, retrieve the files that symbolic links point to (not directories)
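
A sketch of FTP globbing; the linuxsoft path comes from the earlier example, the *.tar.gz pattern is made up, and the URL is quoted so the local shell does not try to expand the wildcard itself:

wget --passive-ftp 'ftp://dsec.pku.edu.cn/linuxsoft/*.tar.gz'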

* Recursive download

-r, --recursive   recursive download; use with caution!
-l, --level=NUMBER   maximum recursion depth (inf or 0 for unlimited)
--delete-after   delete the files locally after they have been downloaded
-k, --convert-links   convert non-relative links to relative links
-K, --backup-converted   back up file X as X.orig before converting it (see the example after this list)
-m, --mirror   equivalent to -r -N -l inf -nr
-p, --page-requisites   download all the images needed to display the HTML page
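
A sketch of a repeatable offline copy that keeps backups of the pages whose links are rewritten; the URL reuses the earlier mirroring example:

wget -m -k -K -p http://dsec.pku.edu.cn/~usr_name/

Keeping the .orig copies lets later timestamped (-N) mirror runs compare against the unmodified files.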

* Accept/reject rules for recursive download

-A, --accept=LIST   comma-separated list of accepted file extensions
-R, --reject=LIST   comma-separated list of rejected file extensions
-D, --domains=LIST   comma-separated list of accepted domains (see the example after this list)
--exclude-domains=LIST   comma-separated list of rejected domains
--follow-ftp   follow FTP links found in HTML documents
--follow-tags=LIST   comma-separated list of HTML tags to follow
-G, --ignore-tags=LIST   comma-separated list of HTML tags to ignore
-H, --span-hosts   go to foreign hosts when recursing
-L, --relative   follow relative links only
-I, --include-directories=LIST   list of allowed directories
-X, --exclude-directories=LIST   list of excluded directories
-np, --no-parent   do not ascend to the parent directory
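
A sketch of restricting a host-spanning crawl with these filters; the second domain, the jpg,png,pdf list and the excluded /pub/private directory are made-up examples:

wget -r -np -H -D www.xxx.org,www.yyy.org -A jpg,png,pdf -X /pub/private www.xxx.org/pub/path/

-H allows the recursion to leave the starting host, -D pins it to the two listed domains, -A keeps only the listed extensions, and -X skips the excluded directory.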

