Download an entire website with Python

Source: Internet
Author: User


This is a tool, written in Python, for downloading an entire website.

The core process is simple:

1. Enter the website address.

2. Open the URL and get the response content.

3. Check the HTTP header of the response: if the type is HTML, continue with step 4. If it is another type, save the content and go to step 6.

4. Extract the href and src attribute values from the HTML.

5. Add the extracted URLs to the download queue. If a URL already exists in the download queue, discard it.

6. Open the next URL in the URL queue.

7. Repeat from step 2 until every URL in the URL queue has been processed.
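
The steps above can be sketched roughly as follows. This is not the author's source code, only a minimal illustration using the standard library. The names `LinkExtractor`, `fetch`, `crawl`, and `save` are made up for this example, and two details are assumptions: the crawl is restricted to the starting host, and duplicates are tracked in a `seen` set rather than by scanning the queue.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collect href and src attribute values from an HTML page (step 4)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def fetch(url):
    """Steps 2 and 3: return (content_type, body_bytes) for a URL."""
    with urlopen(url, timeout=10) as response:
        return response.headers.get_content_type(), response.read()


def crawl(start_url, fetch=fetch, save=lambda url, body: None):
    """Visit every same-host URL reachable from start_url; return the set seen."""
    host = urlparse(start_url).netloc      # assumption: stay on the starting host
    queue = deque([start_url])             # step 1
    seen = {start_url}
    while queue:                           # step 7: loop until the queue is empty
        url = queue.popleft()              # step 6
        content_type, body = fetch(url)
        save(url, body)                    # hypothetical hook that writes to disk
        if content_type != "text/html":    # step 3: only HTML is parsed for links
            continue
        parser = LinkExtractor()
        parser.feed(body.decode("utf-8", errors="replace"))
        for link in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute, _fragment = urldefrag(urljoin(url, link))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)         # step 5: discard URLs already queued
                queue.append(absolute)
    return seen
```

Passing `fetch` and `save` as parameters keeps the loop independent of the network, so it can be tried against a dictionary of fake pages. The sketch does no error handling: a single failed request would stop the whole crawl.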

This process looks simple, but handling the many details takes a long time.

For example, URLs come in many forms: how should a file be named when its URL contains a question mark?
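
One possible answer to the naming question, offered as a sketch rather than as what the program actually does: keep the URL path as the file path and replace the query string with a short hash, so that `view.php?id=1` and `view.php?id=2` land in different files. The function name `url_to_local_path` and the `index.html` default are assumptions made for this example.

```python
import hashlib
import posixpath
from urllib.parse import unquote, urlparse


def url_to_local_path(url, index_name="index.html"):
    """Map a URL to a relative file path that contains no '?' character."""
    parts = urlparse(url)
    path = unquote(parts.path) or "/"
    if path.endswith("/"):
        path += index_name                 # directory-style URLs get a file name
    # Normalise the path and drop ".." so the result stays inside the mirror.
    segments = [s for s in posixpath.normpath(path).split("/") if s and s != ".."]
    if not segments:
        segments = [index_name]
    if parts.query:
        # A short digest of the query keeps different queries in different files.
        digest = hashlib.sha256(parts.query.encode("utf-8")).hexdigest()[:8]
        stem, ext = posixpath.splitext(segments[-1])
        segments[-1] = f"{stem}_{digest}{ext}"
    # ":" in "host:port" is not a valid file name character on Windows.
    return posixpath.join(parts.netloc.replace(":", "_"), *segments)
```

For instance, `url_to_local_path("http://example.test/docs/")` gives `example.test/docs/index.html`. Hashing the query loses readability; a different design could escape the query characters instead, at the cost of very long file names.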

Currently, this program has several problems:

1. When a URL is opened with urllib.request, the call may block, and the program then hangs at that point.

2. When the URL queue is long, multithreaded downloading would be faster.

3. I do not know how many bugs remain. The comments are written in English, because writing them in Chinese means switching the input method back and forth.
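
For the first problem in the list, one common mitigation is to pass a `timeout` to `urllib.request.urlopen` and skip URLs that fail. The sketch below is an illustration, not the program's code. The name `fetch_with_timeout` and the `opener` parameter (there only so the function can be tried without a network) are assumptions.

```python
import urllib.request


def fetch_with_timeout(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return the response body, or None if the request fails or times out."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except OSError as exc:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        print(f"skipped {url}: {exc}")
        return None
```

A malformed URL raises `ValueError`, which this sketch deliberately does not catch.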

Currently, the program does not support multithreading; this will be improved in the future.
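
Multithreaded downloading could be sketched with a thread pool from the standard library. This is only one possible approach, not the author's planned design. `download_all` is a hypothetical name, and `fetch` stands for any function that takes a URL and returns its content.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(urls, fetch, max_workers=8):
    """Fetch every URL concurrently; return {url: body or the exception raised}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the URL it was started for.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # One failed download should not stop the others.
                results[url] = exc
    return results
```

Threads suit this job because the work is dominated by waiting on the network, not by computation.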

If you are interested in improving it, you are very welcome to do so.

Source code download: http://download.csdn.net/detail/jiangxiaoma111/8002631

Personal email: 369806726@qq.com


How does python download multiple files from a website?

Generally, you can use a download tool, for example the DownThemAll extension for Firefox.
 
Python download

www.python.org/ftp/python/
 
