Crawling and downloading all the Wheat Academy video tutorials with Python

Source: Internet
Author: User

First, the main ideas

    1. Use Scrapy to crawl the course addresses and names
    2. Download the videos using multiprocessing
    3. The goal was only to grab some videos, so the code is a simple pile of scripts.
    4. It is shared as is, without further polishing.

Second, description of the files

    1. items.py: the Scrapy item fields
    2. pipelines.py: stores the crawled items in the database
    3. settings.py: the Scrapy configuration. Pay attention to the DEFAULT_REQUEST_HEADERS setting, which is needed to impersonate a logged-in session
    4. mz.py: the main spider, with the basic crawling functions, using CSS and XPath selectors
    5. start_urls = [ "http://www.maiziedu.com/course/web/" ,] only crawls the web courses; change it as needed to crawl other categories, or all of them
    6. I originally wanted to skip the database and download directly in mz.py, but that would hurt Scrapy's performance, so the download runs separately
    7. down.py: downloads with multiprocessing. The original plan was to monitor the Scrapy results in the database dynamically and share them between processes, but after several rounds of debugging there were still problems, so I fell back on the cruder Pool.map() approach
    8. mz.json: the earlier JSON output. Reading and writing the JSON file back and forth hurt efficiency, so the database is used instead
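
The split between storing results in a database and downloading them separately with Pool.map() might look roughly like the sketch below. It is not the original pipelines.py or down.py: the choice of SQLite, the table name course, the columns name and url, the .mp4 extension, and the names CoursePipeline, load_tasks, download_one and download_all are all assumptions made for illustration. The pipeline is a plain class with the open_spider / process_item / close_spider methods that Scrapy calls on item pipelines.

```python
import os
import sqlite3
import urllib.request
from functools import partial
from multiprocessing import Pool


class CoursePipeline:
    """Scrapy-style item pipeline that stores each item in SQLite (assumed backend)."""

    def __init__(self, db_path="mz.db"):  # hypothetical database file
        self.db_path = db_path

    def open_spider(self, spider):
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS course (name TEXT, url TEXT)"
        )

    def process_item(self, item, spider):
        # Assumes every item carries a "name" and a "url" field.
        self.conn.execute(
            "INSERT INTO course VALUES (?, ?)", (item["name"], item["url"])
        )
        self.conn.commit()
        return item

    def close_spider(self, spider):
        self.conn.close()


def load_tasks(db_path):
    """Read the (name, url) pairs that the crawl stored."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT name, url FROM course").fetchall()
    finally:
        conn.close()


def download_one(task, out_dir="videos"):
    """Download one video, skipping files that already exist."""
    name, url = task
    os.makedirs(out_dir, exist_ok=True)
    # Replace path separators so a course name cannot escape out_dir.
    target = os.path.join(out_dir, name.replace("/", "_") + ".mp4")
    if not os.path.exists(target):
        urllib.request.urlretrieve(url, target)
    return target


def download_all(db_path, out_dir="videos", processes=4):
    """Fan the stored tasks out over worker processes with Pool.map()."""
    tasks = load_tasks(db_path)
    worker = partial(download_one, out_dir=out_dir)
    with Pool(processes) as pool:
        return pool.map(worker, tasks)
```

In this sketch the crawl finishes first, and a separate script then calls download_all("mz.db") under an if __name__ == "__main__": guard, which multiprocessing needs on platforms that start worker processes by re-importing the main module. Pool.map() blocks until every download has finished, which is the "rough" but simple behaviour described above.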

Third, results

    1. Source code: https://yunpan.cn/crjn7J97xUD8F access password 6219
    2. Video address: https://yunpan.cn/crjXKLGnkpzPk access password 6C15


