Medical Education Network Web Crawler (Live Blog)

Last Update: 2014-12-19 Source: Internet

Author: User

12-18

Tonight I got a phone call from my elder sister. She has ordered a lot of videos on the "Medical Education Network" and wants me to help her download all of them.
I took a look: there are 24 subjects, with more than 40 sections per subject. Downloading them by hand would be unbearable.
Repetitive work like this is best left to a program! This post is a live blog of writing it.

Crawled URL: http://www.med66.com/

A few days ago I finished a crawler for qihuiwang. Compared with that one, this video-download crawler brings some new challenges:

(1) Handling the login process. Last time the pages could be crawled directly without logging in; this time a login is required, which involves POSTing form data.

(2) Interpreting JavaScript. The download button on the page has an attribute like onclick="godownload('700914', ...)", and this has to be converted into a URL.

(3) Recording which files have already been downloaded, so that the program does not start downloading from the beginning every time it runs.

(4) Organizing the downloaded files into directories by course.
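
Points (2) to (4) can be sketched in Python with the standard library alone. This is only a rough sketch under assumptions, not the finished crawler: the function names, the JSON record file, and the ".mp4" extension are all made up for illustration, and how the extracted ID maps to a real download URL still has to be worked out from the site's JavaScript.

```python
import json
import re
from pathlib import Path

# Matches the first quoted argument of a godownload(...) call, e.g. '700914'.
# Whitespace is tolerated because the attribute text may contain stray spaces.
GODOWNLOAD_RE = re.compile(r"godownload\s*\(\s*['\"]\s*(\d+)\s*['\"]")


def extract_video_id(onclick):
    """Return the numeric ID from an onclick attribute, or None if absent."""
    match = GODOWNLOAD_RE.search(onclick)
    return match.group(1) if match else None


def load_done(manifest):
    """Read the set of already-downloaded video IDs from a JSON file."""
    if manifest.exists():
        return set(json.loads(manifest.read_text(encoding="utf-8")))
    return set()


def mark_done(manifest, done, video_id):
    """Add one ID to the set and rewrite the record file immediately."""
    done.add(video_id)
    manifest.write_text(json.dumps(sorted(done)), encoding="utf-8")


def safe_name(text):
    """Replace characters that are not allowed in file names."""
    return re.sub(r'[\\/:*?"<>|]', "_", text).strip()


def target_path(root, course, section):
    """One directory per course; the .mp4 extension is an assumption."""
    folder = root / safe_name(course)
    folder.mkdir(parents=True, exist_ok=True)
    return folder / (safe_name(section) + ".mp4")
```

Rewriting the record file after every finished download means that an interrupted run loses at most the video that was in progress.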

The site path is as follows:

Login page -(log in)-> Student Course page -(enter course)-> Directory page -(Download Center)-> Download page -> Section video
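
The first hop of this path, the login, might look roughly like the sketch below. It uses only Python's standard library and keeps cookies in one opener so that the later pages are fetched as the logged-in user. The form field names "username" and "password", the helper names, and the UTF-8 decoding are assumptions; the real field names and login URL have to be read from the site's login form, and the pages may well use a different encoding.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_opener():
    """Opener that stores cookies, so the session survives across requests."""
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))


def build_login_request(login_url, username, password):
    """Build (but do not send) the POST request that submits the login form."""
    # Hypothetical field names; check the actual <form> on the login page.
    form = {"username": username, "password": password}
    data = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(login_url, data=data, method="POST")


def fetch(opener, request_or_url):
    """Send a request through the cookie-aware opener and return the page text."""
    with opener.open(request_or_url, timeout=30) as response:
        # Decoding as UTF-8 is an assumption about the site's encoding.
        return response.read().decode("utf-8", errors="replace")


# Intended use (the URLs are placeholders, not the site's real addresses):
#   opener = make_opener()
#   fetch(opener, build_login_request(LOGIN_URL, "user", "secret"))
#   courses_html = fetch(opener, COURSES_URL)  # cookies are sent automatically
```

Every later step (course page, directory page, Download Center) would reuse the same opener, since the session cookie is what proves the login.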

All right, let's do it tomorrow.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
