Medical Education web crawler: walking through the website (hands-on)

Source: Internet
Author: User

In a previous post I walked through the login process for www.med66.com, hands-on. Blog: http://my.oschina.net/hevakelcj/blog/357852

A successful login gets us through the front door of the website. The rest of the job is to go inside and take out what we want.

The following is the page shown after a successful login. We need to get the list of courses from this page.

Open the Firefox debugging tool and see how the above elements are laid out.

With the Firefox debugging tool it is easy to find the elements of the course list: all the courses are listed inside <div class="Ul_con_uc_show">.
Every <div class="Uc_row"> is one course.
Each course has a "Click here to learn from the beginning" link, for example href="http://elearning.med66.com/cware/video/videoList/videoList.shtm?cwareID=700914"
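The layout just described can be turned into a small extraction routine. The following is only a minimal sketch using Python's standard html.parser module; the class name CourseLinkParser and the function course_links are hypothetical names introduced here, and the class attribute value "Uc_row" is copied as printed above, so its casing may need adjusting against the live page.

```python
from html.parser import HTMLParser


class CourseLinkParser(HTMLParser):
    """Collect videoList.shtm links that appear inside course-row <div>s."""

    def __init__(self, row_class="Uc_row"):
        super().__init__()
        self.row_class = row_class  # assumption: class name as quoted in the post
        self.div_depth = 0          # > 0 while we are inside a course row
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.div_depth > 0:
                self.div_depth += 1  # nested <div> inside a course row
            elif self.row_class in (attrs.get("class") or "").split():
                self.div_depth = 1   # entering a course row
        elif tag == "a" and self.div_depth > 0:
            href = attrs.get("href") or ""
            if "videoList.shtm" in href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1


def course_links(html, row_class="Uc_row"):
    """Return the course links found in the given HTML text."""
    parser = CourseLinkParser(row_class)
    parser.feed(html)
    return parser.links
```

Calling course_links() on the HTML of the page shown after login should give one link per course, provided the markup really matches the structure seen in the debugging tool.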

Let's analyze this link address. It points to a fixed page, http://elearning.med66.com/cware/video/videoList/videoList.shtm,
followed by the parameter cwareID=700914. This "700914" is the ID number of the course.
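Pulling the course ID out of such a link does not need string slicing. Here is a minimal sketch with the standard urllib.parse module; the function name cware_id is a hypothetical helper introduced for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def cware_id(url):
    """Return the cwareID query parameter of a course link, or None if absent."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("cwareID")
    return values[0] if values else None


link = ("http://elearning.med66.com/cware/video/videoList/"
        "videoList.shtm?cwareID=700914")
print(cware_id(link))  # prints 700914
```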

Go to the download page for this course:

On this "Download Center" page you can download handouts, exercises, videos, and more. I was surprised to find that the address of the Download Center is related to the course ID:
http://elearning.med66.com/cware/download/downloadIndex.shtm?cwareID=700914
This URL is also a fixed page address followed by the parameter cwareID=700914.
I made a bold guess: are all the course download pages distinguished by cwareID?

I opened the "Course handout Word document download" link on the "Download Center" page and looked at its address:
http://elearning.med66.com/cware/download/wordDownload.shtm?wordType=1&cwareID=700914

I then opened "Practice Center Word document download" and looked at its address:
http://elearning.med66.com/cware/download/wordDownload.shtm?wordType=2&cwareID=700914

The two differ only in the wordType parameter. Extending the same reasoning to the other resources, I list the results in a table:

Download content    Download link
Handout             http://elearning.med66.com/cware/download/wordDownload.shtm?wordType=1&cwareID=700914
Practice            http://elearning.med66.com/cware/download/wordDownload.shtm?wordType=2&cwareID=700914
Mobile video        http://elearning.med66.com/cware/download/videoDownload.shtm?cwareDownType=down12&cwareID=700914
Mobile audio        http://elearning.med66.com/cware/download/videoDownload.shtm?cwareDownType=down13&cwareID=700914
Tablet video        http://elearning.med66.com/cware/download/videoDownload.shtm?cwareDownType=down14&cwareID=700914
Tablet audio        http://elearning.med66.com/cware/download/videoDownload.shtm?cwareDownType=down15&cwareID=700914

With this table, once we know the cwareID of a course, the addresses of its resource download pages can be derived. It's a big breakthrough!
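The table can be written down directly as data. The sketch below rebuilds the six addresses for any course ID; the dictionary keys and the function name download_urls are hypothetical names chosen here, while the page names and parameter values are taken from the links listed above.

```python
from urllib.parse import urlencode

BASE = "http://elearning.med66.com/cware/download/"

# resource name -> (page, parameter name, parameter value), as observed above
RESOURCES = {
    "handout":      ("wordDownload.shtm", "wordType", "1"),
    "practice":     ("wordDownload.shtm", "wordType", "2"),
    "mobile_video": ("videoDownload.shtm", "cwareDownType", "down12"),
    "mobile_audio": ("videoDownload.shtm", "cwareDownType", "down13"),
    "tablet_video": ("videoDownload.shtm", "cwareDownType", "down14"),
    "tablet_audio": ("videoDownload.shtm", "cwareDownType", "down15"),
}


def download_urls(cware_id):
    """Return a dict mapping each resource name to its download page URL."""
    urls = {}
    for name, (page, key, value) in RESOURCES.items():
        query = urlencode({key: value, "cwareID": cware_id})
        urls[name] = BASE + page + "?" + query
    return urls


for name, url in download_urls("700914").items():
    print(name, url)
```

Whether every course on the site follows this pattern is still only the guess made above; the sketch simply assumes it does.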

Although we can now reach the download page of a resource, that page is not a single download but a list of downloads, one per section. The following is the "mobile video" download page:

Use the Firefox debugging tool to inspect the layout of the elements:

We want to go through each row of the table and grab the name of the section and the address of its resource link. Each row has more than one link; there are 4 to choose from. Let's use the third one (which should be the least busy).
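As a rough sketch of that row-by-row capture, the code below collects the first cell's text and all links of every table row, then keeps the third link. The exact markup of the download page is not shown here, so the assumed structure (one <tr> per section, the section name in the first <td>, the download links as <a href> elements in the same row) is a guess, and the names SectionRowParser and section_downloads are hypothetical.

```python
from html.parser import HTMLParser


class SectionRowParser(HTMLParser):
    """Collect (section name, [links]) for every table row that has links."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.in_row = False
        self.in_cell = False
        self.cells = []   # text of each <td> in the current row
        self.links = []   # hrefs found in the current row

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.in_row, self.in_cell = True, False
            self.cells, self.links = [], []
        elif tag == "td" and self.in_row:
            self.in_cell = True
            self.cells.append("")
        elif tag == "a" and self.in_row:
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if self.in_cell and self.cells:
            self.cells[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False
        elif tag == "tr" and self.in_row:
            self.in_row = False
            if self.cells and self.links:
                self.rows.append((self.cells[0].strip(), self.links))


def section_downloads(html, choice=2):
    """Return (section name, link) pairs, taking link number choice (0-based)."""
    parser = SectionRowParser()
    parser.feed(html)
    return [(name, links[choice])
            for name, links in parser.rows if len(links) > choice]
```

The default choice=2 corresponds to the third of the 4 links. Rows with fewer links are skipped rather than raising an error.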

All right, that is as far as the analysis goes today; we will continue tomorrow. Stay tuned!
