Building a capture module for a 39 Health Network health column with the Knight Station Group system (Part 1)

Source: Internet
Author: User


On A5 my permission level is only A1, so I can upload just 3 pictures a day. I have been using the Knight Station Group management system for only a few days and am a complete novice. I am writing this first to complete a task, and second to record my own steps, like notes. A post like this needs a lot of pictures, so that later, when my memory is no longer clear, I can turn back to it and still know how to operate. With only 3 pictures I worried it would not be clear, so thank you to Knight's customer service, Cocoa Beauty, for helping me upload the pictures.

I downloaded the free version of Knight Station Group from www.xiake5.com. After using it for a few days, I can see that it is a large software package that combines crawling, article processing, and automatic publishing. My own experience building sites tells me that if you want to improve efficiency, you have to start with the content of your own site. The module market of Knight Station Group already offers a large number of publishing modules to choose from, so I am more concerned with capture modules, that is, with how to better capture content for my own site.

For this post I felt I needed to understand what Knight can do in more depth, and the results were satisfying. Knight's strength is not just simple visual extraction; its regex-based capture is also impressive, and the extraction is accurate. Following the tutorials, I tried starting from the content list page and extracting links with regex, and had a small success.

Below I walk through, with pictures, building a custom capture module for the men's health section of 39 Health Network. The tutorial comes in three parts (I, II, and III), which correspond to the three steps of the custom capture process:

Process 1: get list links; Process 2: get content links; Process 3: get content, as shown in the figure below.

Of these, Processes 1 and 2 use visual extraction mode, and Process 3 uses regex plus visual extraction mode.

Operation Steps:

1, Open the 39 Health Network content list page. It must be an article list of this form, with "1.2.3~~~" pagination links at the bottom. Then right-click the page and view the source file; the encoding is charset=gb2312.
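The encoding check in this step can also be done outside the browser. The following is a minimal sketch, assuming the page declares its encoding in a `charset=` attribute as described above; the function name `sniff_charset` and the sample HTML are illustrative and not part of Knight Station Group.

```python
import re

# Matches declarations such as charset=gb2312 inside a <meta> tag in raw page bytes.
CHARSET_RE = re.compile(rb'charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE)

def sniff_charset(raw_html: bytes, default: str = "utf-8") -> str:
    """Return the lowercase charset named in the page source, or `default`."""
    # The declaration normally sits in <head>, so only the start is scanned.
    match = CHARSET_RE.search(raw_html[:4096])
    return match.group(1).decode("ascii").lower() if match else default

# Hypothetical page source resembling what "view source" would show.
sample = (b'<html><head><meta http-equiv="Content-Type" '
          b'content="text/html; charset=gb2312"></head></html>')
print(sniff_charset(sample))
```

If no declaration is found, the sketch falls back to the `default` argument rather than guessing.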

  

Screenshot of the first page:

Screenshot of the second page:

Screenshot of the last page:

  

What the three screenshots have in common is the address HTTP://MAN.XXX.NET/NXBJ/BJCS, which is the entry point.

The difference between them is the index part: the first page has none, the second page is index_2.html, and the last is index_97.html. That is, the index runs from 2 to 97 in steps of 1. The encoding is charset=gb2312.

Remember these similarities and differences; they are used when setting up Process 1.
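To make the pattern concrete, here is a small sketch that maps a list-page address back to its page number. It assumes the entry address as written later in this post (the host is masked in the original), and `page_number` is a hypothetical helper for illustration only.

```python
import re
from typing import Optional

# Entry address as written in this post; the real host is masked there.
ENTRY = "http://manxx.net/nxbj/bjcs"
INDEX_RE = re.compile(re.escape(ENTRY) + r"/index_(\d+)\.html")

def page_number(url: str) -> Optional[int]:
    """Return 1 for the bare entry address, N for .../index_N.html, else None."""
    if url.rstrip("/") == ENTRY:
        return 1  # the first page carries no index suffix
    match = INDEX_RE.fullmatch(url)
    return int(match.group(1)) if match else None

print(page_number(ENTRY))
print(page_number(ENTRY + "/index_2.html"))
print(page_number(ENTRY + "/index_97.html"))
```

Any address that fits neither form returns `None`, which is a quick way to spot links that do not belong to the paginated list.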

2, Click New Module, then New Capture Module; the dialog shown in the figure above pops up.

3, Choose "Custom Mode".

4, Click "Process 1: Get list link"; the following dialog pops up:

  

Select: Default encoding (GBK, GB2312). For the entry address, fill in: HTTP://MANXX.NET/NXBJ/BJCS
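As background on why a single "GBK, GB2312" option can cover a page declared as gb2312: GBK is a superset of GB2312, so bytes encoded as GB2312 decode correctly with a GBK decoder. A minimal sketch in Python follows; `decode_page` is an illustrative name, and this is not a description of how Knight Station Group decodes pages internally.

```python
def decode_page(raw: bytes) -> str:
    """Decode page bytes as GBK; undecodable bytes become U+FFFD instead of raising."""
    return raw.decode("gbk", errors="replace")

# Sample text encoded the way a charset=gb2312 page would deliver it.
raw = "男性健康".encode("gb2312")
print(decode_page(raw))
```

Using `errors="replace"` keeps a single bad byte from aborting the whole page, at the cost of a replacement character in the output.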

5, Click "Paging extraction rules" in the figure above and fill in the dialog that pops up as shown in the picture:

  

Extract engine: select extraction based on the visual engine

Extract encoding: Default encoding (GBK, GB2312)

Extract mode: Automatically generate links

Insert at the start of the result: http://manxx.net/nxbj/bjcs/index_

Insert at the end of the result: .html

Range: from 2 to 97, increment 1

Leave the other settings alone. In practice this generates pages 2 to 96, so two pages are given up. In theory a range of 2 to 98 would also work, but I did not try it, hehe.
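The "automatically generate links" rule amounts to prefix + number + suffix over a range. The sketch below reproduces that idea in Python. Whether the tool treats the end value as inclusive is not documented in this post, so the `end_inclusive` flag is an assumption added to show both readings; `generate_links` is a hypothetical name.

```python
from typing import List

def generate_links(prefix: str, suffix: str, start: int, end: int,
                   step: int = 1, end_inclusive: bool = True) -> List[str]:
    """Build prefix + n + suffix for n running from start to end in steps of `step`."""
    stop = end + 1 if end_inclusive else end
    return [f"{prefix}{n}{suffix}" for n in range(start, stop, step)]

PREFIX = "http://manxx.net/nxbj/bjcs/index_"  # value from the dialog above
SUFFIX = ".html"

inclusive = generate_links(PREFIX, SUFFIX, 2, 97)
exclusive = generate_links(PREFIX, SUFFIX, 2, 97, end_inclusive=False)
print(len(inclusive), inclusive[0], inclusive[-1])
print(len(exclusive), exclusive[-1])
```

If the tool does stop one short of the end value, the exclusive reading matches the observation above that only pages 2 to 96 come out, and a range ending at 98 would be needed to reach index_97.html.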

Then click "Test global rules" on the right. The test address can be ignored; look at the extraction results below. Once they appear, save the data step by step. That completes Process 1: Get list link.

Summary: this procedure follows the tutorial on extracting the Sohu women's channel pages. This is what is called JS paging, and visual extraction with automatically generated links seems to work best, because the page links cannot be seen in the source file. My experience is this: if something cannot be seen in the source file, use visual extraction; if it can be seen in the source file, use regex. In Processes 2 and 3 you will see how powerful Knight's extraction is: awkward and hard to understand, but effective. I recommend that new users read more of the tutorials. The official ones online are very detailed, and my introductory post from a few days ago lists the tutorial addresses, so you can look back through it. My own logical ability is weak, so my advice is: unless you are a computer professional, do not go searching online for specialist regex tutorials. You will only end up confused and regret it.
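For the case where links are visible in the source file, regex extraction looks roughly like the sketch below. The HTML fragment, the pattern, and the name `extract_links` are illustrative assumptions; Knight Station Group has its own rule syntax, which this does not reproduce.

```python
import re
from typing import List
from urllib.parse import urljoin

# Captures the href of <a> tags that point at .html pages.
LINK_RE = re.compile(r'<a\s[^>]*?href=["\']([^"\']+\.html)["\']', re.IGNORECASE)

def extract_links(html: str, base_url: str) -> List[str]:
    """Return absolute .html links in page order, without duplicates."""
    links: List[str] = []
    for href in LINK_RE.findall(html):
        absolute = urljoin(base_url, href)  # resolve relative hrefs
        if absolute not in links:
            links.append(absolute)
    return links

# Hypothetical list-page fragment; the article file names are made up.
html = '''
<ul>
  <li><a href="/nxbj/bjcs/a1.html">Article 1</a></li>
  <li><a class="t" href='a2.html'>Article 2</a></li>
  <li><a href="#top">Back to top</a></li>
  <li><a href="/nxbj/bjcs/a1.html">Article 1 again</a></li>
</ul>
'''
for link in extract_links(html, "http://manxx.net/nxbj/bjcs/"):
    print(link)
```

A regex like this only sees what is literally in the source text, which is why links produced by JavaScript need the visual approach instead.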
