Incremental crawler, Vertical crawler

Source: Internet
Author: User

2. Incremental crawler: Unlike a batch crawler, an incremental crawler runs continuously and periodically refreshes the pages it has already fetched. Web pages change constantly: new pages appear, existing pages are deleted, and page content is modified. An incremental crawler has to reflect these changes promptly, so at any moment during its continuous crawl it is either fetching new pages or updating existing ones. General-purpose commercial search engine crawlers mostly fall into this category.
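
The revisit loop described above can be sketched roughly as follows. This is a minimal illustration, not how any particular search engine works: the `fetch` callable, the `IncrementalCrawler` class, the content-hash comparison, and the halve/double interval rule are all assumptions made for the example.

```python
import hashlib
import heapq


class IncrementalCrawler:
    """Toy incremental crawler: revisits known pages and adapts the revisit interval."""

    def __init__(self, fetch, min_interval=1.0, max_interval=64.0):
        # fetch is a hypothetical callable: url -> (text, links), or None if the page is gone
        self.fetch = fetch
        self.min_interval = min_interval
        self.max_interval = max_interval
        self.index = {}   # url -> (content_hash, revisit_interval)
        self.queue = []   # min-heap of (due_time, url)

    def add(self, url, now=0.0):
        """Schedule a URL for its first fetch if it is not already known."""
        if url not in self.index:
            self.index[url] = (None, self.min_interval)
            heapq.heappush(self.queue, (now, url))

    def step(self, now):
        """Process every page due at time `now`; return a list of (url, event)."""
        events = []
        while self.queue and self.queue[0][0] <= now:
            _, url = heapq.heappop(self.queue)
            old_hash, interval = self.index[url]
            result = self.fetch(url)
            if result is None:
                # Page was deleted: drop it from the index and stop revisiting it.
                del self.index[url]
                events.append((url, "deleted"))
                continue
            text, links = result
            new_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if old_hash is None:
                event = "new"
            elif new_hash != old_hash:
                event = "changed"      # volatile page: revisit sooner
                interval = max(self.min_interval, interval / 2)
            else:
                event = "unchanged"    # stable page: revisit less often
                interval = min(self.max_interval, interval * 2)
            self.index[url] = (new_hash, interval)
            heapq.heappush(self.queue, (now + interval, url))
            for link in links:
                self.add(link, now)    # newly discovered pages join the crawl
            events.append((url, event))
        return events
```

Calling `step(now)` repeatedly with an increasing clock value keeps the crawl going indefinitely: each call either picks up newly discovered pages or refreshes pages that are due, which is the "new page or update" behaviour the paragraph describes.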

3. Vertical crawler (focused crawler): A vertical crawler concentrates on pages about a specific topic or industry. A crawler for a health site, for example, only needs to find health-related pages and can ignore content from other industries. The defining feature of a vertical crawler, and its main difficulty, is deciding whether a page belongs to the target topic or industry. Downloading every page on the Internet and filtering afterwards would waste too many resources, so the crawler usually has to judge during the crawl itself whether a URL is likely to be relevant to the topic, and avoid fetching unrelated pages as far as possible. Vertical search sites and industry-specific sites often need this type of crawler.
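
One simple way to judge relevance before downloading is to score each link from its URL and anchor text and only queue links that score well. The sketch below assumes a plain keyword-overlap score and a best-first frontier; the function names, the `fetch_links` callable, the threshold, and the example URLs are hypothetical, and real focused crawlers typically use trained classifiers rather than keyword matching.

```python
import heapq
import re


def relevance(url, anchor_text, keywords):
    """Fraction of topic keywords found in the URL or anchor text (crude heuristic)."""
    keywords = {k.lower() for k in keywords}
    if not keywords:
        return 0.0
    tokens = set(re.findall(r"[a-z0-9]+", (url + " " + anchor_text).lower()))
    return len(tokens & keywords) / len(keywords)


def focused_crawl(seeds, fetch_links, keywords, threshold=0.2, max_pages=100):
    """Best-first crawl: fetch the most promising URLs first, never fetch off-topic ones.

    fetch_links is a hypothetical callable: url -> iterable of (link_url, anchor_text).
    """
    frontier = [(-1.0, url) for url in seeds]   # seeds are assumed to be on topic
    heapq.heapify(frontier)
    seen = set(seeds)
    visited = []
    while frontier and len(visited) < max_pages:
        _, url = heapq.heappop(frontier)        # highest score first (scores are negated)
        visited.append(url)
        for link, anchor in fetch_links(url):
            if link in seen:
                continue
            seen.add(link)
            score = relevance(link, anchor, keywords)
            if score >= threshold:              # decide before spending a download
                heapq.heappush(frontier, (-score, link))
    return visited


# Tiny in-memory link graph standing in for the Web.
graph = {
    "seed": [("http://x.test/health/diet", "healthy diet tips"),
             ("http://x.test/cars", "new cars")],
    "http://x.test/health/diet": [("http://x.test/nutrition", "nutrition and health")],
}
pages = focused_crawl(["seed"], lambda u: graph.get(u, []),
                      {"health", "diet", "nutrition"})
```

In this toy graph the `cars` link scores zero and is never fetched, which is the resource saving the paragraph refers to: the decision is made from the link alone, before the page is downloaded.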
