WebFetch: a dependency-free, minimalist web crawling component

Source: Internet
Author: User

WebFetch is a micro crawler that can run on mobile devices. It is a minimalist web crawling component with no external dependencies.

WebFetch aims to:

    • Require no third-party jar packages

    • Reduce memory usage

    • Improve CPU utilization

    • Speed up network crawling

    • Offer a simple, straightforward API

    • Run stably on Android devices

    • Stay small and flexible so it is easy to integrate

Usage

WebFetch is simple to use, so newcomers can get started quickly. It comes configured with a default page-processing method: by default, the content of each crawled page is written to the console with System.out.print. To change this default action, configure a PageHandler.
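The idea of a default action that can be swapped out might look roughly like the sketch below. It is not WebFetch's code: PageCallback, HandlerSketch, setCallback and onPageFetched are hypothetical names invented for illustration, and the real PageHandler type may have a different shape.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback type; the real WebFetch PageHandler may differ.
interface PageCallback {
    void handle(String url, String content);
}

// Illustrative stand-in for the part of a crawler that dispatches pages.
class HandlerSketch {
    // Default action described in the text: print the page to the console.
    static final PageCallback PRINT_TO_CONSOLE =
            (url, content) -> System.out.print(content);

    private PageCallback callback = PRINT_TO_CONSOLE;

    // Replace the default action with a custom one; returns this for chaining.
    HandlerSketch setCallback(PageCallback custom) {
        this.callback = custom;
        return this;
    }

    // Would be called by the crawler each time a page has been downloaded.
    void onPageFetched(String url, String content) {
        callback.handle(url, content);
    }

    public static void main(String[] args) {
        HandlerSketch sketch = new HandlerSketch();
        // Uses the default action and prints the content.
        sketch.onPageFetched("https://github.com", "<html>default</html>\n");

        // Uses a custom action that records URLs instead of printing pages.
        List<String> seen = new ArrayList<>();
        sketch.setCallback((url, content) -> seen.add(url));
        sketch.onPageFetched("https://github.com", "<html>custom</html>");
        System.out.println("collected: " + seen);
    }
}
```

Keeping the default as an ordinary callback means a custom handler replaces it through the same code path, so the crawler itself needs no special case for "no handler configured".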

Startup code:

WebFetch webFetch = new WebFetch();
webFetch.addBeginTask("https://github.com").start();

Stop code:

webFetch.close();

The start() method does not block program execution. You can add multiple web page addresses, but at least one start address is required. HTTP and HTTPS are currently supported.
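The behavior described above (a start() that returns immediately, several start addresses, HTTP and HTTPS only, and at least one address required) could be built with nothing but JDK classes, along the lines of this minimal sketch. StartSketch is an illustrative stand-in and not the WebFetch class. The method names addBeginTask, start and close mirror the calls shown earlier, while the thread pool size, the visit callback and the validation rules are assumptions.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Illustrative stand-in, not the WebFetch class itself.
class StartSketch {
    private final List<String> startUrls = new ArrayList<>();
    // Assumption: a small fixed pool; threads are created only when work arrives.
    private final ExecutorService workers = Executors.newFixedThreadPool(2);
    // Assumption: the per-URL work is passed in, so no network code is needed here.
    private final Consumer<String> visit;

    StartSketch(Consumer<String> visit) {
        this.visit = visit;
    }

    // Accepts only http and https start addresses; returns this for chaining.
    StartSketch addBeginTask(String url) {
        String scheme = URI.create(url).getScheme();
        if (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme)) {
            throw new IllegalArgumentException("unsupported scheme: " + url);
        }
        startUrls.add(url);
        return this;
    }

    // Hands each start address to a background thread and returns immediately.
    void start() {
        if (startUrls.isEmpty()) {
            throw new IllegalStateException("at least one start address is required");
        }
        for (String url : startUrls) {
            workers.execute(() -> visit.accept(url));
        }
    }

    // Stops accepting work and waits up to five seconds for queued tasks.
    void close() {
        workers.shutdown();
        try {
            workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        StartSketch crawler = new StartSketch(url -> System.out.println("visiting " + url));
        crawler.addBeginTask("https://github.com")
               .addBeginTask("http://example.com")
               .start();
        System.out.println("start() returned; the main thread is free");
        crawler.close();
    }
}
```

Because the work runs on pool threads, the order of the "visiting" lines relative to the main thread's message is not fixed.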

This is the first version and it still needs continued improvement. Suggestions are welcome, and thank you for your support.

Contact information: [Email protected]

Repository: hexleo/webfetch
Master branch last updated: 2015-05-25
