WebFetch is a micro crawler that can run on mobile devices: a minimalist web crawling component with no external dependencies.
WebFetch objectives:
No dependency on third-party jar packages
Reduce memory usage
Improve CPU utilization
Speed up network crawling
Simple, straightforward API
Run stably on Android devices
Small, flexible crawler component that is easy to integrate
Usage documentation
WebFetch is very simple to use, so beginners can get started quickly. It configures a default page handler for you: by default, the content of each crawled page is printed to the console with System.out.print. Configure a PageHandler to change this default behavior.
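The idea of a default handler that can be swapped for a custom one can be sketched as below. This is only an illustration: the interface `PageHandlerSketch`, its `handle(url, content)` signature, and the classes `ConsoleHandler`, `CollectingHandler`, and `HandlerDemo` are hypothetical names invented for this example and are not WebFetch's actual PageHandler API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback type; WebFetch's real PageHandler may look different.
interface PageHandlerSketch {
    void handle(String url, String content);
}

// Mirrors the described default behavior: print the page content to the console.
class ConsoleHandler implements PageHandlerSketch {
    @Override
    public void handle(String url, String content) {
        System.out.print(content);
    }
}

// A custom handler that records a summary of each page instead of printing it.
class CollectingHandler implements PageHandlerSketch {
    final List<String> pages = new ArrayList<>();

    @Override
    public void handle(String url, String content) {
        pages.add(url + " -> " + content.length() + " chars");
    }
}

class HandlerDemo {
    // Stand-in for the crawler: passes a fetched page to the configured handler.
    static void deliver(PageHandlerSketch handler, String url, String content) {
        handler.handle(url, content);
    }

    public static void main(String[] args) {
        // Default-style handling: content goes straight to the console.
        deliver(new ConsoleHandler(), "https://github.com", "<html>default</html>\n");

        // Custom handling: content is collected for later use.
        CollectingHandler custom = new CollectingHandler();
        deliver(custom, "https://github.com", "<html>custom</html>");
        System.out.println(custom.pages);
    }
}
```

Keeping the handler behind a single-method interface means the crawling loop does not need to know whether pages are printed, stored, or parsed.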
Startup code:
WebFetch webFetch = new WebFetch();
webFetch.addBeginTask("https://github.com").start();
Stop code:
webFetch.close();
Calling start() does not block program execution. You can add multiple web page addresses; HTTP and HTTPS are currently supported, and at least one start address is required.
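The behavior described above (a non-blocking start(), several start addresses, HTTP/HTTPS only, at least one address required) can be sketched with a background worker thread. This is a minimal illustration under stated assumptions, not WebFetch's implementation: the class `NonBlockingFetcherSketch` is hypothetical, the download step is replaced by a placeholder that only records the URL, and the exceptions thrown for invalid input are choices made for this example.

```java
import java.net.URI;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative stand-in only; not the real WebFetch class.
class NonBlockingFetcherSketch {
    private final List<String> startUrls = new CopyOnWriteArrayList<>();
    final List<String> visited = new CopyOnWriteArrayList<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Accepts only http and https addresses; returns this so calls can be chained.
    NonBlockingFetcherSketch addBeginTask(String url) {
        String scheme = URI.create(url).getScheme();
        if (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme)) {
            throw new IllegalArgumentException("Only HTTP and HTTPS are supported: " + url);
        }
        startUrls.add(url);
        return this;
    }

    // Hands the work to a background thread and returns immediately.
    void start() {
        if (startUrls.isEmpty()) {
            throw new IllegalStateException("At least one start address is required");
        }
        for (String url : startUrls) {
            worker.submit(() -> visited.add(url)); // placeholder for the real download
        }
    }

    // Stops accepting new work and waits briefly for queued tasks to finish.
    void close() {
        worker.shutdown();
        try {
            worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        NonBlockingFetcherSketch fetcher = new NonBlockingFetcherSketch();
        fetcher.addBeginTask("https://github.com")
               .addBeginTask("http://example.com")
               .start();
        System.out.println("start() returned; the caller keeps running");
        fetcher.close();
        System.out.println(fetcher.visited);
    }
}
```

Because the work runs on a separate thread, the caller should invoke close() when it is done so that the worker thread is released.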
This is the first version and it still needs continued improvement. Suggestions are very welcome; thank you for your support.
Contact information: [Email protected]
hexleo/WebFetch
Master branch code last updated: 2015-05-25