Netspider is an open-source website data collection tool built on the .NET platform.
Part of its functionality is based on the Soukey software. This version was developed with Visual Studio 2010 and .NET Framework 3.5.
Netspider currently offers the following main features:
1. Multi-task, multi-threaded data collection, with POST mode support (pending);
2. Can collect Ajax pages;
3. Supports cookies, and supports manual login before collecting data;
4. Supports collection transactions;
5. Supports automatic and manual data export; export formats: text, Excel, Access, MSSQL, MySQL, etc.;
6. Supports publishing data online;
7. Supports navigation URL collection, with no limit on navigation depth;
8. Supports automatic paging;
9. Supports file download, and can collect images, Flash, and other files;
10. Supports post-processing of collected results, including replacement, adding a prefix or suffix, truncation, and other operations, with regular expression support;
11. Collection URL definitions support not only basic parameters but also dictionary data as URL parameters;
12. Supports running multiple instances of a task;
13. Provides scheduled tasks, which can run Netspider collection tasks, external executables, and database stored procedures (still in development);
14. Scheduled tasks can run daily, weekly, or at a custom interval; the minimum unit is half an hour;
15. Supports task triggers that automatically start other tasks (including executables or stored procedures) when a collection task completes;
16. Comprehensive logging: system log, task execution log, error log, and so on;
17. The system provides a mini browser for capturing cookies or POST data.
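
Two of the features above, dictionary data as URL parameters (item 11) and post-processing of collected results (item 10), can be sketched in a few lines. The sketch below is illustrative only: it is written in Python rather than the project's own .NET code, and the names expand_urls and post_process, the {name} placeholder syntax, and the rule tuples are all assumptions made for this example, not Netspider's actual task format.

```python
import re
from itertools import product
from urllib.parse import quote


def expand_urls(template, params):
    """Expand a URL template into one URL per combination of parameter values.

    `params` maps a placeholder name (hypothetical `{name}` syntax) to the
    values to substitute, e.g. a numeric page range or a word dictionary.
    """
    names = list(params)
    urls = []
    for combo in product(*(params[name] for name in names)):
        url = template
        for name, value in zip(names, combo):
            # Percent-encode each value so dictionary words are URL-safe.
            url = url.replace("{" + name + "}", quote(str(value), safe=""))
        urls.append(url)
    return urls


def post_process(value, rules):
    """Apply (operation, argument) rules to a collected value, in order."""
    for op, arg in rules:
        if op == "replace":
            old, new = arg
            value = value.replace(old, new)
        elif op == "regex":
            pattern, repl = arg
            value = re.sub(pattern, repl, value)
        elif op == "prefix":
            value = arg + value
        elif op == "suffix":
            value = value + arg
        elif op == "truncate":
            value = value[:arg]
        else:
            raise ValueError(f"unknown operation: {op}")
    return value


if __name__ == "__main__":
    # A dictionary parameter combined with a basic numeric parameter.
    for url in expand_urls(
        "http://example.com/search?q={word}&page={page}",
        {"word": ["cat", "dog"], "page": range(1, 3)},
    ):
        print(url)

    # Clean up a raw collected value step by step.
    rules = [
        ("regex", (r"^\s+|\s+$", "")),   # strip surrounding whitespace
        ("replace", ("Price: ", "")),    # drop a fixed label
        ("truncate", 2),                 # keep the first two characters
        ("prefix", "$"),
        ("suffix", ".00"),
    ]
    print(post_process("  Price: 42 USD ", rules))
```

Keeping the rules as an ordered list mirrors the idea of chaining several operations on one collected field; each rule sees the output of the previous one, so the order matters.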
Netspider places no restrictions on commercial use, and the source code is completely open.
=================== Updates ===================
1. Netspider was open-sourced on October 1, 2014.
Source download: http://git.oschina.net/kingkoo1985/NETSpider/
1. A lot of input validation is still missing in this version because there was no time for it (it took two weeks to get this far), so please enter data exactly as required when adding it.
2. Some features are also not implemented yet. I will keep improving the tool when I have time.
My first open-source project: the Netspider web spider collection tool