Atitit. Data Logger dataspider
/atiplat_cms/src/com/attilax/webinfox.java @dep
http://cl.cmcher.com/thread0806.php?fid=16&search=&page=2
/atiplat_cms/src/com/attilax/dataspider/tsaolyonetdataspider.java
Crawler considerations
Set the User-Agent to Firefox (FF).
Note for HTTPS:
HttpClient is preferred here mainly because Java's own HttpURLConnection has poor SSL support and is awkward to control, while HttpClient can also crawl sites with untrusted certificates. Other implementations seem to require explicitly importing the certificate in code.
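As a minimal sketch of the User-Agent point, the snippet below prepares a request that identifies itself as Firefox. It uses the JDK's HttpURLConnection only to stay dependency-free; the class name UserAgentSketch, the helper open, the timeouts, and the exact User-Agent string are assumptions for illustration, not part of the original spider.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

class UserAgentSketch {
    // Assumed Firefox-style User-Agent string; substitute the one your own browser sends.
    static final String FF_UA =
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0";

    // Prepares (but does not yet send) a request that presents itself as Firefox.
    // Expects an http:// or https:// URL.
    static HttpURLConnection open(String pageUrl) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
            conn.setRequestProperty("User-Agent", FF_UA);
            conn.setConnectTimeout(10_000); // milliseconds, assumed value
            conn.setReadTimeout(10_000);    // milliseconds, assumed value
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The header is set before the connection is opened; calling conn.getInputStream() afterwards would send the request with it.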
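For comparison, here is a rough sketch of what accepting untrusted certificates looks like with plain JDK classes instead of HttpClient. It is not the approach the post uses; the class name TrustAllSketch and both helper methods are hypothetical. Trusting every certificate removes protection against man-in-the-middle attacks, so this is only reasonable for throwaway crawling.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

class TrustAllSketch {
    // Builds a socket factory whose trust manager accepts every server certificate.
    static SSLSocketFactory trustAllFactory() {
        TrustManager[] trustAll = { new X509TrustManager() {
            @Override public void checkClientTrusted(X509Certificate[] chain, String authType) { }
            @Override public void checkServerTrusted(X509Certificate[] chain, String authType) { }
            @Override public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
        } };
        try {
            SSLContext ctx = SSLContext.getInstance("TLS");
            ctx.init(null, trustAll, new SecureRandom());
            return ctx.getSocketFactory();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Relaxes checks on one connection only, leaving the JVM-wide defaults untouched.
    static void relax(HttpsURLConnection conn) {
        conn.setSSLSocketFactory(trustAllFactory());
        conn.setHostnameVerifier((host, session) -> true); // also skips host name matching
    }
}
```

Scoping the relaxed factory to a single connection, rather than calling HttpsURLConnection.setDefaultSSLSocketFactory, keeps the rest of the application on normal certificate validation.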
Author: nickname: Old Wow's Claws (full name: AttilaxAkbar Al Rapanui Attilaksachanui), Kanji name: Ayron, email: [email protected]
When reprinting, please credit the source: http://www.cnblogs.com/attilax/
Three functions must be overridden
public List<String> getpageurls()
public List getartlistbypagehtml(String html) {
public List<String> getpics_byhtml
Usage and parameters
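How the three overrides might fit together can be sketched as follows. Only the three method names come from the post; the base class SpiderSketch, the fetch hook, the findAll helper, the List<String> return types, and the regex-based extraction are assumptions made so the example runs against canned HTML rather than a real site.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical base class: only the three method names are taken from the post.
abstract class SpiderSketch {
    public abstract List<String> getpageurls();                     // URLs of the listing pages
    public abstract List<String> getartlistbypagehtml(String html); // article URLs on one listing page
    public abstract List<String> getpics_byhtml(String html);       // picture URLs in one article

    // Assumed hook so the sketch can be fed canned HTML instead of hitting the network.
    protected abstract String fetch(String url);

    // Walks listing pages, then articles, then pictures, and returns every picture URL found.
    public List<String> exec() {
        List<String> pics = new ArrayList<>();
        for (String pageUrl : getpageurls()) {
            for (String artUrl : getartlistbypagehtml(fetch(pageUrl))) {
                pics.addAll(getpics_byhtml(fetch(artUrl)));
            }
        }
        return pics;
    }

    // Collects the first capture group of every match of regex in text.
    static List<String> findAll(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }
}

class DemoSpider extends SpiderSketch {
    @Override public List<String> getpageurls() {
        return List.of("page1");
    }
    @Override public List<String> getartlistbypagehtml(String html) {
        return findAll("<a href=\"([^\"]+)\"", html);
    }
    @Override public List<String> getpics_byhtml(String html) {
        return findAll("<img src=\"([^\"]+)\"", html);
    }
    @Override protected String fetch(String url) {
        // Canned responses standing in for real HTTP fetches.
        if (url.equals("page1")) return "<a href=\"art1\">one</a><a href=\"art2\">two</a>";
        if (url.equals("art1")) return "<img src=\"a.jpg\"><img src=\"b.jpg\">";
        return "<img src=\"c.jpg\">";
    }
}
```

A real subclass would replace fetch with an HTTP call and would likely use an HTML parser rather than regular expressions, which break easily on attribute order and quoting.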
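Tsaolyonetdataspider x = new Tsaolyonetdataspider();
// x.filename = args[0]; // "c:\\r2.csv";
x.Picsavedir = "C:\\0picsavedir";
// StartPage defaults to 1 when -DStartPage is not given
x.StartPage = Integer.parseInt(System.getProperty("StartPage", "1"));
// EndPage has no default: without -DEndPage, parseInt(null) throws NumberFormatException
x.EndPage = Integer.parseInt(System.getProperty("EndPage"));
x.exec();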
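The page range above is read from JVM system properties. A slightly more defensive way to read it, shown here only as a sketch, uses Integer.getInteger, which falls back to a default when a property is missing or not a number. The class PageRangeSketch and the method pageRange are hypothetical names.

```java
class PageRangeSketch {
    // Reads -DStartPage and -DEndPage, using the given defaults when a property
    // is absent or cannot be parsed as an integer.
    static int[] pageRange(int defaultStart, int defaultEnd) {
        int start = Integer.getInteger("StartPage", defaultStart);
        int end = Integer.getInteger("EndPage", defaultEnd);
        if (end < start) {
            throw new IllegalArgumentException(
                "EndPage " + end + " is before StartPage " + start);
        }
        return new int[] { start, end };
    }
}
```

The properties are passed on the command line before the main class name, for example java -DStartPage=1 -DEndPage=5 followed by whichever class holds the main method.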