I have recently been helping my boss write a crawler. Tired of that, I went looking for something fun to do, and since I spend a lot of time on Douban, I wrote an automatic reply bot. Enough chatter, let's get to the topic:
It mainly uses two open source tools: Jsoup and HttpClient
Step 1: Simulate login
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.message.BasicNameValuePair;

// httpclient, login_url, form_email, form_password, redir and downloadPic()
// are members of the surrounding class (see the full code).
public static boolean login() throws IOException {
    // Download the CAPTCHA image to a local file
    String captcha_id = downloadPic(login_url, "D:\\yz.png");
    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
    System.out.println("Please enter the verification code:");
    String yan = br.readLine();

    HttpPost httppost = new HttpPost(login_url);
    List<NameValuePair> params = new ArrayList<NameValuePair>();
    params.add(new BasicNameValuePair("captcha-id", captcha_id)); // use Firebug to find it
    params.add(new BasicNameValuePair("captcha-solution", yan)); // verification code
    params.add(new BasicNameValuePair("form_email", form_email)); // user name
    params.add(new BasicNameValuePair("form_password", form_password)); // password
    params.add(new BasicNameValuePair("redir", redir)); // redirect address, usually your home page
    params.add(new BasicNameValuePair("source", "main"));
    params.add(new BasicNameValuePair("login", "Login"));
    httppost.setEntity(new UrlEncodedFormEntity(params));

    CloseableHttpResponse response = httpclient.execute(httppost); // execute the POST
    int status_code = response.getStatusLine().getStatusCode(); // status code returned by the server
    httppost.releaseConnection(); // release the connection whether or not the login worked
    if (status_code != 302) { // the code treats a 302 redirect as a successful login
        System.err.println("Login failed ~");
        return false;
    }
    System.err.println("Login successful ~");
    return true;
}
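To make it clearer what the login request actually sends, here is a minimal sketch of how a list of name/value pairs turns into a form-encoded request body. It uses only the JDK, not HttpClient, so it only approximates what UrlEncodedFormEntity produces; the class name FormBodySketch and the e-mail address are placeholders I made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormBodySketch {
    // Joins name/value pairs into an application/x-www-form-urlencoded body.
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&'); // pairs are separated by '&'
                }
                body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(field.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support
            throw new IllegalStateException(e);
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("form_email", "someone@example.com"); // placeholder address
        fields.put("source", "main");
        System.out.println(encode(fields)); // form_email=someone%40example.com&source=main
    }
}
```

Special characters such as "@" are percent-encoded, which is why the parameters should be handed to the library as pairs rather than concatenated by hand.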
Step 2: Use Firefox's Firebug plugin to see which parameters are posted to the server when a reply is submitted
There are generally 4 parameters: ck, rv_comment, start, submit_btn
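Instead of copying the ck value by hand each time, it might be possible to read it out of the page HTML. The sketch below assumes (this is my assumption, not something checked against the site) that ck appears as a hidden input with the name attribute before the value attribute; CkExtractorSketch and extractCk are hypothetical names. It uses a plain regular expression so it runs without extra libraries; a Jsoup selector would probably be the sturdier choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CkExtractorSketch {
    // Assumed markup: <input type="hidden" name="ck" value="...">
    private static final Pattern CK_INPUT =
            Pattern.compile("<input[^>]*name=\"ck\"[^>]*value=\"([^\"]*)\"");

    // Returns the ck value, or null when the page has no such input.
    static String extractCk(String html) {
        Matcher m = CK_INPUT.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<form><input type=\"hidden\" name=\"ck\" value=\"abcd\"/></form>";
        System.out.println(extractCk(html)); // abcd
    }
}
```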
The code that posts the reply is as follows:
// Needs java.util.regex.Pattern and java.util.regex.Matcher in addition to the HttpClient imports.
public static boolean startPost(String url) { // url is the address of the topic to reply to
    try {
        String html = getPageHtml(url);
        // Each matched text must be exactly what the page shows.
        Pattern p = Pattern.compile("Uh ... What you want is not here.");
        Matcher m = p.matcher(html);
        if (m.find()) {
            return false; // the topic no longer exists
        }
        Pattern p3 = Pattern.compile("This topic has been set by the group administrator to not allow response");
        Matcher m3 = p3.matcher(html);
        if (m3.find()) {
            return false; // replies are disabled for this topic
        }
        Pattern p2 = Pattern.compile("Please enter a word in");
        Matcher m2 = p2.matcher(html);
        if (m2.find()) {
            System.out.println("CAPTCHA required ~ pausing for 10 minutes");
            Thread.sleep(600000);
            return false;
        }

        HttpPost httppost = new HttpPost(url + "add_comment#last");
        httppost.addHeader("Connection", "keep-alive");
        List<NameValuePair> params2 = new ArrayList<NameValuePair>();
        // This parameter is important: be sure to look it up with Firebug,
        // otherwise the reply cannot be posted.
        params2.add(new BasicNameValuePair("ck", "Xnxg"));
        // getComment() is a function I wrote that returns a random comment.
        params2.add(new BasicNameValuePair("rv_comment", getComment()));
        params2.add(new BasicNameValuePair("start", "0"));
        params2.add(new BasicNameValuePair("submit_btn", "Plus Go"));
        httppost.setEntity(new UrlEncodedFormEntity(params2, "utf-8"));

        CloseableHttpResponse response = httpclient.execute(httppost);
        int status_code = response.getStatusLine().getStatusCode();
        if (status_code == 302) {
            System.out.println("comment succeeded ~" + url);
        } else {
            System.out.println("comment failed ~" + url);
        }
        httppost.releaseConnection();
        // Pause between replies; POST_INTERVAL_MS is a placeholder for a delay of your choosing.
        Thread.sleep(POST_INTERVAL_MS);
    } catch (Exception e) {
        return false;
    }
    return true;
}
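The getComment() helper is not shown in this post. A minimal sketch of what such a function could look like follows; the class name CommentPickerSketch and the canned comments are made up for illustration and are not taken from the real project.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class CommentPickerSketch {
    // Hypothetical canned replies; replace them with your own.
    static final List<String> COMMENTS = Arrays.asList(
            "up",
            "Thanks for sharing",
            "Marking this for later");

    private static final Random RANDOM = new Random();

    // Returns one of the canned comments, chosen uniformly at random.
    static String getComment() {
        return COMMENTS.get(RANDOM.nextInt(COMMENTS.size()));
    }

    public static void main(String[] args) {
        System.out.println(getComment());
    }
}
```

Varying the text a little from post to post is the whole point here, so a longer list works better than the three entries shown.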
The full code is on my GitHub: https://github.com/wqpod2g/douban
Thanks to this post: http://www.cnblogs.com/lzzgym/p/3322685.html
A Douban auto-reply bot implemented in Java