Today (17-03-31) I spent a busy afternoon studying WebMagic and found that I am still too inexperienced for such a difficult framework (class library).
That was hard to accept, so I decided to start from the basics instead, since there are more tutorials for the fundamentals. I looked up Apache's
HttpClient, followed the tutorials that earlier developers have posted, and wrote a simple example myself. It felt good.
The following implementation fetches a single page:
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import java.io.IOException;

public class Main {
    public static void main(String[] args) throws IOException {
        try {
            // Create a client instance
            HttpClient client = HttpClients.createDefault();
            // Create an HttpGet instance
            HttpGet httpGet = new HttpGet("http://www.btba.com.cn");
            // Execute the GET request
            HttpResponse response = client.execute(httpGet);
            // Get the response entity
            HttpEntity entity = response.getEntity();
            // Read the page content, specifying the encoding
            String web = EntityUtils.toString(entity, "UTF-8");
            // Print the page
            System.out.println(web);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Running the program prints the HTML source of the page to the console.
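The listing passes "UTF-8" when it turns the entity into a string. If a page is actually served in another encoding, decoding its bytes as UTF-8 gives garbled characters. The sketch below illustrates that idea with the JDK only; it does not use HttpClient, and the class name CharsetSketch and the helpers charsetFrom and decode are hypothetical names made up for this example. It assumes the encoding is announced in a Content-Type header value such as "text/html; charset=ISO-8859-1" and falls back to UTF-8 otherwise.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetSketch {

    // Hypothetical helper: pick the charset named in a Content-Type value,
    // e.g. "text/html; charset=ISO-8859-1". Falls back to UTF-8 when the
    // header is missing, has no charset parameter, or names an unknown charset.
    static Charset charsetFrom(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                // Case-insensitive check for the "charset=" parameter
                if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                    String name = p.substring(8).replace("\"", "").trim();
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // illegal or unsupported name: use the fallback
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Hypothetical helper: decode raw response bytes with the detected charset.
    static String decode(byte[] body, String contentType) {
        return new String(body, charsetFrom(contentType));
    }

    public static void main(String[] args) {
        // "caf\u00e9" encoded as ISO-8859-1 stands in for a non-UTF-8 page body.
        byte[] body = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        // Decoded with the announced charset, the text round-trips.
        System.out.println(decode(body, "text/html; charset=ISO-8859-1"));

        // Without a charset the UTF-8 fallback is used and the last character is garbled.
        System.out.println(decode(body, "text/html"));
    }
}
```

In a real crawler the byte array and the header value would come from the HTTP response; here they are built by hand so the example runs without network access.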
HttpClient can be downloaded here: http://hc.apache.org/downloads.cgi
WebMagic is built on HttpClient and Jsoup, so once I have learned these two I will try to tackle WebMagic again.
Next, I will use Jsoup to do some simple parsing of the page downloaded here ...
I am still a beginner, so if anything above is lacking or wrong, please point it out. Thank you very much.
A small crawler that fetches web page content with Apache HttpClient