The previous article introduced Java code that uses HttpURLConnection to access web pages.
This article describes how to access a web page with Jsoup.
First, download jsoup-1.11.2.jar from the official website: https://jsoup.org/download
Import the JAR into your project.
Create a new class named JsoupCrawler.
Write the following code:
    package org.apache.crawlerType;

    import java.io.IOException;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    public class JsoupCrawler {
        public static void main(String[] args) {
            try {
                Document doc = Jsoup.connect("http://www.cnblogs.com/szw-blog/p/8565944.html")
                        .timeout(1000) // set the timeout, in milliseconds
                        // set the browser User-Agent request header
                        .userAgent("Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0")
                        // set an additional request header
                        .header("Accept-Language", "zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2")
                        .get(); // send a GET request and parse the response
                System.out.println(doc.toString()); // print the page's HTML
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
Running the program prints the HTML of the requested page to the console.
That is the Java code for visiting a web page with Jsoup.
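Once a Document is available, a crawler usually pulls specific pieces out of it rather than printing the whole page. The following is a minimal sketch of that step, not part of the original code: the class name JsoupParseSketch, the helper extractLinks, and the inline HTML with its example.com base URL are all assumptions made for illustration. It parses a fixed string instead of making a network request, so it runs offline with the same Jsoup JAR on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration; not part of the original article.
public class JsoupParseSketch {

    // Collects the absolute URL of every <a> element that has an href attribute.
    static List<String> extractLinks(Document doc) {
        List<String> links = new ArrayList<>();
        for (Element a : doc.select("a[href]")) {
            // absUrl resolves relative links against the document's base URI
            links.add(a.absUrl("href"));
        }
        return links;
    }

    public static void main(String[] args) {
        // Assumed sample HTML; a real crawler would use the Document returned by get()
        String html = "<html><head><title>Demo page</title></head>"
                + "<body><a href=\"/a.html\">A</a>"
                + "<a href=\"http://example.com/b\">B</a></body></html>";

        // The second argument is the base URI used to resolve relative links
        Document doc = Jsoup.parse(html, "http://example.com/");

        System.out.println(doc.title());
        for (String link : extractLinks(doc)) {
            System.out.println(link);
        }
    }
}
```

The same extractLinks helper could be applied to the Document returned by Jsoup.connect(...).get(), in which case the base URI is the URL that was fetched.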
Web Crawler Starter Series (iii) (Jsoup)