Wikiscraper.java
package master.haku.scrape;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import java.net.*;
import java.io.*;

public class Wikiscraper {

    public static void main(String[] args) {
        scrapeTopic("/wiki/python");
    }

    // Fetch a Wikipedia page and print the text of its first content paragraph.
    public static void scrapeTopic(String url) {
        String html = getUrl("https://en.wikipedia.org" + url);
        Document doc = Jsoup.parse(html);
        String contentText = doc.select("#mw-content-text > p").first().text();
        System.out.println(contentText);
    }

    // Download the page at the given URL and return its HTML as one string,
    // or an empty string if the URL is malformed or the connection fails.
    public static String getUrl(String url) {
        URL urlObj = null;
        try {
            urlObj = new URL(url);
        } catch (MalformedURLException e) {
            System.out.println("The URL was malformed!");
            return "";
        }
        URLConnection urlCon = null;
        BufferedReader in = null;
        String outputText = "";
        try {
            urlCon = urlObj.openConnection();
            in = new BufferedReader(new InputStreamReader(urlCon.getInputStream()));
            String line = "";
            while ((line = in.readLine()) != null) {
                outputText += line;
            }
            in.close();
        } catch (IOException e) {
            System.out.println("There was an error connecting to the URL");
            return "";
        }
        return outputText;
    }
}
Output:
A python is a constricting snake belonging to the Python (genus), or, more generally, any snake in the family Pythonidae (containing the Python genus).
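
The read loop inside getUrl can also be written with try-with-resources and a StringBuilder. The following is only a minimal sketch of that idea, not the original author's code: the class name ReadAllSketch and the method readAll are hypothetical, and a StringReader stands in for the network stream so the example runs without a connection. Unlike the original loop, this version keeps a line break after each line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper for illustration; not part of the original Wikiscraper class.
class ReadAllSketch {

    // Reads every line from the reader and appends '\n' after each one.
    static String readAll(Reader source) throws IOException {
        StringBuilder out = new StringBuilder();
        // try-with-resources closes the reader even if readLine() throws.
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the network stream here.
        System.out.print(readAll(new StringReader("<p>first</p>\n<p>second</p>")));
    }
}
```

In getUrl, the same helper could presumably be called as readAll(new InputStreamReader(urlCon.getInputStream())) inside the existing try block, so that the IOException handling stays where it is.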
Java web crawler: a simple crawler example