/**
* jsoup is a Java HTML parser that can parse HTML directly from a URL or from a string of HTML text. It provides a very convenient API for extracting and manipulating data using DOM traversal, CSS selectors, and jQuery-like methods.
The main features of jsoup are:
1. Parse HTML from a URL, file, or string;
2. Find and extract data using DOM traversal or CSS selectors;
3. Manipulate HTML elements, attributes, and text.
jsoup is released under the MIT license and can be safely used in commercial projects.
**/
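The three features listed above can be illustrated with a minimal sketch. The sample markup, the base URI http://example.com/, and the class name JsoupBasics are assumptions made up for illustration; only the jsoup calls themselves (Jsoup.parse, select, attr, text, absUrl) come from the library.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupBasics {
    // Hypothetical sample markup, not taken from any real site.
    static final String HTML =
        "<html><body><div class=\"news\">"
        + "<a href=\"/a.html\">First</a>"
        + "<a href=\"/b.html\">Second</a>"
        + "</div></body></html>";

    /** 1. Parse HTML from a string; the base URI lets jsoup resolve relative links. */
    static Document parse() {
        return Jsoup.parse(HTML, "http://example.com/");
    }

    /** 2. Find elements with a CSS selector and read the text of the first match. */
    static String firstLinkText(Document doc) {
        Elements links = doc.select("div.news a[href]");
        return links.first().text();
    }

    /** 3. Manipulate an element: change its href attribute and its text. */
    static Element retarget(Document doc) {
        Element link = doc.select("div.news a").first();
        link.attr("href", "/c.html").text("Third");
        return link;
    }

    public static void main(String[] args) {
        Document doc = parse();
        System.out.println(firstLinkText(doc));                      // First
        System.out.println(doc.select("a").first().absUrl("href")); // http://example.com/a.html
        System.out.println(retarget(doc).outerHtml());
    }
}
```

To parse from a URL instead of a string, Jsoup.connect(url).get() returns the same kind of Document, so the selector and manipulation steps stay unchanged.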
Online Javadoc: http://www.ostools.net/apidocs/apidoc?api=jsoup-1.6.3
Jsoup cookbook: http://www.open-open.com/jsoup/
Next, let's look at an example that fetches the page http://www.menneske.no/arukone/5x5/eng/?number=499 and prints the position of each number in the table on that page:
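import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class ArukoneDemo {
    public static void main(String[] args) throws IOException {
        // Fetch and parse the puzzle page
        Document doc = Jsoup.connect("http://www.menneske.no/arukone/5x5/eng/?number=499").get();
        // Tables inside the first element with class "arukone"
        Elements contents = doc.getElementsByClass("arukone");
        Elements datas = contents.get(0).getElementsByTag("table");
        for (Element data : datas) {
            Elements trs = data.getElementsByTag("tr");
            for (int i = 0; i < trs.size(); i++) {
                Elements tds = trs.get(i).getElementsByTag("td");
                for (int j = 0; j < tds.size(); j++) {
                    // Print "number,row,column" for each non-empty cell
                    if (!"".equals(tds.get(j).text())) {
                        System.out.println(tds.get(j).text() + "," + i + "," + j);
                    }
                }
            }
        }
    }
}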
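The same traversal can also be written with CSS selectors instead of getElementsByClass and getElementsByTag. The following is only a sketch: it runs against a small inline HTML string so it needs no network access, and that markup (a div with class arukone wrapping one table) is an assumption about the page structure, not a copy of the real page. The class and method names are hypothetical as well.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class TableCells {
    /** Returns "text,row,column" for every non-empty cell of tables inside .arukone. */
    static List<String> nonEmptyCells(Document doc) {
        List<String> out = new ArrayList<>();
        // Descendant selector: every <table> nested under an element with class "arukone"
        for (Element table : doc.select(".arukone table")) {
            Elements rows = table.select("tr");
            for (int i = 0; i < rows.size(); i++) {
                Elements cells = rows.get(i).select("td");
                for (int j = 0; j < cells.size(); j++) {
                    String text = cells.get(j).text();
                    if (!text.isEmpty()) {
                        out.add(text + "," + i + "," + j);
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical 2x2 grid standing in for the real puzzle table
        String html = "<div class=\"arukone\"><table>"
            + "<tr><td>1</td><td></td></tr>"
            + "<tr><td></td><td>2</td></tr>"
            + "</table></div>";
        System.out.println(nonEmptyCells(Jsoup.parse(html))); // [1,0,0, 2,1,1]
    }
}
```

Unlike the example above, which looks only at the first element with class arukone, this sketch visits every matching table. Returning a list rather than printing directly also makes the result easier to check or reuse.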