Let's start with the introduction from Baidu Encyclopedia:
jsoup is an HTML parser for Java.
After you fetch a specific page with HttpClient, you can use jsoup to pull the data you need out of it, using its powerful CSS-like (jQuery-style) selectors.
Using jsoup is very simple: create a Java dynamic web project, add the relevant jar package, and paste in the sample code to start developing. This is the usual HelloWorld routine for any new library.
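As a rough HelloWorld sketch of that routine, the code below parses an HTML string and picks out links with a selector. The class name JsoupHello, the method extractLinks, the sample HTML and the selector "div.news a[href]" are all made up for illustration; the HTML is hard-coded here, where in practice it would be the page text you fetched with HttpClient.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; assumes the jsoup jar is on the classpath.
public class JsoupHello {

    // Collects the href attribute of every link inside <div class="news">.
    static List<String> extractLinks(String html) {
        // Parse the raw HTML string into a DOM-like Document.
        Document doc = Jsoup.parse(html);
        // CSS-style selector: <a> elements with an href, inside div.news.
        Elements links = doc.select("div.news a[href]");
        List<String> hrefs = new ArrayList<>();
        for (Element link : links) {
            hrefs.add(link.attr("href"));
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // Sample markup standing in for a page fetched elsewhere.
        String html = "<html><head><title>Hello</title></head><body>"
                + "<div class=\"news\"><a href=\"/a\">First</a><a href=\"/b\">Second</a></div>"
                + "<a href=\"/c\">Other</a>"
                + "</body></html>";

        System.out.println(Jsoup.parse(html).title());
        for (String href : extractLinks(html)) {
            System.out.println(href);
        }
    }
}
```

The link outside the div is skipped because the selector only matches descendants of div.news; changing the selector string is usually all it takes to target different data.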
Two learning sites:
http://www.open-open.com/jsoup/
https://www.ibm.com/developerworks/cn/java/j-lo-jsouphtml/
Jar Package Download
Official website: https://jsoup.org/
Jsoup documentation: https://jsoup.org/cookbook/introduction/parsing-a-document
However, you may find that these sites are not accessible. In that case you can use a proxy tool (FQ), or download the jar from a domestic download site instead.
Searching Baidu for the jsoup jar package will turn up domestic sites where it can be downloaded.
We can then add the jar package to the project.
Jsoup Study and use