Using HTML Parser in Android Applications to Easily Parse HTML Content

Source: Internet
Author: User


With the development of the mobile Internet, more and more content needs to be extended from the traditional Internet to mobile devices. There are three common approaches:

  1. Web app: use HTML5 technologies, such as jQuery Mobile and Dojox Mobile, to optimize web pages on the server side.
  2. Hybrid app: build the app with HTML5 technologies and a framework such as PhoneGap. Through PhoneGap, the app can call the APIs of the phone's operating system directly, for example to access sensors and the ringer.
  3. Native app: download the content to be displayed to the local device, parse it, and then lay it out and display it again.

The advantages and disadvantages of these three kinds of mobile application have been compared in many articles. The biggest advantage of web apps and hybrid apps is that they are cross-platform, which makes them the best choice for keeping development costs under control. Their disadvantage is equally obvious: the user experience on mobile devices is inferior to that of a native app. The author believes that enterprise users do not place high demands on user experience and mainly use mobile devices to complete business processes, so web apps or hybrid apps can reduce development costs for them. Individual users, however, are pickier about user experience, so a native app is more likely to stay competitive among the large number of applications available.

This article focuses on native apps. To display web content better, it uses HTML Parser to extract and parse the content of a web page and lay it out again in an Android app. An example at the end illustrates the use of HTML Parser in Android.


Use of HTML Parser in Android

HTML Parser is an open source project for parsing HTML documents. It provides powerful APIs for transforming HTML web pages (transformation) and for extracting information of interest from HTML documents (extraction). It is compact, fast, easy to use, and has been rigorously tested.

In Java EE applications, you can download htmlparser.jar from the HTML Parser homepage, add it to the build path, and then use the APIs it defines. According to the author, however, this approach is not suitable for Android projects: Android runs applications on the Dalvik virtual machine, whereas htmlparser.jar was compiled for the conventional Oracle Java virtual machine, so the jar cannot be used directly after being imported into an Android project.

The following steps show how to use HTML Parser in an Android project: create a library project for HTML Parser and reference it from the Android application that needs it:

  1. Download the source code of HTML Parser;
  2. Create an Android project named myhtmlparser. Import the source code of HTML Parser into this project and remove the sample code and unit test code from it;
  3. Set the myhtmlparser project as a library project: right-click the project, choose Properties > Android, and select the Is Library option;
  4. Compile the myhtmlparser project;
  5. Reference the myhtmlparser project from the project that requires HTML Parser. That project can then use the APIs provided by HTML Parser.

Figure 1. Create a myhtmlparser Project

Figure 2. Import the source code of HTML Parser

Figure 3. Set the myhtmlparser project to a library project


Application Example of HTML Parser in Android

This section shows how to use HTML Parser in an Android project by parsing the latest recommended articles on the developerWorks homepage. The example creates an Android app, imports the HTML Parser library project, uses HTML Parser to fetch the developerWorks homepage and extract the titles of the latest recommended articles, and displays the list of titles in a ListView.

Figure 4. List of the latest recommended articles on the developerworks Homepage

The implementation steps are as follows:

1. Create a sample project named dwparser and add myhtmlparser to it as a library. Right-click the project, choose Properties > Android, and add myhtmlparser as a referenced project.

Figure 5. Import myhtmlparser to the dwparser Project

2. Open the manifest file of the project and declare the network access permission: add a uses-permission node and set its android:name attribute to android.permission.INTERNET.

Listing 1. Add the permission to declare network access to the manifest File

    <manifest xmlns:android="" package="com.example.androidtest"
        android:versionCode="1"
        android:versionName="1.0">
        <uses-sdk android:minSdkVersion="8" android:targetSdkVersion="15" />
        <application android:label="@string/app_name"
            android:icon="@drawable/ic_launcher"
            android:theme="@style/AppTheme">
        </application>
        <uses-permission android:name="android.permission.INTERNET"/>
    </manifest>

3. Add a ListView control to the layout file to hold the list of the latest recommended articles parsed from the developerWorks homepage. Then modify the MyMainActivity class so that it extends ListActivity.

Listing 2. Add the listview control to the layout File

    <RelativeLayout xmlns:android=""
        xmlns:tools=""
        android:layout_width="match_parent"
        android:layout_height="match_parent" >
        <ListView android:id="@android:id/list"
            android:layout_width="fill_parent"
            android:layout_height="wrap_content"
            android:layout_weight="1"
            android:layout_marginLeft="20dp"
            android:layout_marginRight="20dp"
            android:background="@android:color/transparent"
            />
    </RelativeLayout>

Listing 3. Modify the MyMainActivity class to populate the ListView

    import java.util.ArrayList;
    import java.util.List;

    import org.htmlparser.util.ParserException;

    import android.app.ListActivity;
    import android.os.Bundle;
    import android.widget.ArrayAdapter;

    public class MyMainActivity extends ListActivity {
        private ArrayAdapter<String> adapter;

        @Override
        public void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_main);
            List<String> postTitleList = new ArrayList<String>();
            try {
                // Obtain the titles of the recommended articles through HTML Parser
                // (parserDwPost() is the method shown in Listing 4).
                // Note: this call performs network I/O. Apps that target API level 11
                // or higher get a NetworkOnMainThreadException if it runs on the main
                // thread, so real code should call it from a worker thread.
                postTitleList = parserDwPost();
            } catch (ParserException e) {
                e.printStackTrace();
            }
            // Initialize the adapter that feeds the ListView
            adapter = new ArrayAdapter<String>(this, R.layout.dw_post_item);
            if (postTitleList != null && postTitleList.size() > 0) {
                for (String title : postTitleList) {
                    // Show each title from postTitleList in the ListView
                    adapter.add(title);
                }
            }
            setListAdapter(adapter);
        }
    }
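Listing 3 calls the parsing method directly from onCreate, which runs on the main thread. One way to keep the network call off that thread is to run it on a worker thread and hand the result to a callback. The sketch below uses only java.util.concurrent so that it runs on a desktop JVM. The class BackgroundFetchSketch, the TitlesCallback interface, and the stub task are hypothetical names introduced for illustration and are not part of the article's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs a blocking fetch on a worker thread. */
final class BackgroundFetchSketch {

    /** Hypothetical callback that receives the parsed titles. */
    interface TitlesCallback {
        void onTitles(List<String> titles);
    }

    /**
     * Runs the task on the given executor. The callback is invoked on the
     * worker thread with the task's result, or with an empty list if the
     * task throws.
     */
    static void fetchAsync(ExecutorService worker,
                           Callable<List<String>> task,
                           TitlesCallback onDone) {
        worker.execute(() -> {
            List<String> titles;
            try {
                titles = task.call();
            } catch (Exception e) {
                titles = new ArrayList<>();
            }
            // In an Activity, this is where the result would be forwarded
            // to the UI thread (for example with runOnUiThread) before the
            // adapter is touched.
            onDone.onTitles(titles);
        });
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        // Stand-in for the blocking parse call; no network access happens here.
        Callable<List<String>> stubTask =
                () -> Arrays.asList("First title", "Second title");
        fetchAsync(worker, stubTask,
                titles -> System.out.println("Received " + titles));
        worker.shutdown();
        worker.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

In the Android app, the task passed to fetchAsync would wrap the call to the parsing method, and the callback would add the titles to the adapter on the UI thread.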

4. View the HTML source of the developerWorks homepage. It shows the following structure:

  • The list of the latest recommended articles is contained in the <div id="tab1"> tag.
  • Each article is contained in an <li> tag.
  • Each article title is a link (<a>) tag.

Therefore, you can locate the <div id="tab1"> tag with HTML Parser and then use the HTML Parser API to read the link text inside each <li> tag. For details about the HTML Parser API and how to use it, see the resources listed at the end of this article.

Listing 4. Parsing the web page with HTML Parser

    // Imports needed at the top of the activity source file:
    // import org.htmlparser.Parser;
    // import org.htmlparser.filters.AndFilter;
    // import org.htmlparser.filters.HasAttributeFilter;
    // import org.htmlparser.filters.NodeClassFilter;
    // import org.htmlparser.filters.TagNameFilter;
    // import org.htmlparser.tags.LinkTag;
    // import org.htmlparser.util.NodeList;
    // import org.htmlparser.util.ParserException;

    private List<String> parserDwPost() throws ParserException {
        final String DW_HOME_PAGE_URL = "";
        ArrayList<String> pTitleList = new ArrayList<String>();
        // Create an HTML Parser object and specify the URL and encoding
        Parser htmlParser = new Parser(DW_HOME_PAGE_URL);
        htmlParser.setEncoding("UTF-8");
        String postTitle = "";
        // Get the <div> node whose id attribute has the value "tab1"
        NodeList divOfTab1 = htmlParser.extractAllNodesThatMatch(
                new AndFilter(new TagNameFilter("div"),
                        new HasAttributeFilter("id", "tab1")));
        if (divOfTab1 != null && divOfTab1.size() > 0) {
            // Get the <li> nodes among the descendants of that <div>
            NodeList itemLiList = divOfTab1.elementAt(0).getChildren()
                    .extractAllNodesThatMatch(new TagNameFilter("li"), true);
            if (itemLiList != null && itemLiList.size() > 0) {
                for (int i = 0; i < itemLiList.size(); ++i) {
                    // Get the link nodes among the descendants of each <li>
                    NodeList linkItem = itemLiList.elementAt(i).getChildren()
                            .extractAllNodesThatMatch(
                                    new NodeClassFilter(LinkTag.class), true);
                    if (linkItem != null && linkItem.size() > 0) {
                        // The link text is the title of the recommended article
                        postTitle = ((LinkTag) linkItem.elementAt(0)).getLinkText();
                        System.out.println(postTitle);
                        pTitleList.add(postTitle);
                    }
                }
            }
        }
        return pTitleList;
    }
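The lookup in Listing 4 combines two ideas: filters that are joined with a logical AND, and a recursive search through the descendants of a node. The following sketch models both on a tiny hand-built tree so that the logic can be tried without the library or a network connection. It is a simplified stand-in, not the HTML Parser API: the FilterModelSketch class, its Node type, and the sample document are all hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical, simplified model of a parsed HTML tree and its filters. */
final class FilterModelSketch {

    static final class Node {
        final String tag;
        final Map<String, String> attrs = new LinkedHashMap<>();
        final List<Node> children = new ArrayList<>();
        String text = "";

        Node(String tag) { this.tag = tag; }
        Node attr(String key, String value) { attrs.put(key, value); return this; }
        Node withText(String value) { text = value; return this; }
        Node add(Node child) { children.add(child); return this; }
    }

    /** Matches nodes by tag name, ignoring case. */
    static Predicate<Node> tagName(String name) {
        return n -> n.tag.equalsIgnoreCase(name);
    }

    /** Matches nodes that carry the given attribute value. */
    static Predicate<Node> hasAttribute(String key, String value) {
        return n -> value.equals(n.attrs.get(key));
    }

    /** Collects all descendants of root (root excluded) that match the filter. */
    static List<Node> descendants(Node root, Predicate<Node> filter) {
        List<Node> out = new ArrayList<>();
        for (Node child : root.children) {
            if (filter.test(child)) {
                out.add(child);
            }
            out.addAll(descendants(child, filter));
        }
        return out;
    }

    /** Three lookups: the div with id "tab1", its li items, the first link in each. */
    static List<String> linkTexts(Node document) {
        List<String> titles = new ArrayList<>();
        List<Node> divs = descendants(document,
                tagName("div").and(hasAttribute("id", "tab1")));
        if (divs.isEmpty()) {
            return titles;
        }
        for (Node li : descendants(divs.get(0), tagName("li"))) {
            List<Node> links = descendants(li, tagName("a"));
            if (!links.isEmpty()) {
                titles.add(links.get(0).text);
            }
        }
        return titles;
    }

    /** Hand-built sample tree with two tabs; only "tab1" should be read. */
    static Node sampleDocument() {
        Node tab1 = new Node("div").attr("id", "tab1").add(new Node("ul")
                .add(new Node("li").add(new Node("a").withText("First article")))
                .add(new Node("li").add(new Node("a").withText("Second article"))));
        Node tab2 = new Node("div").attr("id", "tab2").add(new Node("ul")
                .add(new Node("li").add(new Node("a").withText("Other article"))));
        return new Node("html").add(new Node("body").add(tab1).add(tab2));
    }

    public static void main(String[] args) {
        // Prints [First article, Second article]
        System.out.println(linkTexts(sampleDocument()));
    }
}
```

The model keeps the same guard as the listing: if no matching div is found, the result is an empty list rather than an error.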

5. Run the Android project. The ListView displays the latest recommended articles from the developerWorks homepage, with each list item showing the title of one recommended article.

Figure 6. Display the latest recommended articles on the developerWorks homepage in the Android app



This article described how to use HTML Parser in Android applications to parse web page content: download the HTML Parser source code, import it into an Android library project, and reference that library project from the project that needs HTML Parser. A small example demonstrated its usage and basic functions.



  • For more information, see the developerWorks article.

  • See the HTML Parser project homepage to learn about the HTML Parser open source project.

  • Download the HTML Parser source code.

  • For Android topics, refer to the series "From getting started to proficient in Android development".

  • "Developing a Google Map-based Android app" is an article on mobile maps.

  • Stay tuned to developerWorks technical events and webcasts.

  • Visit the developerWorks open source zone for a wealth of how-to information, tools, and project updates, as well as the most popular articles and tutorials, to help you develop with open source technologies and use them with IBM products.