Using Jsoup on Android to parse HTML and download images

Source: Internet
Author: User

I have recently been tinkering with a CSDN client. This post shows how to use Jsoup to parse an HTML page, extract the required content by tag, and download the specified image resources.

1. Import the Jsoup JAR package

JAR package: jsoup 1.6.1

Note: When adding the package to a project, copy all of the extracted JAR files into the libs directory; otherwise an error is reported at run time.




2. Download and parse the HTML page

Code:

package com.example.testcsdn;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import android.util.Log;

/**
 * Downloads the HTML at the given link, parses it, and returns the
 * result wrapped in an ArrayList<Blog>.
 */
public class BlogsFetchr {
    private static final String TAG = "BlogsFetchr";

    /**
     * Downloads the resource specified by the URL.
     *
     * @return the response body as a byte[], or null if the connection failed
     */
    public byte[] getUrlBytes(String urlSpec) throws IOException {
        URL url = new URL(urlSpec);
        // The cast is needed because getResponseCode() and disconnect()
        // are declared on HttpURLConnection.
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            // Check the status first: getInputStream() throws on error responses.
            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                Log.i(TAG, "connection failed");
                return null;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            InputStream in = conn.getInputStream();
            byte[] buffer = new byte[1024];
            int len = 0;
            while ((len = in.read(buffer)) > 0) {
                out.write(buffer, 0, len);
            }
            out.close();
            return out.toByteArray();
        } finally {
            conn.disconnect();
        }
    }

    /**
     * Downloads the resource specified by the URL and converts the byte[]
     * returned by getUrlBytes to a String.
     *
     * @return the page content as a String, or null if the download failed
     */
    private String getUrl(String urlSpec) {
        String result = null;
        try {
            byte[] bytes = getUrlBytes(urlSpec);
            if (bytes != null) {
                result = new String(bytes);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return result;
    }

    public ArrayList<Blog> downloadBlogItems(String urlSpec) {
        ArrayList<Blog> blogs = new ArrayList<>();
        String htmlString = getUrl(urlSpec);
        if (htmlString != null) {
            parserItems(blogs, htmlString); // parse htmlString
        }
        return blogs;
    }

    private void parserItems(ArrayList<Blog> blogs, String htmlString) {
        Document doc = Jsoup.parse(htmlString);
        Elements units = doc.getElementsByClass("blog_list");
        for (int i = 0; i < units.size(); i++) {
            Blog blog = new Blog();
            Element unit_ele = units.get(i);
            Element dl_ele = unit_ele.getElementsByTag("dl").get(0);
            Element dl_dt_ele = dl_ele.getElementsByTag("dt").get(0);
            Element dt_a_ele = dl_dt_ele.child(0);
            String iconUrl = dt_a_ele.child(0).attr("src"); // blogger's profile picture URL
            Log.i(TAG, "Post " + i + ": " + iconUrl);
            Elements fls = unit_ele.getElementsByClass("fl");
            Element fl_ele = fls.get(0);
            Element fl_a1_ele = fl_ele.child(0);
            String bloggerId = fl_a1_ele.text(); // blogger ID
            Log.i(TAG, "author of article " + i + ": " + bloggerId);
            blog.setBloggerIconUrl(iconUrl);
            blog.setBloggerId(bloggerId);
            blogs.add(blog);
        }
    }
}

As shown in the code, using Jsoup to parse html is very simple.
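
The listing relies on a Blog class that the post does not show. A minimal sketch of what it might look like, assuming it is a plain data holder with just the two fields that the parser and the adapter touch; the accessor names come from the calls in the listings, everything else is a guess:

```java
// Hypothetical reconstruction: the original Blog class is not shown in the post.
// Only the accessors called by BlogsFetchr and MainActivity are included.
class Blog {
    private String bloggerId;       // text of the blogger link in a list item
    private String bloggerIconUrl;  // "src" attribute of the blogger's profile picture

    public String getBloggerId() {
        return bloggerId;
    }

    public void setBloggerId(String bloggerId) {
        this.bloggerId = bloggerId;
    }

    public String getBloggerIconUrl() {
        return bloggerIconUrl;
    }

    public void setBloggerIconUrl(String bloggerIconUrl) {
        this.bloggerIconUrl = bloggerIconUrl;
    }
}
```

The real class may well carry more fields (title, summary, link); these two are simply the ones the code above reads and writes.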

To find the tags you need, open the page in a browser, right-click the element you are interested in, and inspect it with the browser's developer tools. This quickly shows which tag corresponds to the element on the page; you can then use the Jsoup API to read the tag's value.




3. Download the specified image

Suppose you want to show each blogger's profile picture next to the items in the blog list. You can parse the HTML to get the image URL, and then download the image directly with HttpURLConnection.
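
The download itself boils down to draining an InputStream into a byte array. Here is a minimal, Android-free sketch of that loop, in which a ByteArrayInputStream stands in for conn.getInputStream(); the class name StreamCopySketch and the helper readAll are illustrative and not part of the original project:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class StreamCopySketch {
    // Reads the stream to its end and returns everything that was read.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int len;
        // read() returns -1 at end of stream. The parentheses around the
        // assignment matter: without them the comparison would bind first.
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] fakeImage = new byte[3000]; // stand-in for bytes from the network
        byte[] copy = readAll(new ByteArrayInputStream(fakeImage));
        System.out.println("read " + copy.length + " bytes");
    }
}
```

On Android, an array like this is what BitmapFactory.decodeByteArray turns into a Bitmap.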

The following ThumbnailDownloader<Token> class extends HandlerThread. It waits for image download requests, processes them, and triggers the UI update:

package com.example.testcsdn;

import java.io.IOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.os.Handler;
import android.os.HandlerThread;
import android.os.Message;
import android.support.v4.util.LruCache;
import android.util.Log;
import android.widget.ImageView;

// Token is a generic type parameter. Declaring the class as "ClassName<Token>"
// lets the class use Token as if it were an already defined class.
public class ThumbnailDownloader<Token> extends HandlerThread {
    private static final String TAG = "ThumbnailDownloader";
    private static final int MESSAGE_DOWNLOAD = 0;

    private Handler mHandler;         // sends and processes image download messages on this thread
    private Handler mResponseHandler; // Handler from the main thread, used to update the UI
    private Listener<Token> mListener;
    // Thread-safe map from ImageView (the token) to URL.
    private Map<Token, String> requestMap =
            Collections.synchronizedMap(new HashMap<Token, String>());
    // Image cache. When the stored bitmaps exceed the size given to LruCache,
    // the least recently used entries are released automatically.
    private LruCache<String, Bitmap> mMemoryCache;

    public ThumbnailDownloader(Handler handler) {
        // Creates a HandlerThread named TAG: a separate thread with its own Looper.
        super(TAG);
        mResponseHandler = handler;
        int maxMemory = (int) Runtime.getRuntime().maxMemory(); // maximum memory available to the app
        int cacheSize = maxMemory / 8; // memory budget for the cache
        mMemoryCache = new LruCache<String, Bitmap>(cacheSize) {
            // Must be overridden so that the cache measures each Bitmap in bytes.
            @Override
            protected int sizeOf(String key, Bitmap value) {
                return value.getRowBytes() * value.getHeight();
            }
        };
    }

    public interface Listener<Token> {
        // Callback, implemented on the main thread.
        void onThumbnailDownloaded(Token token, Bitmap thumbnail);
    }

    public void setListener(Listener<Token> listener) {
        mListener = listener;
    }

    @Override
    public void onLooperPrepared() {
        // Runs on this thread just before the Looper starts its loop.
        // A Handler created here is bound to this thread, so handleMessage
        // runs on this thread only.
        mHandler = new Handler() {
            @Override
            public void handleMessage(Message message) {
                // Process a download message: fetch the image, then update the UI.
                if (message.what == MESSAGE_DOWNLOAD) {
                    Token token = (Token) message.obj;
                    try {
                        handleRequest(token);
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            }
        };
    }

    private void handleRequest(final Token token) throws IOException {
        final String url = requestMap.get(token);
        if (url == null) return;
        byte[] bitmapBytes = new BlogsFetchr().getUrlBytes(url); // download the image
        if (bitmapBytes == null) return; // download failed
        final Bitmap bitmap = BitmapFactory.decodeByteArray(bitmapBytes, 0, bitmapBytes.length);
        if (bitmap == null) return; // the bytes were not a decodable image
        String key = (String) ((ImageView) token).getTag();
        Log.i(TAG, "The imageView tag is: " + key);
        mMemoryCache.put(key, bitmap); // save to the cache
        mResponseHandler.post(new Runnable() {
            @Override
            public void run() {
                // Skip the update if the view has since been reused for another URL.
                if (!url.equals(requestMap.get(token))) return;
                requestMap.remove(token);
                mListener.onThumbnailDownloaded(token, bitmap); // update the UI
            }
        });
    }

    public void clearQueue() {
        mHandler.removeMessages(MESSAGE_DOWNLOAD);
        requestMap.clear();
    }

    public void queueThumbnail(Token token, String url) {
        // Adds an image download request to this thread's message queue.
        // Called from the adapter in MainActivity.
        requestMap.put(token, url);
        // obtainMessage binds the Message to mHandler automatically.
        // Parameter 1: what, an int that identifies the message.
        // Parameter 2: obj, the object sent along with the message.
        // The target (the Handler that processes the message) defaults to mHandler.
        Message message = mHandler.obtainMessage(MESSAGE_DOWNLOAD, token);
        message.sendToTarget(); // send the message to its target Handler
    }

    public Bitmap getCacheImage(String key) {
        // Look up the image in the cache.
        Bitmap bitmap = mMemoryCache.get(key);
        return bitmap;
    }
}
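
Stripped of the Android specifics, ThumbnailDownloader is a single background thread that works through queued requests in order and hands each result to a callback. The sketch below is only an analogy in plain Java: it assumes a single-thread ExecutorService in place of HandlerThread and a BiConsumer in place of the Listener and main-thread Handler, and every name in it is hypothetical:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

class DownloadQueueSketch<T> {
    // One worker thread handles requests in submission order,
    // much like the message queue of a HandlerThread.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final BiConsumer<T, String> listener;

    DownloadQueueSketch(BiConsumer<T, String> listener) {
        this.listener = listener;
    }

    // Counterpart of queueThumbnail: enqueue a request and return immediately.
    void queue(final T token, final String url) {
        worker.execute(() -> {
            String result = "bytes-of:" + url; // placeholder for the real download
            // Android code would post this call to the main thread instead.
            listener.accept(token, result);
        });
    }

    // Stops accepting work and waits for the queued requests to finish.
    boolean shutdownAndWait() throws InterruptedException {
        worker.shutdown();
        return worker.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Unlike this sketch, the Android version must deliver the result on the main thread, because views may only be modified there; that is the job of mResponseHandler.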


MainActivity:

package com.example.testcsdn;

import java.util.ArrayList;

import android.app.Activity;
import android.graphics.Bitmap;
import android.os.AsyncTask;
import android.os.Bundle;
import android.os.Handler;
import android.util.Log;
import android.view.View;
import android.view.ViewGroup;
import android.widget.ArrayAdapter;
import android.widget.ImageView;
import android.widget.ListView;
import android.widget.TextView;

public class MainActivity extends Activity {
    private static final String TAG = "MainActivity";
    private ListView mListView;
    private ArrayList<Blog> mBlogs; // blog list
    // Link to fetch; the home page of the CSDN blog columns is used for testing.
    private String testUrl = "http://blog.csdn.net/column.html";
    private BlogsFetchr fetchr; // downloads the HTML page and parses it
    private MyAdapter adapter;
    private ThumbnailDownloader<ImageView> mThumbnailDownloader; // image download helper

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        fetchr = new BlogsFetchr();
        mBlogs = new ArrayList<Blog>();
        Log.i(TAG, "mBlogs.size: " + mBlogs.size());
        Blog blog = new Blog();
        blog.setBloggerId("hello");
        mBlogs.add(blog);
        update(testUrl);
        // Start the thread that responds to image download messages.
        mThumbnailDownloader = new ThumbnailDownloader<ImageView>(new Handler());
        mThumbnailDownloader.setListener(new ThumbnailDownloader.Listener<ImageView>() {
            @Override
            public void onThumbnailDownloaded(ImageView imageView, Bitmap thumbnail) {
                imageView.setImageBitmap(thumbnail); // update the UI
            }
        });
        mThumbnailDownloader.start();
        mThumbnailDownloader.getLooper(); // must be called after start()
    }

    private void update(final String testUrl) {
        new AsyncTask<Void, Void, Void>() {
            @Override
            protected Void doInBackground(Void... params) {
                mBlogs = fetchr.downloadBlogItems(testUrl); // download the blog list
                return null;
            }

            @Override
            protected void onPostExecute(Void result) {
                // update the ListView
                mListView = (ListView) findViewById(R.id.listview_blogcolumn);
                adapter = new MyAdapter(mBlogs);
                mListView.setAdapter(adapter);
            }
        }.execute();
    }

    private class MyAdapter extends ArrayAdapter<Blog> {
        public MyAdapter(ArrayList<Blog> blogs) {
            super(MainActivity.this, 0, blogs);
        }

        @Override
        public View getView(int position, View convertView, ViewGroup parent) {
            if (convertView == null) {
                convertView = getLayoutInflater().inflate(R.layout.listview_item, null);
            }
            ImageView imageView = (ImageView) convertView.findViewById(R.id.imageView);
            TextView textView = (TextView) convertView.findViewById(R.id.textView);
            textView.setText(getItem(position).getBloggerId());
            String imageUrl = getItem(position).getBloggerIconUrl();
            // Strip every character that is not a letter, digit, or underscore,
            // and set the result as the ImageView tag, which doubles as the cache key.
            String imageTag = imageUrl.replaceAll("[^\\w]", "");
            imageView.setTag(imageTag);
            Bitmap bitmap = null;
            if ((bitmap = mThumbnailDownloader.getCacheImage(imageTag)) != null) {
                // the image is already in the cache
                imageView.setImageBitmap(bitmap);
            } else {
                // queue a message to download the image
                mThumbnailDownloader.queueThumbnail(imageView, imageUrl);
            }
            return convertView;
        }
    }
}
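
In getView, the adapter derives a cache key from the image URL and checks the memory cache before queuing a download. The sketch below reproduces those two ideas in plain Java so they can be tried outside Android. It assumes a LinkedHashMap bounded by entry count as a stand-in for LruCache (which bounds by the value of sizeOf instead); ThumbnailCacheSketch, cacheKey, and newLruMap are illustrative names:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ThumbnailCacheSketch {
    // Same transformation as in getView: keep only letters, digits, and underscores.
    static String cacheKey(String imageUrl) {
        return imageUrl.replaceAll("[^\\w]", "");
    }

    // A least-recently-used map limited to maxEntries entries.
    static <V> Map<String, V> newLruMap(final int maxEntries) {
        // accessOrder = true moves an entry to the end of the order on every get().
        return new LinkedHashMap<String, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                // Called after each put(); returning true drops the eldest entry.
                return size() > maxEntries;
            }
        };
    }
}
```

With a limit of two entries, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "a" was used more recently. Note that stripping punctuation can map two different URLs to the same key, so this key scheme is best suited to URLs that differ in their letters and digits.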


