Java Implementation of a Web Version RSS Reader (iv): Customizing Your Own RSS Parsing Library, myrsslib4j



The previous post, "Web version RSS reader (iii): parsing online RSS subscriptions," mentioned this problem; here it is covered in detail.



When parsing a subscription in RSS format, the main problem encountered is the error "Server returned HTTP response code: 403 for URL: http://xxxxxx". A quick Baidu search shows that this is a common website access error: the server understands the client's request but refuses to process it. In other words, access denied! Further research shows that some servers (such as CSDN blogs) refuse access from Java clients, so an exception is thrown during parsing.



Access denied? Don't worry: for every policy there is a countermeasure. We can fool the server by setting the User-Agent (UA) on the request before accessing it.



connection.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)"); // Disguise the connection object with a browser UA
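To see the effect without touching a real site, here is a minimal sketch. It assumes a local stand-in server (built with the JDK's com.sun.net.httpserver.HttpServer) that rejects any User-Agent starting with "Java", which imitates the behavior described above; the class name UserAgentDemo, the /feed path, and the blocking rule are all made up for illustration and are not how CSDN actually filters requests.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class UserAgentDemo {
    // Same browser-style UA string used in this post.
    public static final String BROWSER_UA =
            "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)";

    /** Starts a local stand-in server that answers 403 to any UA beginning with "Java". */
    public static HttpServer startPickyServer() throws IOException {
        // Port 0 lets the OS pick a free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/feed", exchange -> {
            String ua = exchange.getRequestHeaders().getFirst("User-Agent");
            boolean blocked = ua == null || ua.startsWith("Java");
            exchange.sendResponseHeaders(blocked ? 403 : 200, -1); // -1 means no response body
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Requests the URL and returns the HTTP status; a null userAgent keeps the JDK default. */
    public static int fetchStatus(URL url, String userAgent) throws IOException {
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        if (userAgent != null) {
            // Must be set before the connection is actually opened.
            con.setRequestProperty("User-Agent", userAgent);
        }
        try {
            return con.getResponseCode();
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startPickyServer();
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/feed");
            System.out.println("default UA: " + fetchStatus(url, null));
            System.out.println("browser UA: " + fetchStatus(url, BROWSER_UA));
        } finally {
            server.stop(0);
        }
    }
}
```

By default the JDK sends a User-Agent of the form "Java/<version>", which is why the first request is refused by the stand-in server while the second one gets through.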



But after a lot of fiddling, I found that the only way to set the UA on the connection object was to modify rsslib4j.jar itself, so I had to find its source code. After a long search, I tracked down an open source project on Google Code, newrsslib4j, which is a modified version of rsslib4j. The project homepage is http://code.google.com/p/newrsslib4j/. I downloaded it happily, only to find that the 403 problem was still there. So I made up my mind to build my own rsslib: I checked out the newrsslib4j source and changed it by hand.



1. Fix the 403 Forbidden problem.



Modify the setXmlResource() method of the RSSParser class in the org.gnu.stealthp.rsslib package, adding the UA to the URLConnection object.


/**
 * Set rss resource by URL
 * @param ur the remote url
 * @throws RSSException
 */
public void setXmlResource(URL ur) throws RSSException {
    try {

        // Open the connection from the ur parameter
        URLConnection con = ur.openConnection();

        // -----------------------------
        // Added: 2013-08-14 21:00:17
        // Person: @龙轩 (Longxuan)
        // Blog: http://blog.csdn.net/xiaoxian8023
        // Change: since the server blocks Java clients from accessing the RSS feed, set the User-Agent
        con.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
        // -----------------------------

        con.setReadTimeout(10000);
        // Charset.guess() opens its own connection to the same URL (see the next change)
        String charset = Charset.guess(ur);
        is = new InputSource(new UnicodeReader(con.getInputStream(), charset));
        if (con.getContentLength() == -1 && is == null) {
            this.fixZeroLength();
        }
    } catch (IOException e) {
        throw new RSSException("RSSParser::setXmlResource fails: " + e.getMessage());
    }
}


Modify the guess() method of the Charset class in the org.mozilla.intl.chardet package: comment out the original InputStream creation, create a URLConnection, set the User-Agent on it, and create the InputStream from the URLConnection object:


// Judge from URL
public static String guess(URL url) throws IOException {

    // -----------------------------
    // Modified: 2013-08-14 21:00:17
    // Person: @龙轩 (Longxuan)
    // Blog: http://blog.csdn.net/xiaoxian8023
    // Change: comment out the InputStream, create a URLConnection, set the User-Agent,
    //         and create the InputStream from the URLConnection object

    // InputStream in = url.openStream();
    URLConnection con = url.openConnection();
    con.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
    InputStream in = con.getInputStream();
    // -----------------------------

    return guess(in);
}
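Both changes repeat the same UA string, because the feed URL is opened twice: once to guess the charset and once to parse. One way to keep the two in sync is a small shared helper. The sketch below is only an illustration: the FeedConnections class and its method names are hypothetical and are not part of rsslib4j or newrsslib4j.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

/** Hypothetical helper: one place that prepares connections with a browser-style UA. */
public final class FeedConnections {
    public static final String USER_AGENT =
            "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)";
    public static final int READ_TIMEOUT_MS = 10000;

    private FeedConnections() {
        // Utility class, not meant to be instantiated.
    }

    /** Creates a connection with the UA and read timeout set; nothing is sent yet. */
    public static URLConnection open(URL url) throws IOException {
        URLConnection con = url.openConnection();
        con.setRequestProperty("User-Agent", USER_AGENT);
        con.setReadTimeout(READ_TIMEOUT_MS);
        return con;
    }

    /** Convenience wrapper for callers that only need the response stream. */
    public static InputStream openStream(URL url) throws IOException {
        return open(url).getInputStream();
    }
}
```

With such a helper, setXmlResource() could call FeedConnections.open(ur) and guess() could call FeedConnections.openStream(url), so a future change to the UA string would only need to be made once.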

