Encoding Detection with cpdetector

Source: Internet
Author: User

Overview

When a browser opens a web page, its first task is to determine the page's character encoding and then parse the page using that encoding. Text editors face the same problem: they must determine a document's encoding before they can open it correctly. The technique involved is encoding detection. Below we introduce a useful Java library for it, cpdetector.

It can be downloaded from http://sourceforge.net/projects/cpdetector/.

Example

Without further explanation, here is the example code.

    package com.coder4j.main.cpdetector;

    import info.monitorenter.cpdetector.io.ASCIIDetector;
    import info.monitorenter.cpdetector.io.ByteOrderMarkDetector;
    import info.monitorenter.cpdetector.io.CodepageDetectorProxy;
    import info.monitorenter.cpdetector.io.JChardetFacade;
    import info.monitorenter.cpdetector.io.ParsingDetector;
    import info.monitorenter.cpdetector.io.UnicodeDetector;

    import java.net.MalformedURLException;
    import java.net.URL;

    /**
     * Requires the following jars on the classpath:<br>
     * cpdetector_1.0.10.jar, antlr-2.7.4.jar, chardet-1.0.jar
     *
     * @author Chinaxiang
     * @date 2015-10-11
     */
    public class UseCpdetector {

        /**
         * Get the encoding of the content behind a URL.
         *
         * @param url the URL to inspect
         * @return the name of the detected character set, or null if none was detected
         */
        public static String getUrlEncode(URL url) {
            /*
             * CodepageDetectorProxy is a proxy that delegates the detection task to
             * concrete detector implementations. cpdetector ships several commonly
             * used implementations, which are registered through the add method:
             * ParsingDetector, JChardetFacade, ASCIIDetector, UnicodeDetector.
             * The proxy returns the character set reported by the first detector
             * that yields a non-null result.
             * Three third-party jars are used: antlr.jar, chardet.jar and cpdetector.jar.
             * cpdetector is based on statistical principles, so its results are not
             * guaranteed to be correct.
             */
            CodepageDetectorProxy detector = CodepageDetectorProxy.getInstance();

            /*
             * ParsingDetector checks the encoding declared in HTML, XML and similar
             * files or character streams. The constructor argument controls whether
             * details of the detection process are printed; false disables them.
             */
            detector.add(new ParsingDetector(false));
            detector.add(new ByteOrderMarkDetector());

            /*
             * JChardetFacade wraps jchardet, provided by the Mozilla organization,
             * which can detect the encoding of most files. This detector alone is
             * usually enough for most projects; if you want more certainty, add
             * further detectors such as ASCIIDetector and UnicodeDetector below.
             * It relies on antlr.jar and chardet.jar.
             */
            detector.add(JChardetFacade.getInstance());
            // ASCIIDetector detects ASCII-encoded content
            detector.add(ASCIIDetector.getInstance());
            // UnicodeDetector detects encodings of the Unicode family
            detector.add(UnicodeDetector.getInstance());

            java.nio.charset.Charset charset = null;
            try {
                charset = detector.detectCodepage(url);
            } catch (Exception ex) {
                ex.printStackTrace();
            }
            if (charset != null) {
                return charset.name();
            }
            return null;
        }

        public static void main(String[] args) {
            try {
                URL url = new URL("http://www.baidu.com");
                String encode = getUrlEncode(url);
                System.out.println(encode); // UTF-8
            } catch (MalformedURLException e) {
                e.printStackTrace();
            }
        }
    }
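
The "first non-null result wins" rule described in the code comment can be illustrated with a small sketch that uses only the JDK. This is not cpdetector's implementation: the class FirstMatchDetector and the two detectors byBom and byAscii are hypothetical names invented for illustration, and the byte order mark check is simplified to UTF-8 and UTF-16 only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration of a detector chain; not part of cpdetector.
public class FirstMatchDetector {
    // Each detector returns a Charset, or null when it cannot decide.
    private final List<Function<byte[], Charset>> detectors = new ArrayList<>();

    public void add(Function<byte[], Charset> detector) {
        detectors.add(detector);
    }

    // Ask the detectors in the order they were added; the first non-null answer wins.
    public Charset detect(byte[] data) {
        for (Function<byte[], Charset> detector : detectors) {
            Charset result = detector.apply(data);
            if (result != null) {
                return result;
            }
        }
        return null;
    }

    // Simplified byte order mark check (UTF-8 and UTF-16 only).
    public static Charset byBom(byte[] b) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF
                && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        return null;
    }

    // Reports US-ASCII when every byte is in the 7-bit range, otherwise gives up.
    public static Charset byAscii(byte[] b) {
        for (byte value : b) {
            if (value < 0) {
                return null;
            }
        }
        return StandardCharsets.US_ASCII;
    }

    public static void main(String[] args) {
        FirstMatchDetector detector = new FirstMatchDetector();
        detector.add(FirstMatchDetector::byBom);
        detector.add(FirstMatchDetector::byAscii);

        byte[] plain = "hello".getBytes(StandardCharsets.US_ASCII);
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(detector.detect(plain));
        System.out.println(detector.detect(withBom));
    }
}
```

Because the chain stops at the first answer, the order in which detectors are added matters: in this sketch the byte order mark check runs before the ASCII check, so data starting with a UTF-8 mark is reported as UTF-8 even though the bytes after the mark are plain ASCII.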

A file path can also be converted to a URL, so the same method can be used to detect the encoding of a local file.
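
A minimal sketch of that conversion, using only the JDK, might look like the following. The class name FilePathToUrl, the helper toUrl, and the file name example.txt are assumptions made for illustration; they do not come from cpdetector or from the example above.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper: turns a local file path into a file: URL.
public class FilePathToUrl {

    public static URL toUrl(String path) {
        try {
            // toURI() resolves the path against the working directory and
            // escapes characters that are not legal in a URL.
            return new File(path).toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Cannot convert path to URL: " + path, e);
        }
    }

    public static void main(String[] args) {
        URL url = toUrl("example.txt"); // hypothetical file name
        System.out.println(url);
        // The resulting URL can then be passed to the URL-based detection
        // method shown in the example above.
    }
}
```

Going through File.toURI() rather than building the string by hand avoids problems with spaces and other special characters in the path.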
