Detecting file encoding in Java (repost)

Source: Internet
Author: User
Tags: intl

Detecting file encoding in Java

I have recently been working on a document translation project in which the encoding of the input documents is unknown, which was a real headache. After trying many approaches, I finally found that the jchardet tool solves this problem easily. I am writing this note as a reminder to myself and to help anyone else who needs it.

package com.uujava.mbfy.test;

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

import org.mozilla.intl.chardet.nsDetector;
import org.mozilla.intl.chardet.nsICharsetDetectionObserver;

/**********************************************
 * Maven
 * <!-- for file encoding check -->
 * <dependency>
 *   <groupId>net.sourceforge.jchardet</groupId>
 *   <artifactId>jchardet</artifactId>
 *   <version>1.0</version>
 * </dependency>
 **********************************************/

/**
 * Uses jchardet to get a file's character set. jchardet is the Java port of
 * Mozilla's automatic character set detection algorithm; its official
 * homepage is: http://jchardet.sourceforge.net/
 */
public class FileCharsetDetector {
    private boolean found = false;

    /**
     * If a character set detection algorithm matches fully, this field holds
     * the name of that character set. Otherwise (for example, for a binary
     * file) it keeps its default value null, and the probable charsets are
     * consulted instead.
     */
    private String encoding = null;

    public static void main(String[] argv) throws Exception {
        System.out.println("File encoding: "
                + new FileCharsetDetector().guestFileEncoding(
                        "/home/k/documents/test/azmind_7_xh/azmind_7_xh/routing Management.txt"));
    }

    /**
     * Pass in a File object and check the file's encoding.
     *
     * @param file File object instance
     * @return the file encoding, or null if none was found
     * @throws FileNotFoundException
     * @throws IOException
     */
    public String guestFileEncoding(File file) throws FileNotFoundException, IOException {
        return guestFileEncoding(file, new nsDetector());
    }

    /**
     * Get the encoding of the file.
     *
     * @param file File object instance
     * @param languageHint language hint code, e.g. 1: Japanese; 2: Chinese;
     *            3: Simplified Chinese; 4: Traditional Chinese; 5: Korean;
     *            6: don't know (default)
     * @return the file encoding, e.g. UTF-8, GBK, GB2312, or null if none was found
     * @throws FileNotFoundException
     * @throws IOException
     */
    public String guestFileEncoding(File file, int languageHint)
            throws FileNotFoundException, IOException {
        return guestFileEncoding(file, new nsDetector(languageHint));
    }

    /**
     * Get the encoding of the file.
     *
     * @param path file path
     * @return the file encoding, e.g. UTF-8, GBK, GB2312, or null if none was found
     * @throws FileNotFoundException
     * @throws IOException
     */
    public String guestFileEncoding(String path) throws FileNotFoundException, IOException {
        return guestFileEncoding(new File(path));
    }

    /**
     * Get the encoding of the file.
     *
     * @param path file path
     * @param languageHint language hint code, e.g. 1: Japanese; 2: Chinese;
     *            3: Simplified Chinese; 4: Traditional Chinese; 5: Korean;
     *            6: don't know (default)
     * @return the file encoding, or null if none was found
     * @throws FileNotFoundException
     * @throws IOException
     */
    public String guestFileEncoding(String path, int languageHint)
            throws FileNotFoundException, IOException {
        return guestFileEncoding(new File(path), languageHint);
    }

    /**
     * Get the encoding of the file.
     *
     * @param file File object instance
     * @param det detector to run over the file's bytes
     * @return the file encoding, or null if none was found
     * @throws FileNotFoundException
     * @throws IOException
     */
    private String guestFileEncoding(File file, nsDetector det)
            throws FileNotFoundException, IOException {
        // Reset state so one instance can be reused for several files.
        found = false;
        encoding = null;

        // Set an observer...
        // Notify() will be called if a matching charset is found.
        det.Init(new nsICharsetDetectionObserver() {
            public void Notify(String charset) {
                found = true;
                encoding = charset;
            }
        });

        // try-with-resources closes the stream even if reading fails.
        try (BufferedInputStream imp = new BufferedInputStream(new FileInputStream(file))) {
            byte[] buf = new byte[1024];
            int len;
            boolean done = false;
            boolean isAscii = true;

            while ((len = imp.read(buf, 0, buf.length)) != -1) {
                // Check whether the stream is only ASCII so far.
                if (isAscii) {
                    isAscii = det.isAscii(buf, len);
                }
                // Run DoIt if the data is non-ASCII and detection is not done yet.
                if (!isAscii && !done) {
                    done = det.DoIt(buf, len, false);
                }
            }
            det.DataEnd();

            if (isAscii) {
                encoding = "ASCII";
                found = true;
            }
        }

        if (!found) {
            String[] prob = det.getProbableCharsets();
            if (prob.length > 0) {
                // Nothing matched definitively, so take the first probable encoding.
                encoding = prob[0];
            } else {
                return null;
            }
        }
        return encoding;
    }
}
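
The detector above only returns a charset name (or null); the file still has to be decoded with it. The following is a minimal sketch of that follow-up step using only the JDK, not jchardet. The class DetectedCharsetDecoder and its methods resolve and decode are hypothetical names introduced here for illustration, and falling back to UTF-8 is an assumption, not something the original post prescribes.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of jchardet): turns a detected name into text. */
final class DetectedCharsetDecoder {

    /**
     * Maps a detector result, which may be null or a name this JVM does not
     * support, to a usable Charset, returning the fallback when needed.
     */
    static Charset resolve(String detectedName, Charset fallback) {
        if (detectedName == null) {
            return fallback;
        }
        String name = detectedName.trim();
        if (name.isEmpty()) {
            return fallback;
        }
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : fallback;
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
            return fallback;
        }
    }

    /** Decodes raw bytes with the detected charset; UTF-8 fallback is an assumption. */
    static String decode(byte[] bytes, String detectedName) {
        return new String(bytes, resolve(detectedName, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decode(latin1, "ISO-8859-1"));
        System.out.println(resolve(null, StandardCharsets.UTF_8));
        System.out.println(resolve("no-such-charset", StandardCharsets.UTF_8));
    }
}
```

In practice the string returned by guestFileEncoding would be passed as detectedName. Checking Charset.isSupported first avoids an exception when the detector reports a name the running JVM does not provide.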
http://www.cnblogs.com/mxcy/p/4008342.html
