Original: http://www.open-open.com/code/view/1420514359234
I often run into garbled text when users upload text data files, and I cannot restrict the encoding of the files they upload (that would probably ask too much of the customers), so I had to find a workaround. I went looking for Java code that detects a file's encoding.
What I found either detected the encoding incorrectly, or was just a small snippet that did not say which libraries it depended on ... so I am sharing my own here. The utility class has a single method. I will not write out a main method for testing.
I can't seem to upload attachments here ... so get the jars from my resources.
The code references classes from these two jar files:
chardet.jar
cpdetector_1.0.10.jar
package com.dxx.buscredit.common.util;

import info.monitorenter.cpdetector.io.ASCIIDetector;
import info.monitorenter.cpdetector.io.CodepageDetectorProxy;
import info.monitorenter.cpdetector.io.JChardetFacade;
import info.monitorenter.cpdetector.io.ParsingDetector;
import info.monitorenter.cpdetector.io.UnicodeDetector;

import java.io.File;
import java.nio.charset.Charset;

public class FileCharsetDetector {

    /**
     * Uses the third-party open source package cpdetector to get a file's encoding.
     *
     * @param file the file to inspect
     * @return the name of the detected charset, or "GBK" if detection fails
     */
    public static String getFileEncode(File file) {
        /*
         * 1. cpdetector ships several commonly used detector implementations.
         *    Instances are registered through the add method, for example
         *    ParsingDetector, JChardetFacade, ASCIIDetector, UnicodeDetector.
         * 2. The proxy follows the rule "the first detector to return a
         *    non-null result wins".
         * 3. cpdetector is based on statistics, so the result is not
         *    guaranteed to be correct.
         */
        CodepageDetectorProxy detector = CodepageDetectorProxy.getInstance();
        detector.add(new ParsingDetector(false));
        detector.add(UnicodeDetector.getInstance());
        detector.add(JChardetFacade.getInstance()); // uses classes from chardet.jar internally
        detector.add(ASCIIDetector.getInstance());

        Charset charset = null;
        try {
            charset = detector.detectCodepage(file.toURI().toURL());
        } catch (Exception e) {
            e.printStackTrace();
        }

        // Default to GBK when nothing was detected
        String charsetName = "GBK";
        if (charset != null) {
            if (charset.name().equals("US-ASCII")) {
                // Plain ASCII is reported as ISO_8859_1
                charsetName = "ISO_8859_1";
            } else {
                charsetName = charset.name();
            }
        }
        return charsetName;
    }
}
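Once the utility method has returned a charset name, the remaining step is to decode the uploaded file with it. The following is only a minimal sketch of that step, using nothing but the JDK. The class name UploadedTextReader and both method names are made up for illustration and are not part of the original utility class or of cpdetector.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original post or of cpdetector.
public class UploadedTextReader {

    /**
     * Decodes the whole file using the given charset name, for example the
     * name returned by the detection method. Charset.forName throws an
     * IllegalArgumentException subtype if the JVM does not know the name.
     */
    public static String readWithCharset(Path path, String charsetName) throws IOException {
        Charset charset = Charset.forName(charsetName);
        return new String(Files.readAllBytes(path), charset);
    }

    /**
     * Rewrites the file as UTF-8 so that later processing no longer
     * depends on the encoding the user happened to upload.
     */
    public static void convertToUtf8(Path source, String charsetName, Path target) throws IOException {
        String text = readWithCharset(source, charsetName);
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }
}
```

A call might look like UploadedTextReader.readWithCharset(file.toPath(), FileCharsetDetector.getFileEncode(file)). Since the detection is statistical, a wrong guess will still produce garbled text here, so it may be worth checking the decoded result before storing it.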
Java automatically recognizes user-uploaded text file encodings