This program recovers the correct original text from garbled text (mojibake). It relies on the observation that decoding bytes with the wrong encoding usually pads the text with extra characters, so among all candidate decodings the correct text is the shortest one.
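To see the principle in isolation, here is a minimal sketch. The class name `MojibakeLengthDemo` and the helpers `garble` and `recover` are illustrative names, not part of the original program, and the sketch assumes the JVM provides the GBK charset, as standard JDK builds do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeLengthDemo {
    // Assumption: the running JVM ships the GBK charset.
    static final Charset GBK = Charset.forName("GBK");

    // Simulate the mistake: UTF-8 bytes are decoded as GBK.
    static String garble(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), GBK);
    }

    // Undo the mistake: re-encode as GBK, then decode as UTF-8.
    static String recover(String garbled) {
        return new String(garbled.getBytes(GBK), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u767E\u5EA6"; // "百度": 2 characters, 6 bytes in UTF-8
        String garbled = garble(original); // the 6 bytes are read as 3 two-byte GBK characters
        System.out.println(original.length() + " -> " + garbled.length()); // prints "2 -> 3"
        System.out.println(recover(garbled).equals(original)); // prints "true"
    }
}
```

The garbled form is longer than the original, which is why the shortest candidate is taken as the correct one.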
package com.hongyuan.test;
import java.io.UnsupportedEncodingException;
/*
 * This program recovers the correct original text from garbled text. It relies on the principle that
 * decoding with the wrong encoding usually pads the text with extra characters,
 * so the correct text should be the shortest one.
 *
 * If the program does not produce the correct text when you test it, the possible causes are:
 * 1. The program can only recover text that was wrongly decoded once; it cannot recover text that went through several wrong encodings.
 * 2. A wrong encoding sometimes turns characters into invisible ones, so you may not have copied all of the garbled text, which drops bytes. In that case the text cannot be recovered.
 * 3. The original text uses a relatively large character set and the wrong encoding uses a smaller one. Characters outside the smaller set are lost, so the correct text cannot be parsed back out.
 * 4. Congratulations, you hit the jackpot: some characters are identical in every encoding, or the wrong encoding adds no extra characters, and then nothing can be done. (This is really rare.)
 *
 * Note: The garbled text in this program was obtained by viewing the Baidu home page (UTF-8) as GBK, which obviously garbles it. Anyone interested can test with other garbled text. Questions are welcome in the replies.
 */
public class CharsetTest {
    public static final String[] CHARSET_NAMES = new String[]{"ISO-8859-1", "GBK", "UTF-8"};

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Garbled string
        String str = "Atlas 蒋鐧惧 harm 鍏 帹 Windows XP 鑱 斿 悎 闃 fork 姢 nowshera e 喅 file";
        int strLength = Integer.MAX_VALUE; // Shortest character length found so far
        String newStr = ""; // String recovered from the garbled string
        String srcCharset = ""; // Encoding the garbled string is currently in
        String targetCharset = ""; // Correct encoding of the garbled string
        // Try every combination of encodings and keep the one that yields the shortest string
        for (int i = 0; i < CHARSET_NAMES.length; i++) {
            for (int j = 0; j < CHARSET_NAMES.length; j++) {
                // Re-encode with candidate i, then decode with candidate j
                String temp = new String(str.getBytes(CHARSET_NAMES[i]), CHARSET_NAMES[j]);
                System.out.println(temp);
                if (temp.length() <= strLength) {
                    strLength = temp.length();
                    newStr = temp;
                    srcCharset = CHARSET_NAMES[i];
                    targetCharset = CHARSET_NAMES[j];
                }
            }
        }
        // Print the encoding pair that was found and the recovered text
        System.out.println(srcCharset + "-->" + targetCharset + ":" + newStr);
    }
}
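Cause 3 in the comment above can be reproduced directly. The following is a minimal sketch under assumed names (`LossyRoundTripDemo` and `reinterpret` are illustrative, not part of the original program): once text passes through a charset that cannot represent its characters, the bytes are replaced and no pair of encodings can bring them back.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LossyRoundTripDemo {
    // The same step the program tries for each pair:
    // encode the text with one charset, decode the bytes with another.
    static String reinterpret(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        String original = "\u767E\u5EA6"; // "百度"
        // ISO-8859-1 cannot represent these characters, so getBytes
        // substitutes the replacement byte '?' for each of them.
        String lost = reinterpret(original, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(lost); // prints "??"
    }
}
```

Here the result is as short as the original, so the length comparison alone cannot tell a lossy candidate from a correct one; a stricter variant could also reject candidates that contain replacement characters.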