Refer to this article: http://blog.csdn.net/maxracer/article/details/6075057
Test code:
import java.io.UnsupportedEncodingException;
import org.junit.Test;

@Test
public void testBytes() {
    // Bytes per character:
    //   Chinese character: ISO-8859-1: 1, GBK: 2, UTF-8: 3
    //   digit or letter:   ISO-8859-1: 1, GBK: 1, UTF-8: 1
    String username = "中";
    try {
        // String -> byte array in the specified encoding
        byte[] u_iso = username.getBytes("iso8859-1");
        byte[] u_gbk = username.getBytes("GBK");
        byte[] u_utf8 = username.getBytes("utf-8");
        System.out.println(u_iso.length);
        System.out.println(u_gbk.length);
        System.out.println(u_utf8.length);
        // Byte array -> String, the exact inverse of the step above
        String un_iso = new String(u_iso, "iso8859-1");
        String un_gbk = new String(u_gbk, "GBK");
        String un_utf8 = new String(u_utf8, "utf-8");
        System.out.println(un_iso);
        System.out.println(un_gbk);
        System.out.println(un_utf8);
        // Sometimes the string must be ISO-8859-1 encoded; handle it as follows
        String un_utf8_iso = new String(u_utf8, "iso8859-1");
        System.out.println(un_utf8_iso);
        // Restore the ISO-8859-1 string to the original
        String un_iso_utf8 = new String(un_utf8_iso.getBytes("iso8859-1"), "UTF-8");
        System.out.println(un_iso_utf8);
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }
}
Test results:
1
2
3
?
中
中
ĸ
中
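To see where the lengths 1, 2 and 3 come from, it can help to print the encoded bytes themselves. The following is only a sketch: the class name `EncodedBytesDemo` and its helper methods are made up for illustration, and the GBK line is guarded because GBK is not one of the charsets every Java runtime is required to provide.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original test.
public class EncodedBytesDemo {
    // Formats a byte array as space-separated uppercase hex, e.g. "E4 B8 AD".
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF so negative bytes print as two hex digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes text in the given charset and returns the bytes as hex.
    public static String encodedHex(String text, Charset charset) {
        return hex(text.getBytes(charset));
    }

    public static void main(String[] args) {
        // The character 中, written as an escape so the source file encoding does not matter.
        String zhong = "\u4e2d";
        System.out.println(encodedHex(zhong, StandardCharsets.ISO_8859_1)); // 3F (the '?' replacement byte)
        System.out.println(encodedHex(zhong, StandardCharsets.UTF_8));      // E4 B8 AD
        if (Charset.isSupported("GBK")) {
            System.out.println(encodedHex(zhong, Charset.forName("GBK"))); // D6 D0
        }
    }
}
```

The single ISO-8859-1 byte is 0x3F, the code for "?", which is why the fourth line of the test output is a question mark.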
Excerpt from the referenced article:
Cause of the garbling: why can't the character "中" be restored after it is encoded with iso8859-1 and then reassembled? The reason is simple: the iso8859-1 code table contains no Chinese characters, so "中".getBytes("iso8859-1") cannot produce a correct encoded value for "中". Java substitutes the byte for "?" instead, and restoring the original through new String() is then impossible.
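One way to detect this situation before any data is lost is to ask the charset's encoder whether it can represent the text. This is a minimal sketch; the class name `EncodableCheck` and the method `canEncode` are illustrative names, while `Charset.newEncoder()` and `CharsetEncoder.canEncode(CharSequence)` are standard JDK calls.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration.
public class EncodableCheck {
    // Returns true if every character of text can be represented in charset.
    public static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String zhong = "\u4e2d"; // the character 中
        System.out.println(canEncode(zhong, StandardCharsets.ISO_8859_1)); // false
        System.out.println(canEncode(zhong, StandardCharsets.UTF_8));      // true

        // The lossy path: the unmappable character becomes a single '?' byte (63).
        byte[] lossy = zhong.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(lossy.length + " " + lossy[0]); // 1 63
        System.out.println(new String(lossy, StandardCharsets.ISO_8859_1)); // ?
    }
}
```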
Sometimes Chinese characters have to satisfy a special requirement (for example, HTTP headers require their content to be iso8859-1 encoded). In that case the characters can be carried byte by byte, for example:
String s_iso88591 = new String("中".getBytes("UTF-8"), "iso8859-1");
The resulting s_iso88591 string is actually three iso8859-1 characters, one per UTF-8 byte. After these characters are passed to the destination, the destination program reverses the process with
String s_utf8 = new String(s_iso88591.getBytes("iso8859-1"), "UTF-8");
to get the correct Chinese character "中". This both complies with the protocol and supports Chinese.
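The round trip described above can be wrapped in a pair of helpers. This is a sketch under the assumption that both sides agree on UTF-8 as the real encoding; the class name `Latin1Transport` and both method names are invented for illustration. It uses the `StandardCharsets` constants, so no `UnsupportedEncodingException` has to be caught.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration.
public class Latin1Transport {
    // Sender side: reinterpret the UTF-8 bytes of text as ISO-8859-1 characters,
    // one character per byte.
    public static String toLatin1Carrier(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Receiver side: recover the original bytes and decode them as UTF-8.
    public static String fromLatin1Carrier(String carrier) {
        return new String(carrier.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d"; // the character 中
        String carrier = toLatin1Carrier(original);
        System.out.println(carrier.length());                           // 3
        System.out.println(fromLatin1Carrier(carrier).equals(original)); // true
    }
}
```

The trick works because ISO-8859-1 maps each of the 256 byte values to exactly one character, so decoding and re-encoding with it never changes the underlying bytes.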