Encoding and decoding
I use Eclipse as my development environment, where a newly created Java project defaults to GBK encoding.
The code is as follows:
    import java.io.UnsupportedEncodingException;

    public class Demo1 {
        public static void main(String[] args) throws UnsupportedEncodingException {
            String s = "I love ABC";

            // getBytes(String charsetName) encodes the String into a byte sequence
            // using the named charset and returns a byte[].
            // getBytes() with no argument uses the platform's default charset;
            // getBytes(Charset charset) takes a Charset object instead of a name.
            byte[] bytes1 = s.getBytes("gbk");
            for (byte b : bytes1) {
                // b & 0xff keeps only the low 8 bits, so negative bytes are not sign-extended.
                // Integer.toHexString(int i) returns i as a hexadecimal (base 16) string.
                System.out.print(Integer.toHexString(b & 0xff) + " ");
            }
            // In GBK, a Chinese character occupies two bytes and an English letter one byte.
            System.out.println();

            byte[] bytes2 = s.getBytes("UTF-8");
            for (byte b : bytes2) {
                System.out.print(Integer.toHexString(b & 0xff) + " ");
            }
            // In UTF-8, a Chinese character occupies three bytes and an English letter one byte.
            System.out.println();

            // Java itself uses a double-byte encoding (UTF-16BE):
            // both Chinese and English characters occupy two bytes.
            byte[] bytes3 = s.getBytes("utf-16be");
            for (byte b : bytes3) {
                System.out.print(Integer.toHexString(b & 0xff) + " ");
            }
            System.out.println();

            /*
             * To turn a byte sequence back into a String, you must decode it with
             * the same encoding that produced it. Otherwise the result is garbled.
             */
            // Decodes with the project's default encoding (GBK here). bytes3 was
            // encoded as UTF-16BE above, so the result is garbled.
            String str1 = new String(bytes3);
            System.out.println(str1);
            System.out.println();

            // Decodes with the matching encoding, so the original text is restored.
            String str2 = new String(bytes3, "utf-16be");
            System.out.println(str2);

            /*
             * A text file is just a byte sequence, and it can be a byte sequence
             * in any encoding. A text file created directly on a Chinese-language
             * machine is recognized only as ANSI encoding.
             */
        }
    }
Printed result:
In short, the encoding used to decode a byte sequence must match the encoding that produced it; otherwise the text comes out garbled.