In Java, the number of bytes a Chinese character occupies depends on the character encoding. With ISO-8859-1, a Chinese character yields 1 byte, the same as an English character, but only because ISO-8859-1 cannot represent Chinese characters and getBytes substitutes a single '?' byte for each one. With GB2312 or GBK, a Chinese character occupies 2 bytes. With UTF-8, a Chinese character occupies 3 bytes (rare characters outside the Basic Multilingual Plane take 4). The getBytes(String charsetName) method of the String class returns the string encoded in the specified encoding as a byte array, and the length of that array is the number of bytes the string occupies in that encoding.
"Test Sample"
import java.io.UnsupportedEncodingException;

public class Test {
    public static void main(String[] args) throws UnsupportedEncodingException {
        // Result: 2 (each character is unmappable and becomes one '?' byte)
        System.out.println("测试".getBytes("ISO-8859-1").length);
        // Result: 4
        System.out.println("测试".getBytes("GB2312").length);
        // Result: 4
        System.out.println("测试".getBytes("GBK").length);
        // Result: 6
        System.out.println("测试".getBytes("UTF-8").length);
    }
}
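The ISO-8859-1 result of 2 in the sample does not mean the two characters were stored in one byte each. ISO-8859-1 has no Chinese characters, so the encoder substitutes a '?' byte for each one and the original text cannot be recovered. The following is a minimal sketch of that behavior; the class name Latin1RoundTrip and the helper roundTrip are hypothetical names chosen for illustration, and the test string is written with Unicode escapes so the source file's own encoding does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTrip {
    // Hypothetical helper: encode s with cs, then decode the bytes with the same charset.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // Two Chinese characters (the word for "test"), given as Unicode escapes.
        String s = "\u6d4b\u8bd5";

        // ISO-8859-1 cannot map either character, so each becomes the byte 63 ('?').
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.ISO_8859_1))); // [63, 63]

        // The round trip through ISO-8859-1 loses the text; UTF-8 preserves it.
        System.out.println(s.equals(roundTrip(s, StandardCharsets.ISO_8859_1))); // false
        System.out.println(s.equals(roundTrip(s, StandardCharsets.UTF_8)));      // true
    }
}
```

In other words, the byte count under ISO-8859-1 is only meaningful for text that the encoding can actually represent.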
Note: the no-argument getBytes() method of the String class encodes with the default charset of the platform the program runs on, so its result can differ from one platform to another. Prefer getBytes(String charsetName) with an explicitly specified encoding.
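As a possible variation on the recommendation above, String also has a getBytes(Charset) overload, which can be used with the constants in java.nio.charset.StandardCharsets and does not throw the checked UnsupportedEncodingException. The sketch below assumes that approach; the class name ByteLengthDemo and the helper byteLength are hypothetical. GBK is not one of the StandardCharsets constants, so it is looked up by name and guarded in case the runtime does not ship it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteLengthDemo {
    // Hypothetical helper: number of bytes s occupies when encoded with cs.
    static int byteLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        // Two Chinese characters (the word for "test"), given as Unicode escapes.
        String s = "\u6d4b\u8bd5";

        // Explicit charset: the result is the same on every platform.
        System.out.println(byteLength(s, StandardCharsets.UTF_8)); // 6

        // GBK must be looked up by name; skip it if this runtime lacks it.
        if (Charset.isSupported("GBK")) {
            System.out.println(byteLength(s, Charset.forName("GBK"))); // 4
        }

        // Platform default: this is what the no-argument getBytes() uses,
        // so the number printed here depends on where the program runs.
        Charset def = Charset.defaultCharset();
        System.out.println(def + ": " + byteLength(s, def));
    }
}
```

Printing Charset.defaultCharset() alongside the count makes it easy to see why the no-argument call gives different answers on different machines.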
The number of bytes occupied by Chinese characters in Java