Solving garbled-text problems with the Java String class: getBytes(String charsetName) and String(byte[] bytes, String charsetName)

Source: Internet
Author: User

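How does Java store the data of a String? The source code shows that the characters are held in a member variable, historically char[] value (since JDK 9 the field is a byte[] with a coder flag, but the API still exposes a sequence of char values). A char is 2 bytes in Java.
It is often said that Java uses UCS-2, the fixed 2-byte form of Unicode. That was true of early versions, but since Java 5 a String is a sequence of UTF-16 code units: every character in the Basic Multilingual Plane, which includes the common Chinese characters, occupies one char, and a supplementary character occupies two (a surrogate pair). For BMP characters, then, what a String stores is simply the character's Unicode number.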

For example, the Unicode code point of '你' is U+4F60, as the test code below shows.

char c = '你';
System.out.println(Integer.toHexString(c)); // code point in hexadecimal
System.out.println(Integer.valueOf(c));     // code point in decimal
System.out.println(c);                      // the character itself

The result is:
4f60
20320
你
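
As a small extension of the test above, the following sketch counts char values versus code points. It assumes only the standard String API; the class name CodeUnitsDemo and the choice of U+1F600 as a sample supplementary character are illustrative, not from the original article.

```java
class CodeUnitsDemo {
    public static void main(String[] args) {
        String bmp = "\u4f60";                 // 你, a BMP character
        String supplementary = "\uD83D\uDE00"; // U+1F600, written as a surrogate pair

        // length() counts char values (UTF-16 code units), not characters.
        System.out.println(bmp.length());                                      // 1
        System.out.println(supplementary.length());                            // 2

        // codePointCount() counts Unicode code points.
        System.out.println(supplementary.codePointCount(0, supplementary.length())); // 1
        System.out.println(Integer.toHexString(supplementary.codePointAt(0)));  // 1f600
    }
}
```

For the BMP characters used in the rest of this article, one char and one code point are the same thing.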

So a String holds Unicode code units directly, not bytes in any external encoding such as UTF-8 or GBK. With that in mind, look at the two methods:

getBytes(charsetName)
It returns a byte array produced according to the named encoding.
What does that mean?
It converts the in-memory Unicode characters into the byte sequence that represents them in the charsetName encoding.
For example, '你' takes three bytes in UTF-8, so the resulting byte array has three bytes,
i.e. [E4 BD A0]
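
The encoding step can be checked with a minimal sketch like the one below. The class GetBytesDemo and the helper toHex are hypothetical names introduced for illustration. The sketch uses the getBytes(Charset) overload with StandardCharsets constants; the getBytes(String charsetName) form discussed here produces the same bytes for a supported name but throws the checked UnsupportedEncodingException for an unknown one.

```java
import java.nio.charset.StandardCharsets;

class GetBytesDemo {
    // Hypothetical helper: formats a byte array as space-separated lowercase hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u4f60"; // 你, written as an escape so the source file encoding does not matter

        // The same character yields different bytes in different encodings.
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_8)));    // e4 bd a0
        System.out.println(toHex(s.getBytes(StandardCharsets.UTF_16BE))); // 4f 60
    }
}
```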

And String(bytes, charsetName)?

It interprets the byte array bytes according to charsetName and assembles the decoded characters into a String.
For example, the byte array [E4 BD A0] above, interpreted as UTF-8, yields the string "你"; interpreted with a different encoding such as ISO-8859-1, it does not come out as "你".
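
A minimal sketch of the decoding step, assuming only the standard library (the class name DecodeDemo and the method decodedLength are illustrative). It decodes the same three bytes twice, once with the right charset and once with the wrong one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // The UTF-8 bytes of 你 from the example above.
    static final byte[] BYTES = {(byte) 0xE4, (byte) 0xBD, (byte) 0xA0};

    // Hypothetical helper: how many char values result from decoding BYTES with the given charset.
    static int decodedLength(Charset charset) {
        return new String(BYTES, charset).length();
    }

    public static void main(String[] args) {
        String asUtf8 = new String(BYTES, StandardCharsets.UTF_8);
        String asLatin1 = new String(BYTES, StandardCharsets.ISO_8859_1);

        System.out.println(asUtf8.equals("\u4f60"));                 // true: one character, 你
        System.out.println(decodedLength(StandardCharsets.UTF_8));   // 1

        // ISO-8859-1 maps each byte to the char with the same value: U+00E4, U+00BD, U+00A0.
        System.out.println(asLatin1.equals("\u00e4\u00bd\u00a0"));   // true
        System.out.println(decodedLength(StandardCharsets.ISO_8859_1)); // 3
    }
}
```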

One more thing: why is it so often necessary to re-encode request parameters in a servlet like this:

String str = new String(param.getBytes("ISO-8859-1"), "UTF-8");

This is easy to understand. The browser sends the parameter as UTF-8 encoded bytes, but the web container assumes by default that the bytes are ISO-8859-1 and decodes them into a String accordingly, which is equivalent to doing the following:

String s = new String(utf8Bytes, "ISO-8859-1");

Note that ISO-8859-1 is a single-byte encoding: each byte becomes exactly one char with the same numeric value. Fortunately, that makes the wrong decoding lossless, so calling getBytes("ISO-8859-1") on the garbled string gives back the original byte array, which can then be decoded correctly as UTF-8. This is the re-encoding idiom we use most often in everyday code.
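
The whole round trip can be simulated without a servlet container. In the sketch below, misdecode stands in for what the container is described as doing, and repair is the idiom from the article; both method names and the class name ServletParamRepair are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class ServletParamRepair {
    // Simulates the container: UTF-8 bytes from the browser are decoded as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // The repair idiom: recover the original bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sent = "\u4f60\u597d";    // 你好, six bytes in UTF-8
        String param = misdecode(sent);  // what the servlet would see

        System.out.println(param.length());             // 6: one char per byte
        System.out.println(param.equals(sent));         // false: garbled
        System.out.println(repair(param).equals(sent)); // true: restored
    }
}
```

The repair only works because every char in the garbled string is at most U+00FF. Applying it to a string that was decoded correctly would replace characters outside ISO-8859-1 with '?'.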

Finally, to repeat: encoding seems confusing mainly because of one misunderstanding. Keep the following in mind:

Java stores strings internally as Unicode.
We often hear someone say, "I need to convert a String from ISO-8859-1 to GBK." What is really going on? We are not converting "an ISO-8859-1 encoded String" into "a GBK encoded String". As stated repeatedly, a String in Java is Unicode, so there is no such thing as an "ISO-8859-1 encoded String" or a "GBK encoded String". The only reason to convert is that the String was decoded with the wrong encoding in the first place, which is why we so often need to go from ISO-8859-1 to GBK, UTF-8 and so on. The conversion process is: String -> byte[] -> String
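
To illustrate that an encoding belongs to the bytes and not to the String, the sketch below encodes one String two ways and decodes each result with its matching charset. The class OneStringManyEncodings and the method decodesToSameString are hypothetical names; UTF-16BE is chosen only because it is guaranteed to be available on every Java platform, and GBK would behave the same way where it is installed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class OneStringManyEncodings {
    // True if encoding s with either charset and decoding with the same charset gives s back.
    static boolean decodesToSameString(String s, Charset a, Charset b) {
        String fromA = new String(s.getBytes(a), a);
        String fromB = new String(s.getBytes(b), b);
        return fromA.equals(s) && fromB.equals(s);
    }

    public static void main(String[] args) {
        String s = "\u4f60"; // 你
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = s.getBytes(StandardCharsets.UTF_16BE);

        // The byte arrays differ...
        System.out.println(Arrays.equals(utf8, utf16)); // false
        // ...but decoding each with its own charset yields the same String.
        System.out.println(decodesToSameString(s, StandardCharsets.UTF_8, StandardCharsets.UTF_16BE)); // true
    }
}
```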
