Java Web: string transcoding with String.getBytes() and new String() (reprinted)

Source: Internet
Author: User

Reprinted from: http://zhuhuide2004.iteye.com/blog/562739. When reprinting, please cite the original author's address.

In Java, the String.getBytes(String decode) method returns the byte-array representation of a string in the encoding named by decode, for example:

    

byte[] b_gbk = "中".getBytes("GBK");
byte[] b_utf8 = "中".getBytes("UTF-8");
byte[] b_iso88591 = "中".getBytes("ISO8859-1");

These calls return the byte arrays for the Chinese character "中" (zhong, "middle") in the GBK, UTF-8, and ISO8859-1 encodings respectively. At this point b_gbk has length 2, b_utf8 has length 3, and b_iso88591 has length 1.
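
The byte counts above can be checked with a minimal sketch like the following. The class and method names (EncodedLengthDemo, encodedLength, hex) are hypothetical, and the sketch uses the Charset overload of getBytes rather than the charset-name overload so that no checked exception has to be handled. It assumes a JDK that ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodedLengthDemo {
    // Number of bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // Upper-case hex dump of the encoded bytes, e.g. "E4B8AD".
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String zhong = "\u4e2d"; // the character 中, escaped so the source file encoding does not matter
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is available in this JDK

        System.out.println(encodedLength(zhong, gbk) + " " + hex(zhong, gbk)); // 2 D6D0
        System.out.println(encodedLength(zhong, StandardCharsets.UTF_8)
                + " " + hex(zhong, StandardCharsets.UTF_8)); // 3 E4B8AD
        System.out.println(encodedLength(zhong, StandardCharsets.ISO_8859_1)
                + " " + hex(zhong, StandardCharsets.ISO_8859_1)); // 1 3F
    }
}
```

The single ISO8859-1 byte is 0x3F, the question mark that Java substitutes for a character the charset cannot represent.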

Conversely, the character "中" can be restored with new String(byte[], decode), which parses the byte[] into a string using the encoding named by decode:

String s_gbk = new String(b_gbk, "GBK");
String s_utf8 = new String(b_utf8, "UTF-8");
String s_iso88591 = new String(b_iso88591, "ISO8859-1");

Printing s_gbk, s_utf8, and s_iso88591 shows that s_gbk and s_utf8 are both "中", while s_iso88591 is an unrecognizable character. Why can "中" not be restored after encoding it to ISO8859-1 and decoding it again? The reason is simple: the ISO8859-1 code table contains no Chinese characters, so "中".getBytes("ISO8859-1") cannot produce a valid encoded value for "中" (Java substitutes the single byte 0x3F, a question mark), and new String() then has nothing to restore the character from.
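
A small sketch of the encode-then-decode round trip makes the loss visible. RoundTripDemo and roundTrip are hypothetical names, and the Charset overloads are used here in place of the charset-name overloads.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RoundTripDemo {
    // Encode the text to bytes and decode those bytes again with the same charset.
    static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    public static void main(String[] args) {
        String zhong = "\u4e2d"; // the character 中

        // UTF-8 can represent the character, so the round trip is lossless.
        System.out.println(zhong.equals(roundTrip(zhong, StandardCharsets.UTF_8))); // true

        // ISO-8859-1 cannot, so the character was already replaced by '?' when encoding.
        System.out.println(roundTrip(zhong, StandardCharsets.ISO_8859_1)); // ?
    }
}
```

The damage happens in getBytes, not in new String: once the replacement byte has been written, the original character is gone.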

Therefore, when calling String.getBytes(String decode) to obtain a byte[], make sure that every character of the string exists in the code table of the decode encoding. Only then can the resulting byte[] be restored correctly.
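
One possible way to perform this check before encoding is CharsetEncoder.canEncode from java.nio.charset. The following is a sketch, not something the original article prescribes, and EncodableCheck and isEncodable are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodableCheck {
    // True if every character of the text has a representation in the charset.
    static boolean isEncodable(String text, Charset charset) {
        // A fresh encoder per call: CharsetEncoder instances are not thread-safe.
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String zhong = "\u4e2d"; // the character 中
        System.out.println(isEncodable(zhong, StandardCharsets.UTF_8));      // true
        System.out.println(isEncodable(zhong, StandardCharsets.ISO_8859_1)); // false
    }
}
```

Checking first lets the caller reject or re-route the text instead of silently sending question marks.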

Sometimes Chinese text has to satisfy a special requirement (for example, HTTP headers require their content to be ISO8859-1 encoded). In that case the Chinese characters can be carried byte by byte, for example:

String s_iso88591 = new String("中".getBytes("UTF-8"), "ISO8859-1");

The resulting s_iso88591 string actually consists of three ISO8859-1 characters. After these characters are passed to the destination, the receiving program reverses the process with String s_utf8 = new String(s_iso88591.getBytes("ISO8859-1"), "UTF-8") to recover the correct Chinese character "中". This both complies with the protocol and supports Chinese.
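
Both directions of this technique can be sketched as a pair of helper methods. The names HeaderTransportDemo, wrapAsLatin1, and unwrapFromLatin1 are hypothetical. The trick relies on ISO8859-1 mapping every byte value 0x00 to 0xFF to exactly one character, so no byte is lost in either direction.

```java
import java.nio.charset.StandardCharsets;

class HeaderTransportDemo {
    // Sender side: reinterpret the UTF-8 bytes as ISO-8859-1 characters (one char per byte).
    static String wrapAsLatin1(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Receiver side: turn the ISO-8859-1 characters back into bytes and decode them as UTF-8.
    static String unwrapFromLatin1(String carried) {
        return new String(carried.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String zhong = "\u4e2d"; // the character 中

        String carried = wrapAsLatin1(zhong);
        System.out.println(carried.length()); // 3, one char per UTF-8 byte (E4 B8 AD)

        String restored = unwrapFromLatin1(carried);
        System.out.println(zhong.equals(restored)); // true
    }
}
```

The intermediate string is not readable text. It is only a container for the bytes, so both sides must agree on UTF-8 as the inner encoding.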
