Atitit. Lossless transmission of binary data as a string over the network

Source: Internet
Author: User

1. Why GBK cannot be used to transmit binary data

2. base64

3. iso-8859-1 (recommended)

4. UTF-8 (not usable)

 

1. Why GBK cannot be used to transmit binary data

Passing binary data through GBK can lose information.

Anything that has no mapping in the GBK character set is replaced by the default code 63, that is, "?" (question mark). GBK is compatible with ASCII only in the low range (bytes 0-127, which include the English letters). Chinese characters must be encoded with high-range bytes, as two-byte sequences, and not every byte pair is a valid character.
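The replacement with 63 can be observed directly in Java. The following is a minimal sketch, assuming a JDK that ships the GBK charset. The class name GbkUnmappableDemo, the helper encodeGbk, and the Thai test character are illustrative choices and do not come from the original article.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical demo class; assumes the GBK charset is available in this JDK.
class GbkUnmappableDemo {

    // Encode a string as GBK. String.getBytes(Charset) silently substitutes
    // the charset's replacement byte for characters it cannot map.
    static byte[] encodeGbk(String s) {
        return s.getBytes(Charset.forName("GBK"));
    }

    public static void main(String[] args) {
        // U+0E01 is a Thai letter, which GBK does not contain,
        // so it is encoded as the replacement byte 63 ('?').
        System.out.println(Arrays.toString(encodeGbk("a\u0E01")));
        // U+6211 is a Chinese character; GBK encodes it as two high-range bytes.
        System.out.println(Arrays.toString(encodeGbk("\u6211")));
    }
}
```

Once a value has been replaced by 63, nothing in the output says what it used to be, which is why the loss cannot be undone on the receiving side.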

 

 

Author: old wow's paw attilax iron, email: [email protected]

If you reprint this article, please cite the source: http://blog.csdn.net/attilax

 

 

S = "my AB ha ";

Gbked [-50,-46, 97, 98,-71,-2]

Winhex ced26162b9fe, Ce (206), D2 (210)

Zip [120,-100, 59,119, 41, 49,105,-25, 63, 0, 14, 14, 4, 27]

 

"GBK Str

Zipstr> bytearr [120, 63,119, 41, 49,105, 63, 0, 14, 14, 4, 27]

The total GBK encoding range is 8140-fefe, the first byte is between 81-fe, the last byte is between 40-fe, GBK 81 (129), 40 (64) ------ Fe (254)
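The experiment above can be reproduced with a short round-trip check. This is a sketch under assumptions: it reuses the two byte arrays listed above as literals rather than compressing anything itself, it assumes the JDK provides the GBK charset, and the names GbkRoundTripDemo and roundTrip are made up for illustration. The exact bytes that come back for invalid input may differ between JDK versions, so the sketch only checks whether the data survived.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical demo class; assumes the GBK charset is available in this JDK.
class GbkRoundTripDemo {

    // Decode bytes into a string with the given charset, then encode the
    // string back into bytes with the same charset.
    static byte[] roundTrip(byte[] data, Charset cs) {
        String asText = new String(data, cs);
        return asText.getBytes(cs);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");

        // GBK bytes of the sample string: real text, so every pair is valid.
        byte[] text = {-50, -46, 97, 98, -71, -2};
        // The zip-compressed bytes from the example: arbitrary binary data.
        byte[] zipped = {120, -100, 59, 119, 41, 49, 105, -25, 63, 0, 14, 14, 4, 27};

        System.out.println("text survives:   " + Arrays.equals(text, roundTrip(text, gbk)));
        System.out.println("zipped survives: " + Arrays.equals(zipped, roundTrip(zipped, gbk)));
    }
}
```

Valid GBK text survives the round trip, while the compressed data does not, because it contains byte pairs that fall outside the GBK range.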

 

2. base64

The main drawback of base64 is that it increases the size by about 33%: every 3 bytes of input become 4 characters of output.
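The size cost can be measured with the standard java.util.Base64 class (Java 8 and later). The sketch below is only an illustration; the class name Base64SizeDemo and the helpers toText and fromText are hypothetical names, not taken from the article.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class built on the standard java.util.Base64 API.
class Base64SizeDemo {

    // Binary data to a plain ASCII string.
    static String toText(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // ASCII string back to the original binary data.
    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Every possible byte value once, so nothing is left untested.
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        String text = toText(all);
        System.out.println(all.length + " bytes -> " + text.length() + " characters");
        System.out.println("lossless: " + Arrays.equals(all, fromText(text)));
    }
}
```

The round trip is lossless for any input; the price is the larger payload, since the encoder emits 4 characters for each group of up to 3 bytes.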

3. iso-8859-1 (recommended)

ISO-8859-1 is a good solution; using it for transcoding is generally no problem.

ISO-8859-1 is the character set commonly used by default for network transmission in Java.

When we want to convert a byte string into a string without knowing which ANSI encoding it uses, we can, for the time being, convert every byte as one character, without any loss of information.

ISO-8859-1 can convert every value in the 0-255 range, so nothing is lost.

The encoding range of the ISO-8859-1 character set is 0000-00FF, which corresponds exactly to the range of one byte. This feature ensures that encoding and decoding with ISO-8859-1 keep the encoded values unchanged.
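The one-byte-to-one-character property can be checked over all 256 byte values. The following is a minimal sketch using the standard StandardCharsets.ISO_8859_1 constant; the class name Latin1TransportDemo and the helpers bytesToString and stringToBytes are hypothetical names chosen for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class using the standard ISO-8859-1 charset.
class Latin1TransportDemo {

    // Each byte becomes exactly one char with the same numeric value (0..255).
    static String bytesToString(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // Inverse of bytesToString. Only lossless for strings whose chars are
    // all in the range 0..255, such as those produced by bytesToString.
    static byte[] stringToBytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        String text = bytesToString(all);
        System.out.println("string length: " + text.length());
        System.out.println("lossless: " + Arrays.equals(all, stringToBytes(text)));
    }
}
```

Unlike base64, the string has the same number of characters as the input has bytes, which is the reason this approach is recommended here.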

4. UTF-8 (not usable)

S = "my AB ha ";

Utf8 bytes [-26,-120,-111, 97, 98,-27,-109,-120]

Kmprs bytes [120,-100,123,-42, 49, 49,-23,-23,-28, 14, 0, 22, 32, 4,-61]

>>>> Utf8 Str

 

Utf8str2bytes (len27) [120,-17,-65,-67,123,-17,-65,-67, 49, 49,-17,-65,-67, -17,-65,-67,-17,-65,-67, 14, 0, 22, 32, 4,-17,-65,-67]
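The same experiment can be repeated for UTF-8 with a few lines of Java. This is a sketch, not the author's original test code: it takes the compressed bytes listed above as a literal, and the names Utf8RoundTripDemo and roundTrip are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class using the standard UTF-8 charset.
class Utf8RoundTripDemo {

    // Decode bytes as UTF-8, then encode the resulting string as UTF-8 again.
    // Invalid input bytes are decoded to U+FFFD, the replacement character.
    static byte[] roundTrip(byte[] data) {
        String asText = new String(data, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The compressed bytes from the example: arbitrary binary data.
        byte[] compressed = {120, -100, 123, -42, 49, 49, -23, -23, -28, 14, 0, 22, 32, 4, -61};
        byte[] back = roundTrip(compressed);

        System.out.println("in:  " + compressed.length + " bytes");
        System.out.println("out: " + back.length + " bytes " + Arrays.toString(back));
        System.out.println("lossless: " + Arrays.equals(compressed, back));
    }
}
```

Since six different input bytes all turn into the same three-byte replacement sequence, there is no way to tell them apart afterwards, so UTF-8 cannot carry arbitrary binary data.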

 

 

