Atitit. Transmitting binary data losslessly as a string over the network
1. Why GBK cannot be used to transmit binary data
2. Base64
3. ISO-8859-1 (recommended)
4. UTF-8 (not usable)
1. Why GBK cannot be used to transmit binary data
GBK can lose information.
Arbitrary binary data contains byte sequences that do not correspond to any character in the GBK character set. When such bytes are decoded to a string and encoded again, they come back as code 63, that is, "?" (question mark). GBK is compatible with ASCII only in the low range (English letters and other single-byte values). Bytes in the high range are reserved for the two-byte sequences that encode Chinese characters, so they are valid only in certain pairs.
Author: old wow's paw attilax iron, email: [email protected]
When reprinting, please credit the source: http://blog.csdn.net/attilax
S = "my AB ha ";
Gbked [-50,-46, 97, 98,-71,-2]
Winhex ced26162b9fe, Ce (206), D2 (210)
Zip [120,-100, 59,119, 41, 49,105,-25, 63, 0, 14, 14, 4, 27]
"GBK Str
Zipstr> bytearr [120, 63,119, 41, 49,105, 63, 0, 14, 14, 4, 27]
The two-byte GBK encoding range is 8140-FEFE: the first byte lies in 81-FE (129-254) and the second byte in 40-FE (64-254). A byte pair outside this range, such as 9C 3B (-100, 59) in the zipped data above, is not a GBK character.
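The loss described above can be reproduced with a short Java sketch. The class name GbkRoundTripDemo and the helper roundTrip are made up for illustration, and the sketch assumes the running JVM provides the GBK charset (standard JDK builds usually do). The exact bytes that come back may differ between JDK versions, so the sketch only checks whether the round trip is lossless.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class GbkRoundTripDemo {
    // Assumption: the JVM ships the GBK charset; forName throws if it does not.
    static final Charset GBK = Charset.forName("GBK");

    // Decode the bytes as GBK text, then encode that text back to bytes.
    static byte[] roundTrip(byte[] data) {
        String text = new String(data, GBK);
        return text.getBytes(GBK);
    }

    public static void main(String[] args) {
        // Real GBK text survives the round trip ("\u6211ab\u54c8" is the sample string).
        byte[] textBytes = "\u6211ab\u54c8".getBytes(GBK);
        System.out.println(Arrays.equals(textBytes, roundTrip(textBytes)));

        // The zipped bytes from this section are not valid GBK text.
        byte[] zipped = {120, -100, 59, 119, 41, 49, 105, -25, 63, 0, 14, 14, 4, 27};
        byte[] back = roundTrip(zipped);
        System.out.println(Arrays.toString(back));
        System.out.println(Arrays.equals(zipped, back));
    }
}
```

The first comparison should report that the text bytes are unchanged, while the zipped bytes do not survive, which is the whole problem with using GBK as a carrier for binary data.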
2. Base64
The biggest problem with Base64 is size: every 3 bytes of input become 4 characters of output, so the data grows by about 33%.
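A minimal sketch of the size cost, using java.util.Base64 from the JDK (Java 8 or later). The class name Base64SizeDemo and the helpers toText and fromText are made up for illustration. The input is the 14-byte zipped array from section 1.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64SizeDemo {
    // Encode arbitrary bytes as a Base64 string (standard alphabet, with padding).
    static String toText(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decode the Base64 string back to the original bytes.
    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] zipped = {120, -100, 59, 119, 41, 49, 105, -25, 63, 0, 14, 14, 4, 27};
        String text = toText(zipped);
        // 14 bytes need 5 groups of 3 (the last one padded), so 5 * 4 = 20 characters.
        System.out.println(zipped.length + " bytes -> " + text.length() + " characters");
        System.out.println(Arrays.equals(zipped, fromText(text)));
    }
}
```

The round trip is lossless, so Base64 is safe. The price is only the extra length, which matters mainly for large payloads.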
3. ISO-8859-1 (recommended)
ISO-8859-1 is a good solution; using it for transcoding is generally no problem.
ISO-8859-1 is commonly the default character set for network transmission in Java.
When we want to convert a "byte string" into a "string" without knowing which ANSI encoding it uses, we can decode it as ISO-8859-1 for the time being: every byte is converted to exactly one character, without any loss of information.
ISO-8859-1 converts every byte value in the range 0-255 to a character, so nothing is lost.
The code points of the ISO-8859-1 character set span 0000-00FF, which matches the value range of a single byte. This property guarantees that encoding and decoding with ISO-8859-1 leave the byte values unchanged.
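The one-byte-to-one-character property can be checked over all 256 byte values with a short sketch. The class name Latin1RoundTripDemo and the helpers toText, fromText and allByteValues are made up for illustration. StandardCharsets.ISO_8859_1 is part of the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTripDemo {
    // Each byte becomes exactly one char with the same unsigned value (0..255).
    static String toText(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // Each char in the range 0..255 becomes exactly one byte again.
    static byte[] fromText(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Build an array holding every possible byte value once.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] all = allByteValues();
        String text = toText(all);
        System.out.println(text.length());                      // one char per byte
        System.out.println(Arrays.equals(all, fromText(text))); // lossless round trip
    }
}
```

This only holds while the string stays inside the program or is itself sent with an encoding that preserves the characters 0080-00FF. If another layer re-encodes the string with a lossy charset, the bytes can still be damaged.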
4. UTF-8 (unavailable)
S = "my AB ha ";
Utf8 bytes [-26,-120,-111, 97, 98,-27,-109,-120]
Kmprs bytes [120,-100,123,-42, 49, 49,-23,-23,-28, 14, 0, 22, 32, 4,-61]
>>>> Utf8 Str
Utf8str2bytes (len27) [120,-17,-65,-67,123,-17,-65,-67, 49, 49,-17,-65,-67, -17,-65,-67,-17,-65,-67, 14, 0, 22, 32, 4,-17,-65,-67]
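The same experiment as a Java sketch, using the compressed bytes listed above. The class name Utf8RoundTripDemo and the helper roundTrip are made up for illustration. The exact number of replacement characters is left unchecked because it depends on the decoder. The sketch only verifies that the data is changed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8RoundTripDemo {
    // Decode the bytes as UTF-8 text, then encode that text back to bytes.
    // new String(...) silently replaces malformed input with U+FFFD.
    static byte[] roundTrip(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] compressed = {120, -100, 123, -42, 49, 49, -23, -23, -28, 14, 0, 22, 32, 4, -61};
        byte[] back = roundTrip(compressed);
        System.out.println(compressed.length + " bytes in, " + back.length + " bytes out");
        System.out.println(Arrays.toString(back));
        System.out.println(Arrays.equals(compressed, back));
    }
}
```

Valid UTF-8 text, such as the sample string, still round-trips without change. Only arbitrary binary data is damaged, which is why UTF-8 is not usable as a carrier here.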