In memory there is no concept of a character; each byte is just a number from 0 to 255. Characters only exist because we assign numbers to them, with each number standing for one character.
Take the ASCII code as an example: 65 (decimal) is defined as the character 'A', 66 as 'B', and 61 as '='. That mapping is the code. Then there is the char type, which is really just a byte; only when you use a value as a char do you know it is meant to represent a character.
If char ch = 65, then printf("%c", ch) prints 'A', and that is decoding. printf("%d", ch) still prints 65.
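The same idea written out as a runnable sketch in Java (the cast to int in the last line is only needed because Java's %d conversion does not accept a char directly):

    public class CharDemo {
        public static void main(String[] args) {
            char ch = 65;                         // same value as the C example above
            System.out.printf("%c%n", ch);        // interpreted as a character: prints A (decoding)
            System.out.printf("%d%n", (int) ch);  // interpreted as a number: prints 65
        }
    }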
A single byte can only encode 256 values (0-255), which is completely inadequate for Asian languages such as Chinese. That is why two-byte (0-65535) and multi-byte encodings appeared.
Even with double-byte encodings, the range 0-65535 cannot cover several languages at the same time, so the same number ends up meaning different things in different encodings. For example, the same number may be defined as the character for 'king' in the Chinese GBK encoding but as the character for 'small' in the Japanese MS932 encoding.
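A minimal sketch of that ambiguity in Java, assuming the runtime ships the GBK and MS932 charsets (the text does not specify which character the Japanese side produces, so the output on that side is not claimed here):

    import java.nio.charset.Charset;

    public class SameBytesDemo {
        public static void main(String[] args) {
            Charset gbk = Charset.forName("GBK");
            Charset ms932 = Charset.forName("MS932");

            byte[] data = "王".getBytes(gbk);             // the two bytes that mean 'king' in GBK
            System.out.println(new String(data, gbk));    // decoded as Chinese: 王
            System.out.println(new String(data, ms932));  // same bytes decoded as Japanese:
                                                          // a different character, or mojibake
        }
    }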
Interpreting a number in memory as a character (text or symbol) is the decoding process; assigning numbers to text and symbols is encoding.
In actual encodings, the way characters are defined involves many more byte-level details, so it is not as simple as described here.
Over the network, data is still transmitted byte by byte, or even bit by bit within each byte.
So we transmit byte arrays, the basic type, and convert to and from GBK-encoded Chinese text before sending and after receiving.
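For example, a send/receive round trip might look like the sketch below (the byte array stands in for what actually travels over the socket; no particular networking API is assumed):

    import java.nio.charset.Charset;

    public class WireDemo {
        public static void main(String[] args) {
            Charset gbk = Charset.forName("GBK");

            // Sender side: encode the Chinese text into raw bytes before transmission.
            String outgoing = "你好";
            byte[] wireBytes = outgoing.getBytes(gbk);

            // ... wireBytes travel over the network as plain bytes ...

            // Receiver side: decode the bytes back into text with the same charset.
            String incoming = new String(wireBytes, gbk);
            System.out.println(incoming);   // prints 你好 again
        }
    }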
From the Americans' point of view (the English speakers who named these APIs), going from bytes to an Asian-language string (GBK/Unicode) is decoding (decode), and the reverse process is encoding (encode).
In addition: the String.getBytes() method encodes the String into a sequence of bytes using the platform's default charset (for example, GBK on a native Chinese system) and stores the result in a new byte array.
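A short sketch of the difference between the default-charset form and an explicit charset (the GBK default mentioned in the comment is only the scenario from the text; on other machines the default is often UTF-8):

    import java.nio.charset.Charset;
    import java.util.Arrays;

    public class GetBytesDemo {
        public static void main(String[] args) {
            String s = "王";

            byte[] defaultBytes = s.getBytes();                     // platform default charset, e.g. GBK here
            byte[] gbkBytes = s.getBytes(Charset.forName("GBK"));   // explicit charset, machine-independent

            System.out.println(Charset.defaultCharset());
            System.out.println(Arrays.toString(defaultBytes));
            System.out.println(Arrays.toString(gbkBytes));
        }
    }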