The most commonly used form of Unicode is UCS-2, which encodes each character in two bytes. For example, the Chinese character "经" (warp) has the code 0x7ECF, which is 32463 in decimal. Because UCS-2 uses two bytes per character and 2^16 = 65,536, it can encode at most 65,536 characters. The codes 0 to 127 are the same as in ASCII: the letter "a" is 0x0061 in Unicode (decimal 97) and 0x61 in ASCII (also decimal 97).
Support for Chinese characters in UCS-2 is limited. Simplified and traditional Chinese together have some sixty to seventy thousand characters, while UCS-2 can represent only 65,536 characters in total, so it has to leave out some Chinese characters that are almost never used. Fortunately, only a little over 7,000 simplified characters are in common use. To represent all Chinese characters, Unicode also has the UCS-4 specification, which encodes each character in four bytes; in practice, though, the two-byte UCS-2 is what is generally used.
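The arithmetic in the paragraph above can be checked with a short Python sketch. It uses the character with code 0x7ECF (经) and the letter "a" from the text. The helper names `ucs2_bytes` and `ucs4_bytes` are illustrative only, and big-endian byte order is an assumption, since the text does not say which byte order is meant.

```python
# Minimal sketch of the two-byte and four-byte encodings described above.
# Assumptions: big-endian byte order; helper names are illustrative only.

UCS2_CAPACITY = 2 ** 16  # two bytes = 16 bits, so 65,536 possible codes


def ucs2_bytes(ch: str) -> bytes:
    """Return the two-byte (UCS-2 style) form of a single character."""
    code = ord(ch)
    if code >= UCS2_CAPACITY:
        # Codes above 0xFFFF do not fit in two bytes.
        raise ValueError(f"U+{code:04X} does not fit in two bytes")
    return code.to_bytes(2, "big")


def ucs4_bytes(ch: str) -> bytes:
    """Return the four-byte (UCS-4 style) form of a single character."""
    return ord(ch).to_bytes(4, "big")


if __name__ == "__main__":
    jing = "\u7ecf"  # the Chinese character with code 0x7ECF
    print(hex(ord(jing)), ord(jing))        # 0x7ecf 32463
    print(UCS2_CAPACITY)                    # 65536
    print(ord("a"), ucs2_bytes("a").hex())  # 97 0061
    print(ucs2_bytes(jing).hex(), ucs4_bytes(jing).hex())  # 7ecf 00007ecf
```

The two-byte form of "a" is simply its ASCII value with a zero byte in front, which is why the codes 0 to 127 line up with ASCII.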
The basic Unicode code range for Chinese characters is 0x4E00--0x9FA5.
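As a rough illustration, the range above can be used to test whether a character is a Chinese character. The Python sketch below assumes only the 0x4E00--0x9FA5 range given here; the function names are hypothetical, and characters encoded outside this range (for example, in the 20000-2AFFF area) are deliberately not covered.

```python
# Minimal sketch: test characters against the range 0x4E00--0x9FA5.
# Assumption: only this basic range counts; names are illustrative only.

CJK_FIRST = 0x4E00
CJK_LAST = 0x9FA5

# Number of codes in the range, counting both endpoints.
RANGE_SIZE = CJK_LAST - CJK_FIRST + 1


def is_basic_chinese(ch: str) -> bool:
    """Return True if the single character falls in 0x4E00--0x9FA5."""
    return CJK_FIRST <= ord(ch) <= CJK_LAST


def chinese_chars(text: str) -> list:
    """Return the characters of text that fall in the basic range."""
    return [ch for ch in text if is_basic_chinese(ch)]


if __name__ == "__main__":
    print(RANGE_SIZE)                      # 20902
    print(is_basic_chinese("\u7ecf"))      # True
    print(is_basic_chinese("a"))           # False
    print(len(chinese_chars("a\u7ecfb")))  # 1
```

The range holds 20,902 codes, which fits comfortably within the 65,536 codes available in two bytes.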
Unicode encoding table (0x0000--0x0FFF):
Legend: Unicode 1.0 | Unicode 1.1 | Unicode 2.0 | Unicode 2.1 | Unicode 3.0 | Unicode 3.1 | Unicode 3.2 | Unicode 4.0 | Unicode 4.1 | Not used | No coding
Unicode Encoding Table
0000-0FFF | 8000-8FFF | 10000-10FFF | 20000-20FFF | 28000-28FFF
1000-1FFF | 9000-9FFF |             | 21000-21FFF | 29000-29FFF
2000-2FFF | A000-AFFF |             | 22000-22FFF | 2A000-2AFFF
3000-3FFF | B000-BFFF |             | 23000-23FFF |
4000-4FFF | C000-CFFF | 1D000-1DFFF | 24000-24FFF | 2F000-2FFFF
5000-5FFF | D000-DFFF |             | 25000-25FFF |
6000-6FFF | E000-EFFF |             | 26000-26FFF |
7000-7FFF | F000-FFFF |             | 27000-27FFF | E0000-E0FFF