Table of contents
- How are characters stored in memory?
How are characters stored in memory?
Single-byte strings: each character occupies one byte, and the characters are stored one after another, with a single zero byte marking the end of the string. For example, "Bob" is stored as follows:
42 | 6F | 62 | 00 |
B | o | b | EOS |
The Unicode version, L"Bob", is stored as follows:
42 00 | 6F 00 | 62 00 | 00 00 |
B | o | b | EOS |
Here the two-byte value 0x0000 marks the end of the string.
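The two layouts of "Bob" can be reproduced with a short C sketch. It builds the bytes by hand instead of dumping a real L"Bob" literal, because the size and byte order of wchar_t vary by platform; the 16-bit, low-byte-first layout used here is an assumption that matches the bytes listed in this section. The helper names (layout_sbcs, layout_utf16le, hex_dump) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Copy an ASCII string into dst one byte per character, including the
   single 00 terminator. Returns the number of bytes written. */
static size_t layout_sbcs(const char *ascii, unsigned char *dst) {
    assert(ascii != NULL && dst != NULL);
    size_t n = strlen(ascii) + 1;        /* +1 for the 00 terminator */
    memcpy(dst, ascii, n);
    return n;
}

/* Lay out an ASCII string as 16-bit units, low byte first (assumed
   layout), including the two-byte 00 00 terminator.
   Returns the number of bytes written. */
static size_t layout_utf16le(const char *ascii, unsigned char *dst) {
    assert(ascii != NULL && dst != NULL);
    size_t n = 0;
    do {
        dst[n++] = (unsigned char)*ascii; /* low byte: the ASCII code */
        dst[n++] = 0x00;                  /* high byte: zero for ASCII */
    } while (*ascii++ != '\0');           /* the terminator is written too */
    return n;
}

/* Format n bytes as space-separated hex pairs. out must have room for
   3 * n characters (at least 1). */
static void hex_dump(const unsigned char *buf, size_t n, char *out) {
    assert(buf != NULL && out != NULL);
    *out = '\0';
    for (size_t i = 0; i < n; ++i)
        out += sprintf(out, i ? " %02X" : "%02X", buf[i]);
}
```

Calling layout_sbcs("Bob", buf) followed by hex_dump gives the four-byte single-byte layout, while layout_utf16le("Bob", buf) gives eight bytes: every character doubles in size, and so does the terminator.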
At first glance, a DBCS string looks very similar to an SBCS string, but as we will see shortly, DBCS strings have subtleties that produce unexpected results when you manipulate them with string functions or step through them with a character pointer. The string "日本語" ("nihongo", the Japanese word for "Japanese") is stored in memory as follows (LB and TB denote lead byte and trail byte, respectively):
93 FA | 96 7B | 8C EA | 00 |
LB TB | LB TB | LB TB | EOS |
日 (ni) | 本 (hon) | 語 (go) | EOS |
Note that the value of "ni" must not be interpreted as the WORD value 0xFA93. Rather, the two bytes 93 and FA, in that order, together encode the character "ni". (So on a big-endian CPU, the bytes would still be in the order shown above.)
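To make the lead-byte/trail-byte pairing concrete, here is a minimal C sketch. It assumes the bytes are Shift-JIS (code page 932), which is consistent with the values listed in this section but is not stated in the text, and it hard-codes the usual Shift-JIS lead-byte ranges. All names (NIHONGO, is_lead_byte, dbcs_char_count, le_word) are hypothetical helpers for illustration, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The bytes of "nihongo" from this section, with the one-byte terminator. */
static const unsigned char NIHONGO[] = {
    0x93, 0xFA, 0x96, 0x7B, 0x8C, 0xEA, 0x00
};

/* Assumption: Shift-JIS (code page 932) lead bytes fall in the ranges
   0x81-0x9F and 0xE0-0xFC. */
static int is_lead_byte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Count characters rather than bytes: a lead byte and the trail byte
   that follows it are consumed together as one character. */
static size_t dbcs_char_count(const char *s) {
    assert(s != NULL);
    const unsigned char *p = (const unsigned char *)s;
    size_t count = 0;
    while (*p != 0) {
        /* Guard against a dangling lead byte right before the terminator. */
        p += (is_lead_byte(*p) && p[1] != 0) ? 2 : 1;
        ++count;
    }
    return count;
}

/* What you get by (incorrectly) loading two bytes as one little-endian
   16-bit WORD: the second byte ends up in the high half. */
static unsigned le_word(const unsigned char *p) {
    assert(p != NULL);
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}
```

With these bytes, strlen((const char *)NIHONGO) reports 6 because it counts bytes, while dbcs_char_count reports 3 characters. Advancing a plain char pointer by one can therefore land on a trail byte in the middle of a character. le_word(NIHONGO) yields 0xFA93, which is only an artifact of how a little-endian CPU loads two bytes; the character itself is defined by the byte sequence 93 FA.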