How are strings represented in memory in C#?
I wonder if you have had the same question as I did: how are strings of different encodings stored in memory at runtime? When the computer operates on a string object, how does it know which encoding the string uses? Does a string object carry something like the BOM of a text file?
The answer is that a string in memory carries no encoding of its own. Every string is stored in one uniform format: a sequence of 16-bit code units, which behaves like UCS-2 (note: I am deliberately not calling it UTF-16 here, see below). The byte order follows the CPU, so on Intel it is little-endian.
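To see the uniform in-memory format for yourself, here is a minimal sketch that views a string's characters as raw bytes without going through any Encoding. The class and method names (StringMemoryDemo, RawBytes) are made up for illustration, and it assumes a runtime where MemoryMarshal and AsSpan are available (.NET Core 2.1 or later).

```csharp
using System;
using System.Runtime.InteropServices;

public static class StringMemoryDemo
{
    // Hypothetical helper: copies the bytes backing the string's chars,
    // in the machine's native byte order, with no encoding step involved.
    public static byte[] RawBytes(string s)
    {
        return MemoryMarshal.AsBytes(s.AsSpan()).ToArray();
    }

    public static void Main()
    {
        // Every char occupies two bytes.
        Console.WriteLine(sizeof(char)); // prints 2

        // True on Intel/AMD and most ARM configurations.
        Console.WriteLine(BitConverter.IsLittleEndian);

        // 'A' is U+0041 and '\u4E2D' is U+4E2D.
        // On a little-endian machine this prints 41-00-2D-4E.
        Console.WriteLine(BitConverter.ToString(RawBytes("A\u4E2D")));
    }
}
```

The low byte of each code unit comes first on a little-endian machine, and nothing in those bytes records an encoding or a BOM.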
Encoding only comes into play when a string object interacts with I/O: bytes read from input are decoded into a string according to the Encoding passed to the method, and on output the string is converted into the byte stream specified by that Encoding.
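The I/O boundary can be sketched with Encoding.GetBytes and Encoding.GetString, which are the same conversions that stream and file APIs perform internally. The wrapper names Encode and Decode are illustrative only.

```csharp
using System;
using System.Text;

public static class EncodingBoundaryDemo
{
    // String -> bytes: what happens on output.
    public static byte[] Encode(string s, Encoding enc) => enc.GetBytes(s);

    // Bytes -> string: what happens on input.
    public static string Decode(byte[] bytes, Encoding enc) => enc.GetString(bytes);

    public static void Main()
    {
        string text = "\u4E2D"; // one CJK character, U+4E2D

        // The same string produces different bytes depending on the Encoding.
        Console.WriteLine(BitConverter.ToString(Encode(text, Encoding.UTF8)));    // E4-B8-AD
        Console.WriteLine(BitConverter.ToString(Encode(text, Encoding.Unicode))); // 2D-4E

        // Decoding with the matching Encoding restores the original string.
        byte[] utf8 = Encode(text, Encoding.UTF8);
        Console.WriteLine(Decode(utf8, Encoding.UTF8) == text);    // True

        // Decoding with the wrong Encoding silently yields a different string.
        Console.WriteLine(Decode(utf8, Encoding.Unicode) == text); // False
    }
}
```

The string itself never remembers which Encoding produced it, so the caller has to supply the right one every time bytes cross the boundary.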
In addition, I described the in-memory format above as UCS-2 rather than UTF-16. Strictly speaking the data is UTF-16, but the string API works on individual 16-bit code units the way UCS-2 would: just as in Java, a Unicode code point greater than 0xFFFF is stored as a "surrogate pair" (2 × 2 bytes). Therefore, if the string contains "big" characters such as emoji, the Length property returns the number of code units, not the number of characters the user sees. The solution is to use LengthInTextElements in the StringInfo class.
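A short sketch of the difference, using a single emoji (U+1F600) as the test string. The helper names CodeUnits and TextElements are illustrative.

```csharp
using System;
using System.Globalization;

public static class TextLengthDemo
{
    // Number of 16-bit code units, which is what string.Length reports.
    public static int CodeUnits(string s) => s.Length;

    // Number of text elements as computed by StringInfo.
    public static int TextElements(string s) => new StringInfo(s).LengthInTextElements;

    public static void Main()
    {
        string emoji = "\U0001F600"; // a single emoji, code point above 0xFFFF

        Console.WriteLine(CodeUnits(emoji));    // 2
        Console.WriteLine(TextElements(emoji)); // 1

        // The two chars are the halves of a surrogate pair.
        Console.WriteLine(char.IsHighSurrogate(emoji[0])); // True
        Console.WriteLine(char.IsLowSurrogate(emoji[1]));  // True
    }
}
```

For the same reason, indexing or slicing a string by char position can cut a surrogate pair in half, so code that handles such characters should work with text elements rather than raw indices.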
PS: Encoding.Unicode and Encoding.BigEndianUnicode in System.Text are both UTF-16 (little-endian and big-endian respectively). Microsoft must have had its reasons for those names, but I don't know what they are.
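As a quick check of the PS, the following sketch prints the bytes and the BOM (preamble) that each of the two encodings produces. The class name and the Hex helper are illustrative.

```csharp
using System;
using System.Text;

public static class Utf16EncodingsDemo
{
    // Formats a byte array as hyphen-separated hex, e.g. "41-00".
    public static string Hex(byte[] bytes) => BitConverter.ToString(bytes);

    public static void Main()
    {
        // Same character, opposite byte order.
        Console.WriteLine(Hex(Encoding.Unicode.GetBytes("A")));          // 41-00
        Console.WriteLine(Hex(Encoding.BigEndianUnicode.GetBytes("A"))); // 00-41

        // The BOM belongs to the encoding, not to the string: GetBytes never
        // emits it, while file and stream writers may write it up front.
        Console.WriteLine(Hex(Encoding.Unicode.GetPreamble()));          // FF-FE
        Console.WriteLine(Hex(Encoding.BigEndianUnicode.GetPreamble())); // FE-FF
    }
}
```

This also answers the BOM question from the beginning: a string object has nothing like a BOM, which only appears in the byte stream when an Encoding writes its preamble.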