Text files often identify their encoding with a byte order mark (BOM) in the first few bytes: two bytes for UTF-16, three for UTF-8, and four for UTF-32. The following are some common encoding identifiers:
| Encoding Method | First few bytes |
|---|---|
| ANSI | No identifier |
| Unicode | FF FE |
| Unicode big endian | FE FF |
| UTF-8 | EF BB BF |
| UTF-16/UCS-2, little endian | FF FE |
| UTF-16/UCS-2, big endian | FE FF |
| UTF-32/UCS-4, little endian | FF FE 00 00 |
| UTF-32/UCS-4, big endian | 00 00 FE FF |
In this way, when we write code, we only need to read the first two to four bytes of the file to know the encoding method. However, in .NET there is a simpler way to find out the encoding of a text file, using the following code:
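// Requires: using System.IO; using System.Text;
public Encoding GetEncoding(string file)
{
    // true tells StreamReader to detect the encoding from the byte order mark
    using (var r = new StreamReader(file, true))
    {
        r.Peek(); // detection only runs on the first read, so trigger one
        return r.CurrentEncoding; // the detected encoding (UTF-8 if no BOM is found)
    }
}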
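
For comparison, the manual check described earlier (reading the first two to four bytes and matching them against the identifiers in the table) might look roughly like the sketch below. The class name `BomSniffer` and its methods are hypothetical names chosen for illustration, not part of .NET. The four-byte UTF-32 marks are tested first, because FF FE 00 00 begins with the UTF-16 little endian mark FF FE.

```csharp
using System;
using System.IO;

public static class BomSniffer
{
    // Returns a label for the byte order mark at the start of head,
    // or "no BOM" when none of the known marks matches.
    public static string DetectFromBom(byte[] head)
    {
        // Longest marks first: FF FE 00 00 also starts with FF FE.
        if (StartsWith(head, new byte[] { 0xFF, 0xFE, 0x00, 0x00 })) return "UTF-32 LE";
        if (StartsWith(head, new byte[] { 0x00, 0x00, 0xFE, 0xFF })) return "UTF-32 BE";
        if (StartsWith(head, new byte[] { 0xEF, 0xBB, 0xBF })) return "UTF-8";
        if (StartsWith(head, new byte[] { 0xFF, 0xFE })) return "UTF-16 LE";
        if (StartsWith(head, new byte[] { 0xFE, 0xFF })) return "UTF-16 BE";
        return "no BOM";
    }

    // True when data begins with every byte of prefix, in order.
    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    // Reads up to four bytes from the file and classifies them.
    public static string DetectFile(string path)
    {
        var head = new byte[4];
        using (var fs = File.OpenRead(path))
        {
            int n = fs.Read(head, 0, head.Length);
            Array.Resize(ref head, n); // keep only the bytes actually read
        }
        return DetectFromBom(head);
    }
}
```

Calling `BomSniffer.DetectFile(path)` on a file without a BOM returns "no BOM", which matches the ANSI row of the table. Such a file may still be UTF-8 or another encoding, so the result is only a hint. A UTF-16 little endian file whose first character is U+0000 would also be reported as UTF-32 LE by this sketch.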