How can I determine the encoding method of a text file?

Source: Internet
Author: User
Tags format definition

Text files usually identify their encoding methods through the first two bytes, but UTF-32 encoding uses the first four bytes to identify their encoding methods. The following are some encoding format identifiers:

Encoding Method First few bytes
ANSI No format definition
Unicode FF fe
Unicode big endian Fe FF
UTF-8 EF bb
UTF-16/UCS-2, little endian Fe FF
UTF-16, UCS-2, big endian FF fe
UTF-32/UCS-4, little endian FF Fe 00 00
UTF-32, UCS-4, big-Endian 00 00 Fe FF

In this way, we writeCodeYou only need to read the first two bytes of the file ~ 4 bytes to know the encoding method. However, in. net, there is another simpler way to know the encoding method of text files and use the following code:

Public encoding getencoding (string file)
{
VaR r = new streamreader (file, true); // true indicatesProgramAutomatic file encoding
Return R. currentencoding; // return code
}

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.