UTF-8 with and without BOM

Source: Internet
Author: User
UTF-8 encoded files come in two forms: without a BOM and with a BOM. What is a BOM?

The three bytes EF BB BF are the UTF-8 BOM. BOM stands for "byte order mark". In UTF-8 files the BOM is often used simply to signal that the file is UTF-8, but it was designed for UTF-16, where it indicates the byte order. The BOM is the character U+FEFF placed at the start of the byte stream: the bytes FE FF mean big-endian (high byte first), and the bytes FF FE mean little-endian (low byte first). UTF-8 is encoded byte by byte, so it has no byte-order problem and the BOM is optional. UTF-16 uses two bytes as its code unit, so before interpreting UTF-16 text you must know the byte order of each code unit. For example, the Unicode code point of "奎" is 594E and that of "乙" is 4E59. If we receive the UTF-16 byte stream "59 4E", is it "奎" or "乙"?

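The byte-order ambiguity can be checked directly. The following is a minimal Python sketch (Python is used only for brevity; the helper name `describe_bom` is hypothetical and not part of the original article). It decodes the bytes 59 4E under both byte orders and labels a leading BOM. It deliberately ignores UTF-32, whose little-endian BOM also begins with FF FE.

```python
import codecs


def describe_bom(data: bytes) -> str:
    """Return a label for the BOM at the start of data, or 'none'.

    Hypothetical helper for illustration; UTF-32 BOMs are not handled.
    """
    if data.startswith(codecs.BOM_UTF8):      # EF BB BF
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "utf-16-le"
    return "none"


# The same two bytes give different characters depending on byte order.
raw = bytes([0x59, 0x4E])
as_big_endian = raw.decode("utf-16-be")     # code point U+594E
as_little_endian = raw.decode("utf-16-le")  # code point U+4E59

if __name__ == "__main__":
    print(hex(ord(as_big_endian)), hex(ord(as_little_endian)))
    print(describe_bom("\ufeff".encode("utf-8")))
```

Without a BOM (or some out-of-band agreement), a receiver has no way to tell which of the two readings was intended.
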
If you save a file with a BOM, the "headers already sent" error may occur.
Because the web server software (for example, the PHP interpreter) may not understand the BOM, its three extra bytes are sent to the browser as ordinary output.
A later call to a function that needs to send headers, such as session_start, then triggers the "headers already sent" error.
The most fundamental fix is therefore to save UTF-8 encoded files without a BOM.
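
For files that were already saved with a BOM, the three bytes can be removed after the fact. This is a minimal sketch in Python, assuming the whole file fits in memory; `strip_utf8_bom` and `remove_bom_in_place` are hypothetical helper names, not functions from any library.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def remove_bom_in_place(path: str) -> bool:
    """Rewrite the file at path without its BOM.

    Returns True if a BOM was found and removed, False if the file
    was left untouched.
    """
    with open(path, "rb") as f:
        data = f.read()
    stripped = strip_utf8_bom(data)
    if len(stripped) == len(data):
        return False  # no BOM, nothing to do
    with open(path, "wb") as f:
        f.write(stripped)
    return True
```

The file is handled as raw bytes on purpose, so line endings and the rest of the content are left exactly as they were.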

Microsoft Notepad, Word, and similar programs can only correctly open UTF-8 files that contain a BOM, whereas UltraEdit does exactly the opposite and mistakes a UTF-8 file with a BOM for ASCII.

The UTF-8 BOM is EF BB BF. When UltraEdit (UE) loads a UTF-8 file, it converts it to UTF-16, where EF BB BF becomes FF FE (the Unicode-LE BOM). UltraEdit does not recognize that a BOM is already present and adds another one, so the file ends up with two FF FE sequences and is damaged.
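
The correspondence between the two BOM forms, and the doubled-BOM symptom, can be reproduced with a short Python sketch. It only illustrates the byte patterns and is not a reconstruction of UltraEdit's internals; `count_leading_boms` is a hypothetical helper.

```python
BOM = "\ufeff"  # U+FEFF, the code point behind every Unicode BOM

# The same character serialized in two encodings.
utf8_bom = BOM.encode("utf-8")         # bytes EF BB BF
utf16le_bom = BOM.encode("utf-16-le")  # bytes FF FE


def count_leading_boms(text: str) -> int:
    """Count how many U+FEFF characters sit at the start of decoded text."""
    return len(text) - len(text.lstrip(BOM))


# Simulate a tool that prepends a BOM to text that already starts with one.
doubled = (BOM + BOM + "abc").encode("utf-16-le")
```

A count greater than one on decoded text is a sign that some tool added a BOM without checking for an existing one.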


When an application uses UTF-8 encoding, pay attention to the BOM when saving files.

How can we convert UTF-8 without a BOM to UTF-8 with a BOM? The following C# code reads the file as UTF-8 and writes it back out with a BOM:


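using System.IO;
using System.Text;

// Read the source as UTF-8; StreamReader skips a leading BOM if one is present.
using (TextReader input = new StreamReader(new FileStream(@"C:\test.properties", FileMode.Open), Encoding.UTF8))
{
    // Encoding.UTF8 emits a BOM, so the output file starts with EF BB BF.
    using (TextWriter output = new StreamWriter(new FileStream(@"C:\test2.lmx", FileMode.Create), Encoding.UTF8))
    {
        int bufferSize = 8096;
        char[] buffer = new char[bufferSize];
        int len;

        // Copy the text in chunks until the reader is exhausted.
        while ((len = input.Read(buffer, 0, bufferSize)) > 0)
        {
            output.Write(buffer, 0, len);
        }
    }
}
// Both streams are closed automatically when their using blocks end.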
