Unicode and encoding document collection

Source: Internet
Author: User
The articles on fmddlmyy's CSDN blog provide a very detailed and easy-to-understand introduction to Unicode and encoding.
Related vocabulary: encoding, character set (charset), code page (codepage), Unicode, GB, GBK, UCS, UTF, BMP, BOM.

Http://blog.csdn.net/fmddlmyy/category/279030.aspx

Research on GB18030 encoding, and the mapping between GBK, GB18030, and Unicode

Whether on Windows XP or Vista, the default code page for the Chinese (China) region is GBK. We can only set the region; we cannot set the default code page for that region. In the Windows world, as long as Microsoft does not want to change this, GB18030 is just an ordinary code page. Currently, Simplified Chinese documents are mainly encoded in Unicode and GBK, and there are probably no documents saved in GB18030. This article is only a programmer's bit of research on GB18030 encoding, which will hopefully help readers who are equally curious.
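As a rough illustration of how the two encodings relate, the sketch below uses Python's built-in `gbk` and `gb18030` codecs. That choice is an assumption of this note, not something the article itself uses, and the helper name `encode_or_none` is hypothetical. It shows that a common character gets the same two bytes in both encodings, while a character outside GBK's repertoire can only be encoded by GB18030, as a four-byte sequence.

```python
def encode_or_none(text: str, encoding: str):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

common = "\u4e2d"      # U+4E2D, a common Chinese character
rare = "\U00020000"    # U+20000, a CJK Extension B character outside the BMP

# A common character has identical bytes in GBK and GB18030.
print(encode_or_none(common, "gbk").hex(), encode_or_none(common, "gb18030").hex())

# GBK cannot encode U+20000; GB18030 uses a four-byte sequence for it.
print(encode_or_none(rare, "gbk"), encode_or_none(rare, "gb18030").hex())
```

GB18030 is designed to cover every Unicode code point, which is why the second character round-trips through it but not through GBK.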

Chinese characters in Unicode, GB2312, GBK, and GB18030

Number of Chinese characters in GB18030.

Text Encoding and Unicode (II)

In the first part, we discussed the principle of text display, Windows code pages, and the character sets of the Internet. In this second part, we talk about Unicode. Before that, we first need to study a rather profound concept: the four-layer model of character encoding...
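The summary only names the four-layer model. Assuming it refers to the Unicode character encoding model (abstract character repertoire, coded character set, character encoding form, character encoding scheme), the following minimal sketch walks one character through the layers. The variable names are illustrative only.

```python
ch = "\u4e2d"  # layer 1: an abstract character from the repertoire

# Layer 2, coded character set: the character is assigned a code point.
code_point = ord(ch)

# Layer 3, character encoding form: the code point becomes code units.
# In UTF-16 a BMP code point is a single 16-bit code unit.
utf16_units = [code_point]
# In UTF-8 the same code point becomes three 8-bit code units.
utf8_units = list(ch.encode("utf-8"))

# Layer 4, character encoding scheme: code units are serialized to bytes,
# which is where byte order matters for 16-bit units.
utf16_be = ch.encode("utf-16-be")
utf16_le = ch.encode("utf-16-le")

print(hex(code_point), [hex(u) for u in utf8_units], utf16_be.hex(), utf16_le.hex())
```

The point of the separation is that the same code point can have several encoding forms, and the same encoding form can have more than one byte serialization.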

Discussion on text encoding and Unicode (I)

This article supplements the earlier article on Unicode encoding, which briefly explains terms such as UCS, UTF, BMP, and BOM. It briefly covers issues that the earlier article did not mention or only touched on, such as code pages and surrogates, and adds some Unicode reference material. It then introduces a Unicode tool I recently wrote: UniToy. Although this article is a supplement to the previous one, I have tried to make it as self-contained as possible.
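Surrogates are the mechanism UTF-16 uses for code points above the BMP. As a minimal sketch (the function name `to_surrogates` is hypothetical and not taken from the article or from UniToy), the standard calculation splits the 20-bit offset above 0x10000 into two 10-bit halves:

```python
def to_surrogates(code_point: int):
    """Split a supplementary-plane code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above the BMP need surrogates")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low

high, low = to_surrogates(0x20000)
print(hex(high), hex(low))

# Cross-check against Python's own UTF-16 encoder.
expected = "\U00020000".encode("utf-16-be")
print(expected == high.to_bytes(2, "big") + low.to_bytes(2, "big"))
```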

About character encoding in Windows

The reason for writing this article is that when we install and use Windows programs, we sometimes see folders named "2052" or "1033". These numbers seem to be related to character sets, but what do they actually mean? Studying this question leads to others. We will talk about the internal architecture of Windows, the A/W functions of the Win32 API, locales, the ANSI code page, compiler options related to character encoding, MBCS and Unicode programs, resources, and garbled characters. Let's take this simple, entertaining journey.
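The summary leaves the question open. Assuming the folder names are decimal Windows locale identifiers (LCIDs), 2052 is 0x0804 (Chinese, PRC) and 1033 is 0x0409 (English, United States). The sketch below splits such a number into its primary language and sublanguage parts. The helper `split_lcid` is a hypothetical name, not a Win32 function.

```python
def split_lcid(lcid: int):
    """Split a Windows locale identifier into (primary language, sublanguage)."""
    lang_id = lcid & 0xFFFF        # low 16 bits hold the language identifier
    primary = lang_id & 0x3FF      # low 10 bits: primary language
    sub = lang_id >> 10            # high 6 bits: sublanguage
    return primary, sub

for folder in ("2052", "1033"):
    lcid = int(folder)
    primary, sub = split_lcid(lcid)
    print(folder, hex(lcid), hex(primary), hex(sub))
```

Under this reading, the numbers identify a language and region rather than a character set directly. The system then associates each locale with a default ANSI code page.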

On Unicode encoding: a brief explanation of terms such as UCS, UTF, BMP, and BOM

At first I only wanted to explain the difference between UCS-2 and UTF-16, then...
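Of the terms in the title, the BOM (byte order mark) is perhaps the easiest to show in code. The following is a minimal sketch, not the article's own method: it uses the BOM constants from Python's standard `codecs` module, and the helper `sniff_bom` is a hypothetical name.

```python
import codecs

# Check longer BOMs first: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0

sample = codecs.BOM_UTF16_LE + "\u4e2d".encode("utf-16-le")
encoding, skip = sniff_bom(sample)
print(encoding, sample[skip:].decode(encoding))
```

A missing BOM does not identify the encoding. In that case a real program would need another source of information, such as a declared charset or a default code page.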
