"Character set and character encoding" UTF-8, UTF-16, and UTF-32

Source: Internet
Author: User

UTF-32 stores every code point in 4 bytes, so the whole UCS can be represented directly. However, the UCS does not need anywhere near 32 bits: Unicode code points stop at U+10FFFF, which fits in 21 bits, so UTF-32 wastes a great deal of space. In addition, because of combining characters, one character as the user sees it may consist of several code points, so the fixed width does not make locating characters as fast as one might expect. All in all, UTF-32 is rarely a good choice.
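The two drawbacks above can be checked with a small sketch. It uses only Python's built-in codecs; the sample string is an arbitrary choice for illustration, not something taken from the article.

```python
# "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points,
# but a reader sees a single accented character.
text = "e\u0301"

# "utf-32-be" fixes the byte order, so no byte order mark is prepended.
utf32 = text.encode("utf-32-be")

code_points = len(text)          # code points in the string
bytes_used = len(utf32)          # always 4 bytes per code point
bits_needed = (0x10FFFF).bit_length()  # bits needed for the largest code point

print(code_points, bytes_used, bits_needed)
print(utf32.hex(" ", 4))         # one 4-byte group per code point
```

Indexing into the UTF-32 data lands on a code point, not necessarily on a whole user-visible character, which is why the fixed width helps less than it first appears.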

UTF-16 maps UCS code points to 16-bit integers for data storage or transport. A code point takes either one or two 16-bit code units (code points above U+FFFF use two, known as a surrogate pair), so this is a variable-length representation. UTF-16 also needs the byte order to be specified. Strings in Java and C# are UTF-16 encoded, which is why their char type is 16 bits wide, the same width as short, and why both languages have a separate byte type to represent an 8-bit byte.
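As a rough illustration of the variable length and the byte order issue, here is a minimal sketch. The helper name utf16_code_units is made up for this example; the arithmetic follows the standard surrogate pair rule.

```python
def utf16_code_units(code_point: int) -> list:
    """Return the 16-bit UTF-16 code units for one Unicode code point."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if code_point < 0x10000:
        # Code points up to U+FFFF fit in a single 16-bit code unit.
        return [code_point]
    # Higher code points are split into a high and a low surrogate,
    # each carrying 10 bits of the offset from 0x10000.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]


units = utf16_code_units(0x1F600)       # an emoji outside the 16-bit range
print([hex(u) for u in units])

# The same code units serialize differently depending on the byte order.
big_endian = "\U0001F600".encode("utf-16-be")
little_endian = "\U0001F600".encode("utf-16-le")
print(big_endian.hex(" "), "|", little_endian.hex(" "))
```

This is also why the length of a Java or C# string counts 16-bit code units, so a character outside the 16-bit range counts as two.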

UTF-8 is also a variable-length character encoding, and it is a prefix code: no valid code in the encoding system is a prefix of another valid code. Because its code unit is a single byte, UTF-8 does not need a byte order to be specified. In the original design, one character is encoded in 1 to 6 bytes. For a multi-byte character, the number of leading 1 bits in the first byte gives the number of bytes the character occupies, and every following byte starts with 10. For example, in a two-byte encoding the first byte is 110xxxxx and the second byte is 10xxxxxx, and a six-byte encoding looks like this: 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx. The six-byte form carries 31 payload bits, so this scheme can encode up to 2^31 code points. The current standard (RFC 3629) limits UTF-8 to 4 bytes, which covers code points up to U+10FFFF.
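The bit layout described above can be sketched by hand. The function name utf8_encode is hypothetical, and the sketch covers only the 1 to 4 byte forms that are valid under the current standard (RFC 3629), not the older 5 and 6 byte forms.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (1 to 4 bytes)."""
    if code_point < 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if code_point < 0x80:
        # 0xxxxxxx: plain ASCII, one byte.
        return bytes([code_point])
    if code_point < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# Compare the hand-written encoder with Python's built-in codec.
for cp in (0x41, 0xE9, 0x4E2D, 0x1F600):
    encoded = utf8_encode(cp)
    print(f"U+{cp:04X}", " ".join(f"{b:08b}" for b in encoded),
          encoded == chr(cp).encode("utf-8"))
```

Printing the bytes in binary makes the prefixes visible: the count of leading 1 bits in the first byte matches the sequence length, and every continuation byte begins with 10.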
