In-depth analysis of Chinese encoding problems in Java Web (Part 1)

Source: Internet
Author: User
Tags java web

One, why encode

For a computer to understand our language, text has to be translated into a form it can process. Suppose the only language the computer understands is English: any other language must first be translated into English before it can be used on the computer, and that translation process is encoding. It follows that every non-English-speaking country has to deal with encoding in order to use computers.

In general, the reasons for encoding can be summed up as follows:

The smallest unit of storage in a computer is a byte, that is, 8 bits, so a single byte can represent only 256 distinct values, 0~255.

Humans use far more symbols than that, so one byte cannot represent them all.
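The one-byte limit is easy to see in Java. The sketch below is only an illustration (the class and method names are made up for this example): it shows the 0~255 range of a byte and checks whether a character's code fits into it.

```java
public class ByteRangeDemo {
    // Java's byte is signed (-128..127); masking with 0xFF gives the 0..255 view.
    public static int unsignedValue(byte b) {
        return b & 0xFF;
    }

    // True when the character's numeric code fits into a single byte (0..255).
    public static boolean fitsInOneByte(char c) {
        return c <= 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(unsignedValue((byte) 0xFF)); // 255, the largest one-byte value
        System.out.println(fitsInOneByte('A'));         // true: 'A' is code 65
        System.out.println(fitsInOneByte('\u4E2D'));    // false: the Chinese character U+4E2D is 20013
    }
}
```

A Chinese character such as U+4E2D is far outside the range of one byte, which is why a multi-byte encoding is needed.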

Two, how to translate

Since the various languages need to be exchanged and translation is unavoidable, the question is how to translate.

Computers provide a variety of translation methods; common ones are ASCII, ISO-8859-1, GB2312, GBK, UTF-8 and UTF-16. Each can be seen as a dictionary that specifies conversion rules, and following those rules lets the computer represent our characters correctly. Which encoding to use for storage depends on whether storage space or encoding efficiency matters more. Three common encodings are described below.
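To make the "dictionary" idea concrete, here is a minimal sketch that encodes the same character with several of these charsets and compares the number of bytes each produces. The class and method names are hypothetical. GB2312 and GBK are not among the charsets every Java runtime is required to provide, so the code checks for them first.

```java
import java.nio.charset.Charset;

public class CharsetDictionaryDemo {
    // Encodes the text with the named charset and returns how many bytes it produced.
    public static int encodedLength(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        String text = "\u4E2D"; // a single Chinese character
        // "UTF-16BE" is used instead of "UTF-16" because Java's "UTF-16"
        // encoder prepends a 2-byte byte order mark to the output.
        String[] names = {"US-ASCII", "ISO-8859-1", "GB2312", "GBK", "UTF-8", "UTF-16BE"};
        for (String name : names) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available in this runtime");
                continue;
            }
            System.out.println(name + ": " + encodedLength(text, name) + " byte(s)");
        }
    }
}
```

Note that US-ASCII and ISO-8859-1 cannot represent the character at all: `String.getBytes` silently substitutes the charset's replacement byte (a question mark), so the single byte they produce has already lost the original information.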

1, ASCII code

Everyone who has studied computing knows ASCII. It defines 128 codes in total, using the low 7 bits of a byte. Codes 0~31 are control characters such as line feed and carriage return (delete is code 127); codes 32~126 are printable characters, which can be typed on a keyboard and displayed.


Figure 1: ASCII code
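The ASCII ranges can be expressed directly in code. This is a small sketch with made-up helper names; it treats 0~31 and 127 (delete) as control codes and 32~126 as printable.

```java
import java.nio.charset.StandardCharsets;

public class AsciiDemo {
    // ASCII covers codes 0..127, i.e. the low 7 bits of a byte.
    public static boolean isAscii(int code) {
        return code >= 0 && code <= 127;
    }

    // Control codes: 0..31, plus 127 (delete).
    public static boolean isControl(int code) {
        return (code >= 0 && code <= 31) || code == 127;
    }

    // Printable codes: 32 (space) through 126 (~).
    public static boolean isPrintable(int code) {
        return code >= 32 && code <= 126;
    }

    public static void main(String[] args) {
        byte[] bytes = "Hi!".getBytes(StandardCharsets.US_ASCII);
        for (byte b : bytes) {
            // The high bit of an ASCII byte is always 0, so b is never negative.
            System.out.println(b + " printable=" + isPrintable(b));
        }
        System.out.println(isControl('\n')); // true: line feed is code 10
    }
}
```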

2, UTF-16

Any discussion of UTF has to start with Unicode. ISO set out to create a new universal dictionary through which every language in the world could be translated; you can imagine how complex that dictionary is, and the detailed Unicode specification is available in the relevant documentation. Unicode is the foundation of Java and XML, and UTF-16 defines concretely how Unicode characters are stored and read in a computer.

UTF-16 uses two bytes as its unit for the Unicode transformation format: each unit is two bytes, that is 16 bits, hence the name UTF-16. Characters in the Basic Multilingual Plane, which covers the vast majority of characters in everyday use, take exactly one unit, so for them the representation is fixed-length; characters outside that plane take a pair of units (a surrogate pair, four bytes). Representing characters this way is very convenient, since in the common case every two bytes correspond to one character, which greatly simplifies string manipulation. This is an important reason why Java uses UTF-16 as its in-memory character storage format.
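The following sketch (class and method names are illustrative) prints the UTF-16 bytes of a string in hexadecimal. A common Chinese character takes exactly two bytes; a character outside the Basic Multilingual Plane, here U+1F600, is an exception to the two-byte rule and takes a surrogate pair of four bytes.

```java
import java.nio.charset.StandardCharsets;

public class Utf16Demo {
    // Formats bytes as uppercase hex, two digits per byte.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Big-endian UTF-16 without a byte order mark.
    public static String utf16Hex(String text) {
        return toHex(text.getBytes(StandardCharsets.UTF_16BE));
    }

    public static void main(String[] args) {
        String han = "\u4E2D";
        System.out.println(utf16Hex(han));   // 4E2D: the bytes are the code point itself
        System.out.println(han.length());    // 1: one 16-bit char

        String supplementary = new String(Character.toChars(0x1F600));
        System.out.println(utf16Hex(supplementary)); // D83DDE00: a surrogate pair
        System.out.println(supplementary.length());  // 2: two 16-bit chars for one character
    }
}
```

Because `String.length()` counts 16-bit units rather than characters, code that may see supplementary characters should count with `codePointCount` instead.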


3, UTF-8

UTF-16 uniformly uses two bytes for a character. That is simple and convenient, but it has a drawback: a large share of characters could be represented in one byte yet now take two, doubling the storage space, and when network bandwidth is limited this increases traffic. UTF-8 instead uses a variable-length technique in which each code range has its own code length, so different kinds of characters are encoded in a different number of bytes: 1 to 4 under the current standard (RFC 3629), while the original design allowed up to 6.




Figure 2: Unicode code
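The variable-length rule can be sketched by hand. The encoder below is a simplified illustration, not the JDK's implementation: it follows the RFC 3629 ranges (1 to 4 bytes) and does not reject invalid input such as surrogate code points. The class and method names are made up for the example, and `main` compares the result with the JDK's own UTF-8 encoder.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8Demo {
    // Number of bytes UTF-8 needs for a code point, by range.
    public static int utf8Length(int cp) {
        if (cp < 0x80) return 1;     // 0xxxxxxx
        if (cp < 0x800) return 2;    // 110xxxxx 10xxxxxx
        if (cp < 0x10000) return 3;  // 1110xxxx 10xxxxxx 10xxxxxx
        return 4;                    // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    }

    // Hand-rolled UTF-8 encoding of a single code point (no validation).
    public static byte[] encode(int cp) {
        switch (utf8Length(cp)) {
            case 1:
                return new byte[] {(byte) cp};
            case 2:
                return new byte[] {
                    (byte) (0xC0 | (cp >> 6)),
                    (byte) (0x80 | (cp & 0x3F))};
            case 3:
                return new byte[] {
                    (byte) (0xE0 | (cp >> 12)),
                    (byte) (0x80 | ((cp >> 6) & 0x3F)),
                    (byte) (0x80 | (cp & 0x3F))};
            default:
                return new byte[] {
                    (byte) (0xF0 | (cp >> 18)),
                    (byte) (0x80 | ((cp >> 12) & 0x3F)),
                    (byte) (0x80 | ((cp >> 6) & 0x3F)),
                    (byte) (0x80 | (cp & 0x3F))};
        }
    }

    public static void main(String[] args) {
        int[] samples = {0x41, 0xE9, 0x4E2D, 0x1F600};
        for (int cp : samples) {
            byte[] expected = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase()
                    + " -> " + utf8Length(cp) + " byte(s), matches JDK: "
                    + Arrays.equals(encode(cp), expected));
        }
    }
}
```

An English letter stays at one byte, while a common Chinese character takes three, one more than in UTF-16. That is the trade-off between the two encodings.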


The next article will introduce the scenarios in Java that require encoding.

This article draws on the book "In-depth analysis of Java Web Insider".
