Very detailed and very good; reposted here for study. Reprinted from: http://www.cnblogs.com/lidabo/archive/2013/11/27/3446518.html 1. Character encoding and internal codes, with a brief introduction to Chinese character encoding. Characters must be encoded before they
Basics of UTF-8 Character Set
Brief character set history
Of all character sets, the best known is the 7-bit ASCII character set. ASCII is short for American Standard Code for Information Interchange. It was designed for American
1. Prerequisites. 1. Character: the smallest unit of abstract text. It has no fixed shape (a particular shape is a glyph in a font) and no value. "A" is a character, and "€" (the currency symbol used by Germany, France, and many other European countries) is also
Google's Sitemap service requires that all site maps published must be encoded in Unicode UTF-8. Google does not even allow other Unicode encodings (such as UTF-16), not to mention non-Unicode encodings such as ISO-8859-1. Technically, this means
Web applications must meet the needs of multiple languages. Users in different countries should be able to enter characters in their own languages, and Web applications should be able to display pages in multiple languages according to different
In the past two days, I took the time to summarize/sort out the actual encoding methods and usage of various encodings in Java applications. I will record them here for future reference. In order to form a complete understanding and in-depth
Preface: HTTP messages can carry content in any language, just as they can carry images, movies, or any other type of media. For HTTP, the entity body is just a container for binary information. In order to support international content, the server needs
This article assumes you already distinguish between a Unicode code (that is, a code point) and the encoding forms that implement Unicode. Otherwise, what follows will make little sense.
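As a minimal sketch of that distinction, the Python snippet below takes one character and prints its code point next to the bytes produced by three encoding forms. The sample character "€" and the variable names are illustrative assumptions, not taken from the article.

```python
# Illustrative sketch: one code point, several byte-level encodings.
# The sample character is an assumption chosen for the example.
ch = "€"

# The code point is an abstract number assigned by Unicode.
code_point = ord(ch)

# Encoding forms turn that number into concrete bytes.
utf8 = ch.encode("utf-8")        # variable length, 1 to 4 bytes
utf16 = ch.encode("utf-16-be")   # 2 or 4 bytes, big-endian, no BOM
utf32 = ch.encode("utf-32-be")   # always 4 bytes, big-endian, no BOM

print(f"code point: U+{code_point:04X}")
print("UTF-8 :", utf8.hex())
print("UTF-16:", utf16.hex())
print("UTF-32:", utf32.hex())
```

The code point stays the same in every case; only the byte sequence changes with the encoding form.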
History
We know that the ISO 10646 committee
ANSI, UTF-8, and Unicode are three encoding formats for character codes. A character can be encoded in ANSI, UTF-8, or Unicode format; the three formats differ only in representation and express the same content.
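A rough sketch of this idea in Python follows. It rests on two assumptions drawn from Windows conventions rather than from the text: "ANSI" is taken to mean the system code page (GBK is used here as an example, as on Simplified Chinese Windows), and "Unicode" is taken to mean UTF-16 little-endian. The sample character is also an assumption.

```python
# Sketch: the same character stored in three formats, then decoded back.
# Assumptions: "ANSI" = GBK code page, "Unicode" = UTF-16LE (Windows usage).
text = "中"

codecs_to_try = ["gbk", "utf-8", "utf-16-le"]

# Encode the same text with each codec; the bytes differ.
encoded = {name: text.encode(name) for name in codecs_to_try}

# Decode each byte sequence with the matching codec; the text is identical.
decoded = {name: data.decode(name) for name, data in encoded.items()}

for name in codecs_to_try:
    print(f"{name:10s} bytes={encoded[name].hex()} text={decoded[name]}")
```

Decoding with the wrong codec, for instance reading the GBK bytes as UTF-8, is what produces garbled text, so the format must be known or declared alongside the bytes.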
ANSI, UTF-8, Unicode