'Compilation', 'compilation': why the byte array lengths obtained are not the same
http://www.cnblogs.com/yongdaimi/p/5899328.html
Unicode official website
http://unicode.org/
UTF-8 Chinese character lookup table
http://blog.chinaunix.net/uid-25544300-id-3281847.html
Reference on internal codes and external codes
https://www.zhihu.com/question/27562173
Code unit and code point
http://www.jianshu.com/p/a7db6ac53d57
Encoding issues; written in great detail, but I do not fully understand it yet
http://www.fmddlmyy.cn/text6.html
Unicode, also known as UCS
The formal name of Unicode is "Universal Multiple-Octet Coded Character Set", abbreviated UCS. UCS can also be read as an abbreviation for "Unicode Character Set".
UCS only defines how characters are assigned codes; it does not specify how those codes are transmitted or stored. For example, the UCS code for the character "汉" (Han) is 6C49. I could transmit and store it as the four ASCII characters "6C49", or I could encode it in UTF-8 as the three consecutive bytes E6 B1 89. The key is that both communicating parties agree on the scheme. UTF-8, UTF-7 and UTF-16 are widely accepted schemes. A particular benefit of UTF-8 is that it is fully compatible with ASCII: every ASCII character is encoded as the same single byte (the upper half of ISO-8859-1 is not byte-compatible, since those characters take two bytes in UTF-8). UTF stands for "UCS Transformation Format".
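The 6C49 example above can be checked in Java, which also shows why the byte array length of the same string depends on the chosen charset. This is only a minimal sketch: the class name HanEncodingDemo and the helper toHex are made up for illustration, and it uses nothing beyond the standard String.getBytes(Charset) and StandardCharsets APIs.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name and the helper are illustrative only.
class HanEncodingDemo {

    // Formats a byte array as space-separated upper-case hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String han = "\u6C49"; // the character with UCS code 6C49

        // The code point is independent of any byte encoding.
        System.out.println(Integer.toHexString(han.codePointAt(0))); // 6c49

        // Same character, different encodings, different byte array lengths.
        byte[] utf8 = han.getBytes(StandardCharsets.UTF_8);       // E6 B1 89
        byte[] utf16be = han.getBytes(StandardCharsets.UTF_16BE); // 6C 49
        byte[] utf16 = han.getBytes(StandardCharsets.UTF_16);     // FE FF 6C 49 (byte order mark first)

        System.out.println(toHex(utf8) + "  length=" + utf8.length);
        System.out.println(toHex(utf16be) + "  length=" + utf16be.length);
        System.out.println(toHex(utf16) + "  length=" + utf16.length);

        // "Four ASCII characters": the code written out as hex digits.
        byte[] asText = "6C49".getBytes(StandardCharsets.US_ASCII); // 36 43 34 39
        System.out.println(toHex(asText) + "  length=" + asText.length);
    }
}
```

Calling getBytes() with no argument uses the platform default charset, so the length can differ between machines; passing the charset explicitly makes the result predictable.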
In-depth notes on Java characters, still to be organized