// In theory a UTF-8 encoded character can be up to 6 bytes long, but at present four bytes
// are enough to cover every script and symbol in use worldwide (and the current UTF-8
// definition, RFC 3629, allows no more than four).
// UTF-8 encodes the original code in units of 8 bits, i.e. 1 byte. (Note: "original code"
// here always means the Unicode code.) For a multi-byte code (2 bytes or more), the number
// of consecutive "1" bits at the start of the first byte, called the flag bits, gives the
// number of bytes in the encoded result: "110" has two consecutive "1" bits and means
// 2 bytes, "1110" means 3 bytes, "11110" means 4 bytes, and so on. The "0" that follows
// the flag bits separates them from the character code bits. The first two bits of the
// 2nd to 4th bytes are fixed to "10", which also acts as a flag; the remaining 6 bits of
// each of these bytes are used as character code bits.
// A 2-byte UTF-8 code therefore has 11 character code bits (5 + 6), enough for the original
// codes 0080 to 07FF. A 3-byte code has 16 character code bits (4 + 6 + 6), enough for
// 0800 to FFFF, and so on. The encoding template is as follows:
//
// Original code (hexadecimal)   UTF-8 encoding (binary)
// --------------------------------------------------------
// 0000-007F                     0xxxxxxx
// 0080-07FF                     110xxxxx 10xxxxxx
// 0800-FFFF                     1110xxxx 10xxxxxx 10xxxxxx
// ......
// --------------------------------------------------------
//
// Each "x" in the template stands for one character code bit.
// ASCII codes are at most 007F, so they are always encoded as a 1-byte UTF-8 code.
// The commonly used Chinese characters (U+4E00 to U+9FFF) lie within 0800-FFFF, so they
// are encoded as a 3-byte UTF-8 code.
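//
// The range rules above can be sketched in C as two small helpers. This is only an
// illustration: the names utf8_encoded_len and utf8_seq_len are hypothetical, and the
// checks look at ranges and bit patterns only (surrogates and overlong forms are not
// rejected).

```c
#include <stdint.h>

/* Number of UTF-8 bytes needed for a Unicode code point, following the
 * ranges in the template. Returns 0 for values above U+10FFFF. */
int utf8_encoded_len(uint32_t cp)
{
    if (cp <= 0x007F)   return 1;  /* 0000-007F: ASCII       */
    if (cp <= 0x07FF)   return 2;  /* 0080-07FF: 11 bits fit */
    if (cp <= 0xFFFF)   return 3;  /* 0800-FFFF: 16 bits fit */
    if (cp <= 0x10FFFF) return 4;  /* 21 bits available      */
    return 0;
}

/* Total length of a UTF-8 sequence judged from its first byte alone, by
 * reading the flag bits. Returns 0 for a continuation byte (10xxxxxx)
 * or any other pattern that cannot start a sequence. */
int utf8_seq_len(uint8_t lead)
{
    if ((lead & 0x80) == 0x00) return 1;  /* 0xxxxxxx */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;
}
```

// The first helper serves the encoding side (how many bytes to emit), the second the
// decoding side (how many bytes to read after seeing the first one).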
// For example, the Unicode code of the Chinese character 汉 is 6C49. Since 6C49 lies
// within 0800-FFFF, the three-byte template applies: 1110wwww 10xxxxyy 10yyzzzz.
//
//    6    C    4    9
// 0110 1100 0100 1001
// wwww xxxx yyyy zzzz
//
// wwww     xxxxyy   yyzzzz
// 1110wwww 10xxxxyy 10yyzzzz
// 11100110 10110001 10001001
//    E6       B1       89
//
// So the UTF-8 code of 汉 is E6 B1 89.
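//
// The same bit shuffling can be written as a short encoder. This is a minimal sketch,
// not a production routine: utf8_encode is a hypothetical name, the caller is assumed
// to supply a 4-byte buffer, and surrogate code points (D800-DFFF) are not rejected.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point into out[] using the templates above.
 * Returns the number of bytes written, or 0 if cp is above U+10FFFF. */
size_t utf8_encode(uint32_t cp, uint8_t out[4])
{
    if (cp <= 0x7F) {                       /* 0xxxxxxx */
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp <= 0x7FF) {                      /* 110xxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF) {                     /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xF0 | (cp >> 18));
        out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (uint8_t)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                               /* outside the Unicode range */
}
```

// Calling utf8_encode(0x6C49, buf) returns 3 and fills buf with E6 B1 89, which matches
// the hand calculation for 6C49.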