In programming, these three character encoding formats are often converted to each other, so that many third-party libraries fail to call for unknown reasons, in fact, many are because third-party libraries support UTF-8 rather than Windows Default support of the UTF-16 caused.
The following describes the three common character encoding methods in windows.
Gb2312
Label: Style Color SP problem BS as TT program nbsp
Represent "Han" with UTF-16"
If the UTF-16 is expressed as 01101100 01001001 (16 bits in total, two bytes). When the program parses the known is the UTF-16, the two bytes as a unit to parse. This is very simple.
Represent "Han" with UTF-
The Perfect 2 php detection string is utf-8 encoded function sharing, string utf-8
In PHP development sometimes use transcoding functions, such as iconv (), mb_convert_encoding () function, when using the function transcoding or decoding time we sometimes need to determine the current string encoding type, rather than
Document directory
ASCII
GB2312
GBK
UTF Encoding
Setlocale
Miserable programmer
ASCII
Table language English and Western European language.
ASCII is represented by 7 characters, which can represent 128 characters. Its Extension uses 8 characters to represent 256 characters.
ASCII from 00 to 7F, extended from 00 to FF.
GB2312
Simplified Chinese character set. Compatible with ASCII codes,
It is ex
UTF-8 format:
Note: X represents a value of 0 or 1. The range field is in hexadecimal notation, And the encoding form field is in binary notation.
Range encoding format
0x000000000000-0x0000007f 0 xxxxxxx
0x00000080-0x000007ff 110 XXXXX, 10 xxxxxx
0x00000800-0x0000ffff 1110 XXXX, 10 xxxxxx, 10 xxxxxx
0x0000000-0x0010ffff 11110xxx, 10 xxxxxx, 10 xxxxxx, 10 xxxxxx
The
UNICODE: The encoding mechanism developed by unicode.org should include common texts all over the world.In 1.0, It is a 16-bit code, from u + 0000 to U + FFFF. each 2byte Code corresponds to one character. At the beginning of 2.0, the 16-bit limit was abandoned. The original 16-bit is used as the basic bit plane, and the 16-bit plane is added, which is equivalent to 20-bit encoding, the encoding range is 0 to 0x10ffff.
UCs:The universal character set defined in iso000046 according to ISO, which
C + + Builder comes with JSON processing that supports UNICODE (UTF-8/UTF-16):Http://www.cppfans.com/cbknowledge/skills/files/json.aspJSON parsing and generation-file handling? JSON parsing methods and rules, reading JSON-specific values, the inheritance relationship of JSON classes? JSON class tjsonvalue| Tjsonobject| tjsonpair| tjsonarray| Tjsonstring| tjsonnum
What is the difference between Unicode, UTF-8, and ISO8859-1? utf-8iso8859-1
Note: This article is reproduced on Sina Blog to facilitate knowledge summarization. Address: http://blog.sina.com.cn/s/blog_673c81990100t1lc.html
This article mainly includes the following aspects: Basic coding knowledge, java, system software, url, tool software, etc.
In the following
The original objective of Unicode is to use a 16-bit encoding to provide ing for over 65000 characters. However, this is not enough. It cannot cover all historical texts or solve the implantation head-ache problem, especially in network-based applications. The existing software must do a lot of work to program 16-bit data.Therefore, Unicode uses three encoding methods with some basic reserved characters. They are UTF-
In an xss problem encountered in this encoding problem, utf-7 and mutf-7 this encoding is used by mail, javamail should be supported, using utf-7 this is still relatively rare encoding can be used for some xss applications. Using Native jdk reports an error that does not support utf-7 encoding, which is a bug in sun jdk. Open-source jcharset. jar is a solution. Y
When your use gem is the install package, the Result:ERROR:While executing gem ... (Encoding::undefinedconversionerror)U+7CFB to IBM437 in conversion from Utf-16le to UTF-8 to IBM437Workaround:Go to local path to modify the code, so the hint is due to inconsistent coding.To modify the encoding in the registry file:There will be no problem after the change.Error:w
PHP implementation Utf-8 to Unicode function sharing, Utf-8unicode
The code is very simple, the function is very practical, recommend to everybody.
The Code First:
Copy CodeThe code is as follows:Public Function Utf8_unicode ($STR) {$unicode = Array ();$values = Array ();$lookingFor = 1;for ($i = 0; $i $thisValue = Ord ($str [$i]);if ($thisValue Exclude 0-9if ($
Mysql Default encoding for UTF-8 by modifying the my. ini implementation method, utf-8my.ini
Why mysql Chinese characters are garbled
Mysql Default encoding is Latin1 is the alias of the I-8859-1, but Latin1 does not support Chinese characters, so you need to change it to UTF-8
VIM display UTF-8 document garbled solution, vimutf-8
1. Basic Knowledge
In Vim, there are four encoding-related options: fileencodings, fileencoding, encoding, and termencoding. In actual use, any option error may cause garbled characters. Therefore, each Vim user should clarify the meaning of these four options. The following describes in detail the meanings a
The implementation method of UTF-8 coding through javascript, javascriptutf-8
Javascript character set:
Javascript programs are written using the Unicode Character Set. Unicode is a superset of ASCII and Latin-1, and supports almost all languages on the earth. ECMAScript3 requires JavaScript to support Unicode2.1 and later versions, while ECMAScript5 requires Uni
I wrote it in Notepad, after the save on the Web page ran unexpectedly garbled. Replace with GB2312 to display Chinese correctly.
But after all, the standards are different. Still need to use the Utf-8.Finally found that the code is not a problem, the problem is out of Notepad .
PHP how to achieve Unicode and Utf-8 encoding mutual conversion, unicodeutf-8
Recently, I used unicode encoding conversion. I checked the php library function and did not find a function that can encode and decode the string! If you cannot find it, implement it yourself...Differences between Unicode and Utf-
becoming increasingly inconspicuous and gradually filled out. The text that is not included is generally not implemented in any other character set, and cannot be used in XML even if it is included. In the best case, the text is transferred to a single-byte character set such as Latin-1 through the font borrow. Real support for such rare texts may come from Unicode first, and only Unicode supports them.
However, this is only one reason to use Unicode. Why choose
Character encoding ASCII, Unicode and UTF-8, asciiutf-8
Http://blog.csdn.net/pipisorry/article/details/42387045
ASCII code
The ASCII code consists of A total of 128 characters. For example, the SPACE is 32 (Binary 00100000), and the uppercase letter A is 65 (Binary 01000001 ). These 128 symbols (including 32 control symbols that cannot be printed) only occupy the
The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion;
products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the
content of the page makes you feel confusing, please write us an email, we will handle the problem
within 5 days after receiving your email.
If you find any instances of plagiarism from the community, please send an email to:
info-contact@alibabacloud.com
and provide relevant evidence. A staff member will contact you within 5 working days.