You often hear people say that inconsistent encoding is what causes garbled text. That is the standard answer, but not necessarily the answer you want, because it still doesn't help you understand! So let's make it a little clearer.
Do you know how Chinese characters are transmitted over a network? Take the word "China" (中国): as you can probably guess, what travels during transmission is certainly not the characters themselves but bytes, that is, binary numbers made of 0s and 1s.
The sender has to convert "China" into a binary number with some encoding before transmitting it, and once the receiver has that binary number it decodes it back into "China" with the corresponding encoding. The question is whether both sides use the same encoding.
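Here is a minimal Python sketch of that round trip, assuming the word "China" in the text is 中国. The helper names `send` and `receive` are made up for illustration, and no real network is involved; the bytes are simply passed from one function to the other.

```python
def send(text: str, encoding: str) -> bytes:
    """Sender side: turn characters into bytes using the given encoding."""
    return text.encode(encoding)


def receive(data: bytes, encoding: str) -> str:
    """Receiver side: turn bytes back into characters.

    errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    """
    return data.decode(encoding, errors="replace")


original = "中国"
wire = send(original, "gbk")

same_encoding = receive(wire, "gbk")     # both sides agree on GBK
mixed_encoding = receive(wire, "utf-8")  # receiver wrongly assumes UTF-8

print(same_encoding == original)   # True
print(mixed_encoding == original)  # False
```

When both sides use GBK the text survives; when the receiver assumes UTF-8 the same bytes come out as something else.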
What is an encoding? Have you heard of character sets? Common ones include UTF-8, GBK, GB2312, ISO-8859-1 and the like (strictly speaking, UTF-8 is an encoding of the Unicode character set, but the names are often used loosely). Each one defines a mapping between the characters it contains and bytes, and a single byte is 8 bits. So you can think of a character set as a formula or a mapping table for converting between characters and binary numbers.
"China" with GBK to binary number is 11010110110100001011100111111010 (in GBK each Chinese character corresponds to two bytes, each byte corresponds to a 8-bit binary number, so this string number should be 32 bits, corresponding to "China" two Chinese characters, Don't believe you can count, hehe. ), and then this string of binary number with UTF-8 converted into characters, it is certainly not the original "China" (because in the UTF-8 character set, a character corresponds to three bytes, so with the toe can be guessed definitely not back to "China"). This is what I understand the cause of the Chinese garbled.
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
A brief introduction to the causes of Chinese garbled characters