Fixing lost and garbled characters when converting UTF-8 to GBK in PHP
In PHP, converting a UTF-8 string to GBK or GB2312 can produce garbled output or lose characters, because UTF-8 can encode characters that fall outside the GBK encoding range. The encoding ranges for GBK and UTF-8 listed below show why.
I. Encoding ranges
1. GBK (GB2312/GB18030)
\x00-\xff (GBK double-byte encoding range)
\x20-\x7f (ASCII)
\xa1-\xff (Chinese)
\x80-\xff (Chinese)
2. UTF-8 (Unicode)
\u4e00-\u9fa5 (Chinese)
\u3130-\u318f (Korean)
\uac00-\ud7a3 (Korean)
\u0800-\u4e00 (Japanese)
PS: the Korean Hangul syllables (\uac00-\ud7a3) lie above \u9fa5.
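To see which characters in a given string fall outside GBK, one option is to test each character with a round trip. The sketch below is an illustration only: `chars_not_in_gbk` is a hypothetical helper name, not part of the original article, and it assumes the source file is saved as UTF-8 and that the installed iconv supports GBK. It compares each character with its UTF-8 -> GBK -> UTF-8 round trip instead of relying on the return value alone, since iconv's failure behavior differs between platforms and PHP versions.

```php
<?php
// Hypothetical helper (not from the original article): returns the distinct
// characters of a UTF-8 string that do not survive a UTF-8 -> GBK -> UTF-8
// round trip, i.e. characters GBK cannot represent.
function chars_not_in_gbk(string $utf8): array
{
    $missing = [];
    // Split into individual UTF-8 characters (the /u flag makes PCRE UTF-8 aware).
    // preg_split() returns false on invalid UTF-8, hence the "?: []" fallback.
    $chars = preg_split('//u', $utf8, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($chars ?: [] as $ch) {
        // @ suppresses the notice iconv raises for unconvertible characters.
        $gbk = @iconv('UTF-8', 'GBK', $ch);
        $back = (is_string($gbk) && $gbk !== '') ? @iconv('GBK', 'UTF-8', $gbk) : false;
        if ($back !== $ch) {
            $missing[] = $ch;
        }
    }
    return array_values(array_unique($missing));
}

print_r(chars_not_in_gbk('Test • 中文'));
```

Running such a check on input before conversion makes it possible to decide whether to drop, replace, or reject the offending characters.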
Example
The code is as follows:
$c = 'Test • character transfer • Happy May Day!';
echo iconv('utf-8', 'gbk', $c);
Output: only the text before the first "•" ("Test"); everything after it is lost.
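Solution:
Append //IGNORE to the target encoding, so that characters GBK cannot represent are dropped instead of aborting the conversion.
The code is as follows:
$c = 'Test • character transfer • Happy May Day!';
echo iconv('utf-8', 'gbk//IGNORE', $c);
Output: the full text, with the two "•" characters removed.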
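The two calls above can be combined into one wrapper that converts strictly first and only falls back to the lossy //IGNORE form when needed. This is a minimal sketch under stated assumptions: `utf8_to_gbk` is a hypothetical name, the source file is assumed to be saved as UTF-8, and because iconv behavior varies between platforms the function returns null if even the lossy call fails.

```php
<?php
// Hypothetical wrapper (an illustration, not the article's code).
// Returns the GBK-encoded bytes, or null when no conversion succeeds.
function utf8_to_gbk(string $utf8): ?string
{
    // 1. Strict conversion: accepted only if the result converts back to the
    //    original, which also rules out a silently truncated result.
    $gbk = @iconv('UTF-8', 'GBK', $utf8);
    if ($gbk !== false && @iconv('GBK', 'UTF-8', $gbk) === $utf8) {
        return $gbk;
    }
    // 2. Lossy conversion: drop characters that GBK cannot represent.
    $gbk = @iconv('UTF-8', 'GBK//IGNORE', $utf8);
    return $gbk === false ? null : $gbk;
}

$result = utf8_to_gbk('Test • character transfer • Happy May Day!');
var_dump($result !== null);
```

Trying the strict conversion first means the caller only loses characters when there is no lossless alternative.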
Example 2
The code is as follows:
<?php
echo $str = "Hi, it's coffee sale!";
echo ' ';
echo iconv('gb2312', 'utf-8', $str); // convert the string from GB2312 to UTF-8
echo ' ';
echo iconv_substr($str, 1, 1, 'utf-8'); // extract by character count rather than byte count
print_r(iconv_get_encoding()); // show iconv's current input, output, and internal encoding settings
echo iconv_strlen($str, 'utf-8'); // length of the string in characters for the given encoding
?>
...