PHP code generally uses the mb_detect_encoding function to detect a string's encoding, but many people have run into misdetection, for example between GB2312 and UTF-8, or between UTF-8 and GBK (which here mainly means CP936). The usual explanation found online is that when the string is short, mb_detect_encoding can return the wrong result.
For example:
$encode = mb_detect_encoding($keytitle, array("ASCII", "UTF-8", "GB2312", "GBK", "BIG5"));
if ($encode == "UTF-8") {
    $keytitle = iconv("UTF-8", "GBK", $keytitle);
}
The purpose of this code is to check whether the string is encoded as UTF-8 and, if so, convert it to GBK.
But when $keytitle = '%d0%be%c6%ac' (URL-encoded; once decoded, the bytes D0 BE C6 AC), the detected encoding is UTF-8, because those four bytes happen to be well-formed in UTF-8 as well as in GBK. This is not really a bug. Programs should not rely too heavily on mb_detect_encoding: the shorter the string, the more likely the detection result is off.
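To see why the detector has so little to go on here, it may help to check the decoded bytes against each candidate encoding directly. The following is a minimal sketch, assuming the mbstring extension is loaded; the helper name valid_encodings is made up for illustration and is not part of PHP.

```php
<?php
// Hypothetical helper: list every candidate encoding in which $bytes is well-formed.
function valid_encodings(string $bytes, array $candidates): array
{
    $valid = [];
    foreach ($candidates as $encoding) {
        // mb_check_encoding() only answers "is this byte sequence legal in $encoding?"
        if (mb_check_encoding($bytes, $encoding)) {
            $valid[] = $encoding;
        }
    }
    return $valid;
}

// urldecode() turns '%d0%be%c6%ac' into the four raw bytes D0 BE C6 AC.
$keytitle = urldecode('%d0%be%c6%ac');

// More than one encoding accepts the same bytes, so validity alone cannot decide.
print_r(valid_encodings($keytitle, ['ASCII', 'UTF-8', 'GB2312', 'GBK']));
```

Because several candidates accept the same four bytes, any detector has to fall back on ordering or heuristics to pick one.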
How to solve it? My approach is:
$encode = mb_detect_encoding($keytitle, array('ASCII', 'GB2312', 'GBK', 'UTF-8'));
The function takes three parameters: the input string to examine, the ordered list of candidate encodings (once one matches, the ones after it are ignored), and the strict-mode flag.
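As a sketch of the full three-argument form, the snippet below passes the candidate list together with strict mode switched on. It assumes the mbstring extension is loaded, and detect_strict is a hypothetical wrapper name. Which encoding is reported for ambiguous input can differ between PHP versions, so no expected output is shown.

```php
<?php
// Hypothetical wrapper around the three-argument form of mb_detect_encoding().
function detect_strict(string $text): string
{
    $candidates = ['ASCII', 'GB2312', 'GBK', 'UTF-8']; // candidate list, in priority order
    $encoding = mb_detect_encoding($text, $candidates, true); // third argument: strict mode
    // mb_detect_encoding() returns false when no candidate fits the input.
    return $encoding === false ? 'unknown' : $encoding;
}

// The result may be reported under mbstring's own name for the encoding
// (for example CP936 for GBK), so compare against those names with care.
echo detect_strict(urldecode('%d0%be%c6%ac')), "\n";
echo detect_strict('plain ascii'), "\n";
```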
Adjust the order of the candidate encodings so that the most likely one comes first; this reduces the chance of a wrong conversion.
Generally, put GB2312 first. When both GBK and UTF-8 are candidates, put whichever one is more common in your data ahead of the other.
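The effect of ordering can be sketched without mb_detect_encoding at all: walk the candidate list in priority order and take the first encoding in which the bytes are well-formed. This is a substitute technique for illustration, not the call used above. It assumes the mbstring extension is loaded, and first_valid_encoding is a hypothetical name.

```php
<?php
// Hypothetical helper: return the first candidate in which $bytes is well-formed,
// or null when none fits. Earlier entries win, so the order expresses which
// encoding you consider most likely for your data.
function first_valid_encoding(string $bytes, array $priority): ?string
{
    foreach ($priority as $encoding) {
        if (mb_check_encoding($bytes, $encoding)) {
            return $encoding;
        }
    }
    return null;
}

$keytitle = urldecode('%d0%be%c6%ac'); // raw bytes D0 BE C6 AC

// UTF-8 listed before the Chinese encodings: the ambiguous bytes are claimed by UTF-8.
echo first_valid_encoding($keytitle, ['ASCII', 'UTF-8', 'GB2312', 'GBK']), "\n";

// GB2312 listed first: the same bytes are now treated as GB2312.
echo first_valid_encoding($keytitle, ['ASCII', 'GB2312', 'GBK', 'UTF-8']), "\n";
```

A loop like this only checks validity, so it is exactly as ambiguous as the input; the ordering is what carries your knowledge of where the data came from.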