When using PHP's mb_detect_encoding function to detect the encoding of a string, many people run into misidentification, for example between GB2312 and UTF-8, or between UTF-8 and GBK (here this mainly means CP936). The usual explanation found online is that when the string is short, mb_detect_encoding is prone to misjudging it.
For example:
The code is as follows:
$encode = mb_detect_encoding($keytitle, array("ASCII", "UTF-8", "GB2312", "GBK", "BIG5"));
if ($encode == "UTF-8") {
    $keytitle = iconv("UTF-8", "GBK", $keytitle);
}
The purpose of this code is to detect whether a string is encoded as UTF-8 and, if so, convert it to GBK.
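But when $keytitle is a short GBK string whose bytes are %d0%be%c6%ac (shown here in URL-encoded form), the detected encoding is UTF-8. This is not really a bug: those same bytes also happen to form a valid UTF-8 sequence. Programs should not depend too heavily on mb_detect_encoding, because the shorter the string, the more likely the detection result is wrong.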
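The ambiguity can be checked directly. The following is a minimal sketch, not part of the original code: the helper name looks_like_utf8 is hypothetical, and it relies on preg_match() with the /u modifier rejecting input that is not valid UTF-8. It decodes the example bytes and shows that they pass a UTF-8 validity check even though they were meant as GBK.

```php
<?php
// Hypothetical helper: true when $s is a well-formed UTF-8 byte sequence.
// preg_match() returns false for a /u pattern if the subject is not valid UTF-8.
function looks_like_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

// The example value from the text, turned back into raw bytes.
$bytes = rawurldecode('%d0%be%c6%ac');

var_dump(bin2hex($bytes));         // the four raw bytes in hex
var_dump(looks_like_utf8($bytes)); // they also parse as UTF-8
```

Because the bytes are well formed in more than one encoding, no detector can decide between the candidates from validity alone; it has to fall back on order or on guesswork.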
How to solve it? My approach is:
$encode = mb_detect_encoding($keytitle, array('ASCII', 'GB2312', 'GBK', 'UTF-8'));
The three parameters are: the input string to be checked, the list of candidate encodings in detection order (once one matches, the remaining ones are ignored), and strict mode.
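To make the three arguments concrete, here is a minimal sketch, assuming the mbstring extension is loaded. The sample input is an illustration rather than something from the original, and EUC-CN and CP936 are used on the assumption that they are the names mbstring itself uses for GB2312 and GBK.

```php
<?php
// Candidate encodings, in the order they should be preferred.
// Assumption: EUC-CN and CP936 are mbstring's names for GB2312 and GBK.
$candidates = ['ASCII', 'EUC-CN', 'CP936', 'UTF-8'];

// Sample input: the single character "中" encoded as UTF-8 (three bytes).
// The trailing byte has no partner, so it cannot be a complete GBK string.
$input = "\xE4\xB8\xAD";

// 1st argument: the string to inspect
// 2nd argument: the candidate list
// 3rd argument: strict mode (reject candidates in which the string is invalid)
$encode = mb_detect_encoding($input, $candidates, true);

var_dump($encode);
```

With strict mode on, a candidate is only reported if the whole string is valid in it, which is why an unambiguous input like this one is safe to rely on while the four-byte example above is not.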
Adjust the detection order so that the most likely encoding comes first, which reduces the chance of a wrong conversion.
Generally GB2312 should be listed first; when both GBK and UTF-8 are in the list, put the one you encounter more often in front.
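The ordering idea can be wrapped in a small function so that the priority list is explicit and easy to change. This is only a sketch under stated assumptions: to_gbk_if_utf8 is a hypothetical name, mbstring is assumed to be loaded, EUC-CN and CP936 are assumed to be mbstring's names for GB2312 and GBK, and mb_convert_encoding is swapped in for the iconv call used earlier.

```php
<?php
// Hypothetical helper: detect using a caller-supplied priority order and
// convert to GBK (CP936) only when the detected encoding is UTF-8.
function to_gbk_if_utf8(
    string $text,
    array $order = ['ASCII', 'EUC-CN', 'CP936', 'UTF-8']
): string {
    // Strict mode: only encodings in which $text is fully valid are considered.
    $detected = mb_detect_encoding($text, $order, true);

    if ($detected === 'UTF-8') {
        return mb_convert_encoding($text, 'CP936', 'UTF-8');
    }

    // ASCII, an already-GB encoding, or no match at all: leave the text alone.
    return $text;
}

// Usage: "中" in UTF-8 is converted, plain ASCII passes through unchanged.
var_dump(bin2hex(to_gbk_if_utf8("\xE4\xB8\xAD")));
var_dump(to_gbk_if_utf8('abc'));
```

If UTF-8 input is the common case in a given application, passing a list with 'UTF-8' ahead of the GB encodings expresses the opposite preference. The PHP manual also notes that from PHP 8.1 the function weighs the valid candidates heuristically, so on newer versions the list order may carry less weight than it does in the behavior described here.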