This article presents PHP code for determining whether a string is encoded as UTF-8; readers who need it can use it as a reference.
We used to detect character encoding with the mb_detect_encoding() function.
The code is as follows:
    // Determine the encoding of $tag with a conversion round trip
    if ($tag === mb_convert_encoding(mb_convert_encoding($tag, "GB2312", "UTF-8"), "UTF-8", "GB2312")) {
        // $tag survives UTF-8 -> GB2312 -> UTF-8 unchanged, so treat it as UTF-8
    } else {
        // Otherwise treat it as GB2312 and convert it to UTF-8
        $tag = mb_convert_encoding($tag, 'UTF-8', 'GB2312');
    }
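With $keytitle = "%d0%be%c6%ac"; the detected encoding comes out as UTF-8. This is not really a bug: once URL-decoded, these four bytes also happen to form a well-formed UTF-8 sequence, so the input is genuinely ambiguous. Programs should not depend too heavily on mb_detect_encoding(); when the string is short, the detection result is very likely to be wrong.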
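To make the ambiguity concrete, here is a minimal sketch that decodes the sample string and checks it with mb_check_encoding(). It assumes the mbstring extension is loaded. The variable $validUtf8 is introduced only for illustration.

```php
<?php
// Decode the URL-encoded sample from the article into its four raw bytes.
$keytitle = rawurldecode("%d0%be%c6%ac");

// 0xD0 0xBE and 0xC6 0xAC are each a well-formed two-byte UTF-8 sequence,
// so a UTF-8 validity check passes even if the text was meant as GB2312.
$validUtf8 = mb_check_encoding($keytitle, 'UTF-8');

var_dump(strlen($keytitle)); // int(4)
var_dump($validUtf8);        // bool(true)
```

Because the bytes are acceptable under more than one encoding, no detector can decide from the bytes alone; it can only guess.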
How can this be solved? My approach is as follows:
    $encode = mb_detect_encoding($keytitle, array('ASCII', 'GB2312', 'GBK', 'UTF-8'));
The parameters are, in order: the input string to examine; the list of candidate encodings in the order they should be tried (once one matches, the remaining ones are skipped); and a strict-mode flag.
Adjust the detection order so that the most probable encoding comes first, which reduces the chance of a wrong conversion.
If the method above still does not solve the problem, here is another solution I found.
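The effect of candidate order can be sketched as follows. The helper first_valid_encoding() is hypothetical, not part of mbstring or of the original article. It spells out "first match wins" with mb_check_encoding(), so the outcome does not depend on how a particular PHP version implements mb_detect_encoding(). The call to mb_detect_encoding() with strict mode enabled is shown for comparison. Its return value may differ between PHP versions, so no output is given for it.

```php
<?php
// Hypothetical helper (not from the article): return the first candidate
// encoding for which the bytes are well-formed, or null if none match.
function first_valid_encoding(string $bytes, array $candidates): ?string
{
    foreach ($candidates as $encoding) {
        if (mb_check_encoding($bytes, $encoding)) {
            return $encoding;
        }
    }
    return null;
}

$keytitle = rawurldecode("%d0%be%c6%ac");

// The article's call, with the third argument (strict mode) set to true.
$detected = mb_detect_encoding($keytitle, array('ASCII', 'GB2312', 'GBK', 'UTF-8'), true);

// Same bytes, two different priority orders.
$gbFirst   = first_valid_encoding($keytitle, array('ASCII', 'GB2312', 'GBK', 'UTF-8'));
$utf8First = first_valid_encoding($keytitle, array('ASCII', 'UTF-8', 'GB2312', 'GBK'));

var_dump($gbFirst);   // string(6) "GB2312"
var_dump($utf8First); // string(5) "UTF-8"
```

The same four bytes get two different answers depending only on the order, which is why the most likely encoding for your data should be listed first.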
Example 1
The code is as follows:
    // Heuristic: returns true if $word looks like UTF-8 Chinese text, that is,
    // it contains a three-byte sequence (lead byte 0xE4-0xE9 followed by two
    // continuation bytes 0x80-0xBF) at the start, at the end, or two or more
    // in a row anywhere. It is not a general UTF-8 validity check.
    function is_utf8($word)
    {
        // One three-byte character, matched byte-wise (no /u modifier).
        $char = "[\xE4-\xE9][\x80-\xBF][\x80-\xBF]";

        return preg_match("/^(?:$char)/", $word) === 1
            || preg_match("/(?:$char)$/", $word) === 1
            || preg_match("/(?:$char){2,}/", $word) === 1;
    } // function is_utf8
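The function above only looks for three-byte sequences typical of Chinese text. Where a strict well-formedness check is wanted instead, one common alternative is to let PCRE validate the subject through the /u modifier. This is a different technique from the article's, sketched here under that assumption. The name is_valid_utf8() is hypothetical.

```php
<?php
// Hypothetical helper, not from the article: strict well-formedness check.
// With the /u modifier PCRE validates the subject as UTF-8 first, and
// preg_match() returns false (not 0) when that validation fails.
function is_valid_utf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

var_dump(is_valid_utf8("plain ascii"));      // bool(true)
var_dump(is_valid_utf8("\xE4\xB8\xAD"));     // bool(true): three-byte sequence
var_dump(is_valid_utf8("\xD6\xD0\xCE\xC4")); // bool(false): 0xD0 is not a continuation byte
```

A strict check like this still accepts short inputs such as the article's sample, because those bytes are well-formed UTF-8. It answers "could this be UTF-8?", not "was this written as UTF-8?".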