mb_strlen returns different results depending on the encoding specified. Could an expert explain why?
Page encoding: UTF-8
$text = "啊啊啊啊";
echo mb_strlen($text, 'UTF8') . "\n";
echo mb_strlen($text, 'GBK') . "\n";
echo mb_strlen($text, 'gb2312') . "\n";
echo strlen($text);
Output: 4 6 8 12
Page encoding: GB2312
$text = "啊啊啊啊";
echo mb_strlen($text, 'UTF8') . "\n";
echo mb_strlen($text, 'GBK') . "\n";
echo mb_strlen($text, 'gb2312') . "\n";
echo strlen($text);
Output: 4 4 4 8
Reply to discussion (solution)
If you specify the wrong encoding, you will not get the right result.
This is the list of supported character encodings on the PHP website:
http://www.php.net/manual/en/mbstring.supported-encodings.php
mb_internal_encoding("UTF-8");
echo mb_internal_encoding();
The bytes of "啊啊啊啊" in hexadecimal are:
UTF-8: E5 95 8A E5 95 8A E5 95 8A E5 95 8A --- 12
GB2312: B0 A1 B0 A1 B0 A1 B0 A1 --- 8
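The byte listings above can be reproduced with a short sketch. It assumes the test string is four copies of 啊 (U+554A) and that the mbstring extension is loaded; the string is written as explicit byte escapes so the result does not depend on how the script file itself is saved.

```php
<?php
// Assumption: the test string is four copies of U+554A, given here as
// explicit UTF-8 bytes so the file's own encoding does not matter.
$utf8 = str_repeat("\xE5\x95\x8A", 4);

// Re-encode the same text as GB2312 (needs the mbstring extension).
$gb2312 = mb_convert_encoding($utf8, 'GB2312', 'UTF-8');

// Format raw bytes as space-separated uppercase hex pairs.
$utf8Hex = strtoupper(implode(' ', str_split(bin2hex($utf8), 2)));
$gbHex   = strtoupper(implode(' ', str_split(bin2hex($gb2312), 2)));

echo "UTF-8:  $utf8Hex --- " . strlen($utf8) . "\n";
echo "GB2312: $gbHex --- " . strlen($gb2312) . "\n";
```

strlen always counts bytes, which is why it reports 12 for the UTF-8 form and 8 for the GB2312 form of the same four characters.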
When the page is encoded as UTF-8:
UTF-8 [E5 95 8A] [E5 95 8A] [E5 95 8A] [E5 95 8A] --- 4
GBK [E5 95]? [8A E5]? [95 8A]? [E5 95]? [8A E5]? [95 8A]? --- 6
GB2312 [E5 95]? [8A] [E5 95]? [8A] [E5 95]? [8A] [E5 95]? [8A] --- 8
Note: 8A is not a valid lead byte in GB2312 (the minimum is A1), so it is counted on its own.
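The grouping described above can be imitated with a simplified model. This is only a sketch of the reply's explanation, not mbstring's actual implementation: count_double_byte_units is a hypothetical helper, and the lead-byte ranges (0x81 to 0xFE for GBK, 0xA1 to 0xFE for GB2312) are assumptions chosen to match the note above.

```php
<?php
// Hypothetical helper: walk the bytes and treat any byte inside
// [$leadMin, $leadMax] as the first byte of a two-byte unit; every
// other byte counts as a unit on its own.
function count_double_byte_units(string $bytes, int $leadMin, int $leadMax): int
{
    $count = 0;
    $i = 0;
    $len = strlen($bytes);
    while ($i < $len) {
        $b = ord($bytes[$i]);
        $i += ($b >= $leadMin && $b <= $leadMax) ? 2 : 1;
        $count++;
    }
    return $count;
}

$utf8Bytes = str_repeat("\xE5\x95\x8A", 4); // UTF-8 bytes of the test string
$gbBytes   = str_repeat("\xB0\xA1", 4);     // GB2312 bytes of the test string

echo count_double_byte_units($utf8Bytes, 0x81, 0xFE), "\n"; // GBK rule on UTF-8 bytes: 6
echo count_double_byte_units($utf8Bytes, 0xA1, 0xFE), "\n"; // GB2312 rule on UTF-8 bytes: 8
echo count_double_byte_units($gbBytes, 0x81, 0xFE), "\n";   // GBK rule on GB2312 bytes: 4
echo count_double_byte_units($gbBytes, 0xA1, 0xFE), "\n";   // GB2312 rule on GB2312 bytes: 4
```

The model gives 6 and 8 for the UTF-8 bytes because 8A falls inside the assumed GBK lead range but outside the GB2312 one.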
When the page is encoded as GB2312:
UTF-8 (indeterminate): no UTF-8 character starts with the byte B0, so my guess is that mbstring "intelligently" counts in double bytes --- 4
GBK/GB2312 [B0 A1] [B0 A1] [B0 A1] [B0 A1] --- 4
Measured results (PHP 5.4.12)
Utf-8 4 6 8 12
GB2312 8 4 4 8
No further explanation is needed: you only get the correct result when you specify the correct character set.
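One way to act on this is to validate the bytes against the encoding you intend to pass before counting. The sketch below assumes the mbstring extension is loaded; char_count is a hypothetical helper name, not part of PHP.

```php
<?php
// Hypothetical helper: refuse to count when the bytes are not valid in
// the encoding the caller claims, instead of returning a misleading number.
function char_count(string $bytes, string $encoding): int
{
    if (!mb_check_encoding($bytes, $encoding)) {
        throw new InvalidArgumentException("Bytes are not valid $encoding");
    }
    return mb_strlen($bytes, $encoding);
}

$utf8Text = str_repeat("\xE5\x95\x8A", 4); // test string as UTF-8 bytes
$gbText   = str_repeat("\xB0\xA1", 4);     // same string as GB2312 bytes

echo char_count($utf8Text, 'UTF-8'), "\n";  // 4
echo char_count($gbText, 'GB2312'), "\n";   // 4

try {
    // GB2312 bytes passed off as UTF-8 are rejected rather than miscounted.
    char_count($gbText, 'UTF-8');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}
```

A check like this only catches byte sequences that are invalid in the claimed encoding. Bytes that happen to be valid in both encodings would still pass, so knowing the real source encoding remains the reliable fix.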