PHP mb_strlen function returns different results for different specified encodings
This post was last edited by lylgxy2007wht at 2013-04-02 11:37:02
The mb_strlen function returns different results depending on which encoding I pass to it. Why?
Page encoding UTF-8:
$text = "啊啊啊啊";
echo mb_strlen($text, 'utf8')."\n";
echo mb_strlen($text, 'gbk')."\n";
echo mb_strlen($text, 'gb2312')."\n";
echo strlen($text);
Output: 4 6 8 12
Page encoding GB2312:
$text = "啊啊啊啊";
echo mb_strlen($text, 'utf8')."\n";
echo mb_strlen($text, 'gbk')."\n";
echo mb_strlen($text, 'gb2312')."\n";
echo strlen($text);
Output: 4 4 4 8
------ Solution --------------------
This is the list of supported character encodings in the official PHP manual:
http://www.php.net/manual/en/mbstring.supported-encodings.php
mb_internal_encoding("UTF-8");
echo mb_internal_encoding();
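To show why the internal encoding matters here: when mb_strlen is called without its second argument, it falls back to the internal encoding. A minimal sketch, assuming the mbstring extension is loaded; the test string is built from the UTF-8 bytes of "啊啊啊啊" so the example does not depend on how this file itself is saved.

```php
<?php
// "啊啊啊啊" as UTF-8 bytes (E5 95 8A repeated four times).
$text = hex2bin(str_repeat('e5958a', 4));

mb_internal_encoding('UTF-8');

$implicit = mb_strlen($text);          // no encoding given: uses the internal encoding
$explicit = mb_strlen($text, 'UTF-8'); // encoding named directly

echo $implicit, ' ', $explicit, "\n";
```

Passing the encoding explicitly on every call is probably the safer habit, since the result then no longer depends on a global setting.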
------ Solution --------------------
The hexadecimal representation of "啊啊啊啊" is
UTF-8: E5 95 8A E5 95 8A E5 95 8A E5 95 8A --- 12 bytes
GB2312: B0 A1 B0 A1 B0 A1 B0 A1 --- 8 bytes
When the page is UTF-8, the 12 bytes are grouped like this:
utf8: [E5 95 8A] [E5 95 8A] [E5 95 8A] [E5 95 8A] --- 4
gbk: [E5 95] [8A E5] [95 8A] [E5 95] [8A E5] [95 8A] --- 6
gb2312: [E5 95] [8A] [E5 95] [8A] [E5 95] [8A] [E5 95] [8A] --- 8
Note: 8A is not a valid gb2312 lead byte (the minimum is A1), so it is counted as a character on its own.
When the page is GB2312, the 8 bytes are grouped like this:
utf8: (not sure) no UTF-8 character starts with the byte B0, so my guess is that mb "intelligently" counts in double bytes --- 4
gbk/gb2312: [B0 A1] [B0 A1] [B0 A1] [B0 A1] --- 4
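The grouping above can be reproduced with a few lines of plain PHP. This is only a sketch of the idea, not how mbstring is actually implemented: count_by_lead_byte is a hypothetical helper, and the lead-byte ranges (81 to FE for gbk, A1 to FE for gb2312) are assumptions that match the note above. It uses no mbstring functions.

```php
<?php
// Hypothetical helper, not part of mbstring: counts "characters" by walking
// the bytes and using a lead-byte range to decide how far to advance.
function count_by_lead_byte(string $bytes, int $leadMin, int $leadMax): int
{
    $count = 0;
    $i = 0;
    $n = strlen($bytes);
    while ($i < $n) {
        $b = ord($bytes[$i]);
        // A byte inside the lead range consumes itself plus the next byte;
        // any other byte is counted as a single-byte character.
        $i += ($b >= $leadMin && $b <= $leadMax) ? 2 : 1;
        $count++;
    }
    return $count;
}

$utf8Bytes = hex2bin(str_repeat('e5958a', 4)); // "啊啊啊啊" stored as UTF-8
$gbBytes   = hex2bin(str_repeat('b0a1', 4));   // the same text stored as GB2312

echo count_by_lead_byte($utf8Bytes, 0x81, 0xFE), "\n"; // gbk-style range
echo count_by_lead_byte($utf8Bytes, 0xA1, 0xFE), "\n"; // gb2312-style range
echo count_by_lead_byte($gbBytes, 0xA1, 0xFE), "\n";   // correctly matched bytes
```

With these assumed ranges the three calls give 6, 8 and 4, the same numbers as the bracket groupings.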
------ Solution --------------------
Test results (PHP 5.4.12)
Under UTF-8: 4 6 8 12
Under GB2312: 8 4 4 8
There is not much to explain: you only get the correct result when you specify the correct character set.
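One way to guard against passing the wrong character set is to validate the bytes before counting. A minimal sketch, assuming the mbstring extension is loaded; safe_mb_strlen is a hypothetical helper name, and 'EUC-CN' is used as the mbstring name for gb2312.

```php
<?php
// Hypothetical helper: returns the length only when the bytes are valid in
// the named encoding, and null otherwise.
function safe_mb_strlen(string $text, string $encoding): ?int
{
    if (!mb_check_encoding($text, $encoding)) {
        return null; // the bytes do not form valid text in this encoding
    }
    return mb_strlen($text, $encoding);
}

$text = hex2bin(str_repeat('e5958a', 4)); // "啊啊啊啊" as UTF-8 bytes

var_dump(safe_mb_strlen($text, 'UTF-8'));  // matching encoding
var_dump(safe_mb_strlen($text, 'EUC-CN')); // mismatched encoding
```

This kind of check does not catch every mismatch: some byte sequences happen to be valid in more than one encoding, so knowing the real encoding of the data is still required.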