The common functions for calculating string length in PHP are strlen and mb_strlen. When a string consists entirely of English (ASCII) characters, the two return the same result. The main comparison here is how their results differ for strings that mix Chinese and English characters.
Both strlen and mb_strlen return the length of a string, but the difference between them may not be obvious to beginners who have not read the manual.
The following example illustrates the difference between the two:
// The file must be saved as UTF-8 for this test
$str = '中文a字1符';
echo strlen($str) . '<br>'; // 14
echo mb_strlen($str, 'UTF8') . '<br>'; // 6
echo mb_strlen($str, 'GBK') . '<br>'; // 8
echo mb_strlen($str, 'gb2312') . '<br>'; // 10
Result analysis: strlen counts bytes, and each Chinese character in UTF-8 occupies 3 bytes, so the length of "中文a字1符" is 3*4+2=14. With mb_strlen and the encoding set to UTF8, each Chinese character counts as one unit of length, so the length of "中文a字1符" is 6. The GBK and gb2312 results (8 and 10) come from reading UTF-8 bytes under a different encoding, so they are not meaningful character counts and may vary between PHP versions.
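The byte arithmetic above can be checked character by character. The following is a minimal sketch, assuming the file is saved as UTF-8, the mbstring extension is loaded, and PHP 7.4 or later is available (for mb_str_split); the variable name $bytesPerChar is only illustrative.

```php
<?php
// Assumes this file is saved as UTF-8 and mbstring is loaded.
$str = '中文a字1符';

// mb_str_split() (PHP 7.4+) splits by character rather than by byte.
$bytesPerChar = [];
foreach (mb_str_split($str, 1, 'UTF-8') as $char) {
    // strlen() on a single character gives its size in bytes.
    $bytesPerChar[] = strlen($char);
    echo $char, ' => ', strlen($char), " byte(s)\n";
}

// The sum of the per-character sizes equals strlen($str);
// the number of entries equals mb_strlen($str, 'UTF-8').
echo 'total bytes: ', array_sum($bytesPerChar), "\n";
echo 'total characters: ', count($bytesPerChar), "\n";
```

Each Chinese character should report 3 bytes and each ASCII character 1 byte, which is where the 3*4+2 total comes from.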
By combining these two functions, you can calculate the display width of a string that mixes Chinese and English, where a Chinese character occupies 2 positions and an English character occupies 1.
echo (strlen($str) + mb_strlen($str, 'UTF8')) / 2;
For example, for "中文a字1符", strlen($str) is 14 and mb_strlen($str, 'UTF8') is 6, so the string occupies (14 + 6) / 2 = 10 positions.
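The width formula can be wrapped in a small function. This is a sketch only: display_width is a hypothetical helper name, and it assumes the input is valid UTF-8 made up solely of ASCII characters and 3-byte characters such as common Chinese characters. Under that assumption each Chinese character contributes (3 + 1) / 2 = 2 and each ASCII character contributes (1 + 1) / 2 = 1.

```php
<?php
// Hypothetical helper, not part of PHP or mbstring.
// Assumes $str is valid UTF-8 containing only 1-byte (ASCII)
// and 3-byte characters; other byte lengths break the formula.
function display_width(string $str): int
{
    $bytes = strlen($str);              // number of bytes
    $chars = mb_strlen($str, 'UTF-8');  // number of characters
    // Under the assumption above the sum is always even.
    return intdiv($bytes + $chars, 2);
}

echo display_width('中文a字1符'), "\n"; // 10
echo display_width('abc'), "\n";        // 3
```

For strings that may contain 2-byte or 4-byte UTF-8 characters, mbstring's own mb_strwidth function is probably a better fit than this formula.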
The internal encoding currently in effect can be printed with:
echo mb_internal_encoding();
PHP's built-in string length function strlen does not handle Chinese strings correctly; it returns only the number of bytes in the string. For GB2312-encoded Chinese, strlen returns twice the number of Chinese characters, and for UTF-8-encoded Chinese it returns three times the number (in UTF-8, a Chinese character occupies 3 bytes).
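The 2-byte versus 3-byte difference can be seen by converting the same text between encodings. A minimal sketch, assuming the file is saved as UTF-8 and mbstring is loaded; the variable names are illustrative.

```php
<?php
// Assumes this file is saved as UTF-8 and mbstring is loaded.
$utf8 = '中文a字1符';
// Re-encode the same text as GBK (2 bytes per Chinese character).
$gbk = mb_convert_encoding($utf8, 'GBK', 'UTF-8');

echo strlen($utf8), "\n";             // 14 bytes: 4 * 3 + 2
echo strlen($gbk), "\n";              // 10 bytes: 4 * 2 + 2
// With the matching encoding named, the character count is the same.
echo mb_strlen($utf8, 'UTF-8'), "\n"; // 6
echo mb_strlen($gbk, 'GBK'), "\n";    // 6
```

The byte count changes with the encoding, while the character count stays the same as long as mb_strlen is told which encoding the string actually uses.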
The mb_strlen function solves this problem. It is used like strlen, except that it takes an optional second parameter specifying the character encoding. For example, to get the length of the UTF-8 string $str, use mb_strlen($str, 'UTF-8'). If the second argument is omitted, PHP's internal encoding is used. The internal encoding can be obtained with the mb_internal_encoding() function.
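The effect of omitting the second argument can be sketched as follows, assuming the file is saved as UTF-8 and mbstring is loaded. The variable names are illustrative, and the previous internal encoding is restored at the end so that other code is not affected.

```php
<?php
// Assumes this file is saved as UTF-8 and mbstring is loaded.
$str = '中文a字1符';

// Called without arguments, mb_internal_encoding() returns the current setting.
$previous = mb_internal_encoding();

// Called with an argument, it sets the internal encoding.
mb_internal_encoding('UTF-8');

$implicit = mb_strlen($str);          // uses the internal encoding
$explicit = mb_strlen($str, 'UTF-8'); // names the encoding directly

var_dump($implicit === $explicit);    // bool(true)

// Restore whatever was configured before.
mb_internal_encoding($previous);
```

Passing the encoding explicitly is usually the safer choice, since the result then does not depend on server configuration.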
Note that mb_strlen is not a PHP core function. Before using it, make sure the mbstring extension is loaded: on Windows, the line "extension=php_mbstring.dll" must exist in php.ini and must not be commented out. Otherwise, calling mb_strlen fails with an undefined function error.
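Code that may run where mbstring is not available can check for the function before calling it. The following is a sketch, not a definitive implementation: utf8_length is a hypothetical helper name, and the fallback assumes the input is UTF-8 and counts characters with a regular expression from PHP's PCRE functions.

```php
<?php
// Hypothetical helper, not part of PHP or mbstring.
function utf8_length(string $str): int
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }
    // Fallback: with the /u modifier "." matches one UTF-8 character,
    // and /s lets it match newlines too.
    $count = preg_match_all('/./us', $str);
    // preg_match_all() returns false on invalid UTF-8; fall back to bytes.
    return $count === false ? strlen($str) : $count;
}

var_dump(extension_loaded('mbstring')); // true when the extension is loaded
echo utf8_length('中文a字1符'), "\n";   // 6
```

Checking with function_exists avoids the undefined function error at the call site, at the cost of a slower fallback path.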