In PHP, the strlen() function returns the length of a string in bytes. Each ASCII letter, digit, and symbol occupies one byte, so each has a length of 1. This article compares the strlen() and mb_strlen() functions.
The function prototype of strlen() is as follows:
The code is as follows:
int strlen(string $string_input);
The string_input parameter is the string to be processed.
The strlen() function returns the length of a string in bytes. Each ASCII letter, digit, and symbol occupies one byte, so its length is 1. A Chinese character is commonly assumed to occupy two bytes, which would make its length 2. For example:
The code is as follows:
<?php
echo strlen("www.sunchis.com");
echo strlen("三知开发网"); // "Sanzhi Development Network", five Chinese characters
?>
echo strlen("www.sunchis.com"); running result: 15
echo strlen("三知开发网"); running result: 15
This raises a question: isn't a Chinese character 2 bytes? The string "三知开发网" (Sanzhi Development Network) clearly contains five Chinese characters, so how can the result be 15?
The reason is that strlen() counts bytes, and a Chinese character encoded in UTF-8 occupies 3 bytes, so 5 × 3 = 15. How, then, can we accurately calculate the length of a string that mixes Chinese and English? For this we need another function, mb_strlen(). It is used in almost the same way as strlen(), but it accepts an additional parameter that specifies the character encoding. Function prototype:
The code is as follows:
int mb_strlen(string $string_input, string $encoding);
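As a minimal sketch of the difference between the two calls, the same string can be measured both ways. This assumes the source file is saved as UTF-8 and the mbstring extension is available; the variable names are only illustrative.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is loaded.
$sample = "三知开发网"; // five Chinese characters

$bytes = strlen($sample);             // counts bytes
$chars = mb_strlen($sample, "UTF-8"); // counts characters in the given encoding

echo $bytes, "\n"; // 15 (5 characters x 3 bytes each)
echo $chars, "\n"; // 5
```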
PHP's built-in strlen() function does not count Chinese characters as such; it only returns the number of bytes a string occupies. With GB2312 encoding, strlen() returns twice the number of Chinese characters, and with UTF-8 encoding it returns three times the number (in UTF-8, a Chinese character occupies 3 bytes). The following code therefore calculates the length of a mixed Chinese and English string, counting each Chinese character as 2 and each ASCII character as 1:
The code is as follows:
$ Str = "Sanzhi sunchis Development Network ";
Echo strlen ($ str )."
"; // Result: 22
Echo mb_strlen ($ str, "UTF8 ")."
"; // Result: 12
$ Strlen = (strlen ($ str) + mb_strlen ($ str, "UTF8")/2;
Echo $ strlen; // Result: 17
?>
Principle analysis:
Strlen () calculation, the length of the Chinese characters for UTF-8 is 3, so the length of "three known sunchis Development Network" is 5 × 3 + 7 × 1 = 22
When mb_strlen is calculated, if the selected inner code is UTF8, a Chinese character will be calculated as the length of 1. Therefore, the length of "Sanzhi sunchis Development Network" is 5x1 + 7x1 = 12.
The rest is purely a mathematical problem, so I won't be so embarrassed here ......
Note: For mb_strlen($str, 'UTF-8'), if the second parameter is omitted, PHP's internal encoding is used. The internal encoding can be obtained through the mb_internal_encoding() function. Also note that mb_strlen() is not a PHP core function; it is provided by the mbstring extension. Before using it, make sure the extension is loaded: check that the line "extension=php_mbstring.dll" exists in php.ini and is not commented out. Otherwise, calls to it may fail with an undefined function error.
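A defensive check along these lines can be sketched as below. It relies on extension_loaded() and mb_internal_encoding(); the helper name safe_length and its fallback to the byte count are assumptions made for illustration, not something the article prescribes.

```php
<?php
// Hypothetical helper: use mb_strlen() when mbstring is available,
// otherwise fall back to the byte length from strlen().
function safe_length(string $str, string $encoding = "UTF-8"): int
{
    if (extension_loaded("mbstring")) {
        return mb_strlen($str, $encoding);
    }
    // Without mbstring only the byte count is available.
    return strlen($str);
}

if (extension_loaded("mbstring")) {
    // Called with no argument, mb_internal_encoding() returns the current internal encoding.
    echo "Internal encoding: ", mb_internal_encoding(), "\n";
}

echo safe_length("sunchis"), "\n"; // 7 either way, since the string is pure ASCII
```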