Differences between strlen() and mb_strlen() in PHP

This article explains the differences between strlen() and mb_strlen(), the two PHP functions used to calculate the length of a string. When all the characters are English characters, the two return the same result, so the comparison here focuses on strings that mix Chinese and English characters.

Let's look at an example:

The code is as follows:
<?php
// For this test, the file is saved as UTF-8.
$str = '中文a字1符';
echo strlen($str) . "\n";               // 14
echo mb_strlen($str, 'utf8') . "\n";    // 6
echo mb_strlen($str, 'gbk') . "\n";     // 8
echo mb_strlen($str, 'gb2312') . "\n";  // 10

Result analysis: strlen counts bytes, and a Chinese character takes 3 bytes in UTF-8, so the length of "中文a字1符" is 3*4 + 2 = 14. mb_strlen counts characters: when the specified encoding is UTF8, each Chinese character counts as 1, so the length of the same string is 6. With 'gbk' or 'gb2312', the UTF-8 bytes are interpreted under the wrong encoding, which is why those results (8 and 10) match neither the byte count nor the character count.
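The byte arithmetic can be checked character by character. The following is a minimal sketch, assuming PHP 7.4 or later (for mb_str_split), a loaded mbstring extension, and a source file saved as UTF-8; the variable names are only illustrative.

```php
<?php
// Assumes this file is saved as UTF-8 and mbstring is loaded.
$str = '中文a字1符';

$total = 0;
// mb_str_split() (PHP 7.4+) splits the string into characters, not bytes.
foreach (mb_str_split($str, 1, 'UTF-8') as $char) {
    $bytes = strlen($char);   // number of bytes used by this one character
    $total += $bytes;
    echo $char, ' => ', $bytes, PHP_EOL;
}

echo 'total bytes: ', $total, PHP_EOL;                   // 14, same as strlen($str)
echo 'characters: ', mb_strlen($str, 'UTF-8'), PHP_EOL;  // 6
```

Each of the four Chinese characters reports 3 bytes and each ASCII character reports 1, which gives 3*4 + 2 = 14.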

Combined, the two functions can be used to calculate the placeholder value (display width) of a string that mixes Chinese and English, where a Chinese character has a placeholder value of 2 and an English character has a placeholder value of 1.
The code is as follows:
echo (strlen($str) + mb_strlen($str, 'utf8')) / 2;

For example, for "中文a字1符" the strlen($str) value is 14 and the mb_strlen($str, 'utf8') value is 6, so the placeholder value is (14 + 6) / 2 = 10.
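The formula works for this sample because every non-ASCII character in it takes exactly 3 bytes in UTF-8; that is a property of the example, not a general guarantee. As a sketch, the hypothetical helper placeholder_width() below wraps the formula, and mbstring's mb_strwidth() is shown next to it for comparison, since it counts full-width characters as 2 directly.

```php
<?php
// Hypothetical helper (not part of PHP): applies the formula from the text.
// Only valid when every non-ASCII character occupies 3 bytes in UTF-8.
function placeholder_width(string $str): int
{
    return intdiv(strlen($str) + mb_strlen($str, 'UTF-8'), 2);
}

$str = '中文a字1符';   // file must be saved as UTF-8

echo placeholder_width($str), PHP_EOL;      // 10
echo mb_strwidth($str, 'UTF-8'), PHP_EOL;   // 10
```

For text that may contain other multi-byte characters, mb_strwidth() is probably the safer choice because it does not depend on the 3-byte assumption.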

PHP's built-in string length function, strlen, does not count Chinese characters correctly. It only returns the number of bytes that the string occupies.

For Chinese text encoded in GB2312, strlen returns twice the number of Chinese characters; for Chinese text encoded in UTF-8, it returns three times that number, because a Chinese character occupies 3 bytes in UTF-8.
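The difference between the two encodings can be observed by re-encoding the same text and counting bytes. This is a minimal sketch, assuming the mbstring extension is loaded, the source file is saved as UTF-8, and 'GB2312' is accepted as an encoding name by mb_convert_encoding(); the variable names are illustrative.

```php
<?php
$utf8 = '中文a字1符';   // UTF-8 bytes, because the file is saved as UTF-8

// Re-encode the same text as GB2312 (target encoding first, source second).
$gb = mb_convert_encoding($utf8, 'GB2312', 'UTF-8');

echo strlen($utf8), PHP_EOL;             // 14 bytes: 4 * 3 + 2
echo strlen($gb), PHP_EOL;               // 10 bytes: 4 * 2 + 2
echo mb_strlen($gb, 'GB2312'), PHP_EOL;  // 6 characters
```

The character count stays at 6 as long as the encoding passed to mb_strlen matches the encoding the bytes are actually in.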

The mb_strlen function solves this problem, because it counts characters rather than bytes.

The usage of mb_strlen is similar to that of strlen, except that it has a second optional parameter for specifying character encoding.

For example, mb_strlen($str, 'utf-8') returns the length of $str interpreted as UTF-8. If the second parameter is omitted, PHP's internal encoding is used. The internal encoding can be obtained through the mb_internal_encoding() function.
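The effect of the internal encoding can be sketched as follows, assuming mbstring is loaded and the file is saved as UTF-8. The snippet reads the current setting, sets it to UTF-8 explicitly, and then calls mb_strlen without the second parameter.

```php
<?php
$str = '中文a字1符';

$previous = mb_internal_encoding();   // no argument: returns the current encoding name
echo $previous, PHP_EOL;

mb_internal_encoding('UTF-8');        // with an argument: sets the internal encoding
$len = mb_strlen($str);               // second parameter omitted, internal encoding applies
echo $len, PHP_EOL;                   // 6

mb_internal_encoding($previous);      // restore the earlier setting
```

Passing the encoding explicitly on every call avoids depending on a global setting that other code may change.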

Note: mb_strlen is not a PHP core function; it is provided by the mbstring extension. Before using it, make sure that php_mbstring.dll is loaded in php.ini.
The line "extension=php_mbstring.dll" must exist and must not be commented out. Otherwise, calling mb_strlen fails with an undefined function error.
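Whether the extension is available can also be checked at run time. The sketch below uses a hypothetical helper, utf8_length(), which is not part of PHP: it calls mb_strlen when it exists and otherwise falls back to counting matches of a UTF-8 aware regular expression. It assumes the input is UTF-8 and that PCRE was built with Unicode support.

```php
<?php
// Hypothetical helper: character count for UTF-8 text, with or without mbstring.
function utf8_length(string $str): int
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }
    // Fallback: with the /u modifier "." matches one UTF-8 character,
    // and /s lets it match newlines too.
    $count = preg_match_all('/./us', $str);
    if ($count === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    return $count;
}

var_dump(extension_loaded('mbstring'));   // true when the extension is loaded
echo utf8_length('中文a字1符'), PHP_EOL;   // 6
```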

