Differences between the PHP string length functions strlen and mb_strlen, with an example

Source: Internet
Author: User


Two functions are commonly used in PHP to calculate string length: strlen and mb_strlen. When a string contains only English (single-byte) characters, both return the same value. This article mainly compares their results for strings that mix Chinese and English characters.

In PHP, both strlen and mb_strlen return the length of a string, but beginners who have not read the manual may not know how they differ.
The following example illustrates the difference.

First look at the example:

<?php
// The test file is saved as UTF-8
$str = '中文a字1符'; // four Chinese characters plus "a" and "1"
echo strlen($str) . '<br>';              // 14
echo mb_strlen($str, 'utf8') . '<br>';   // 6
echo mb_strlen($str, 'gbk') . '<br>';    // 8
echo mb_strlen($str, 'gb2312') . '<br>'; // 10
?>

Result analysis: strlen counts bytes, and in UTF-8 each Chinese character occupies 3 bytes, so the length of the test string (four Chinese characters plus "a" and "1") is 3*4 + 2 = 14. mb_strlen counts characters: when the selected encoding is UTF-8, each Chinese character counts as 1, so the length of the same string is 6. The gbk and gb2312 results (8 and 10) come from reading UTF-8 bytes under a different encoding, so they do not match the real character count.

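Together, the two functions can be used to calculate the placeholder value (display width) of a mixed Chinese and English string, where a Chinese character has a placeholder value of 2 and an English character has a placeholder value of 1. Each 3-byte Chinese character contributes 3 + 1 = 4 to the sum of the two lengths and each English character contributes 1 + 1 = 2, so half the sum gives the placeholder value:
echo (strlen($str) + mb_strlen($str, 'utf8')) / 2;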

For example, for the test string above, strlen($str) is 14 and mb_strlen($str, 'utf8') is 6, so its placeholder value is (14 + 6) / 2 = 10.
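
The placeholder calculation can be wrapped in a small function. This is only a sketch: the name placeholderWidth is hypothetical, the test string is assumed to be four Chinese characters plus "a" and "1", and the formula assumes every non-ASCII character takes 3 bytes in UTF-8. For comparison, mbstring also provides mb_strwidth, which reports a display width directly.

```php
<?php
// Hypothetical helper (not from the original article): estimates the
// placeholder value of a UTF-8 string, counting 1 per ASCII character
// and 2 per 3-byte character such as a common Chinese character.
function placeholderWidth(string $str): int
{
    // ASCII character: 1 byte + 1 character = 2, halved to 1.
    // 3-byte character: 3 bytes + 1 character = 4, halved to 2.
    return intdiv(strlen($str) + mb_strlen($str, 'UTF-8'), 2);
}

$str = '中文a字1符'; // assumed test string; this file must be saved as UTF-8

echo placeholderWidth($str), "\n";     // 10
echo mb_strwidth($str, 'UTF-8'), "\n"; // 10
```

For strings that may contain characters of other byte lengths, mb_strwidth is likely the safer choice, since it does not depend on the 3-byte assumption.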

echo mb_internal_encoding();

PHP's built-in string length function strlen does not handle Chinese strings properly: it only returns the number of bytes the string occupies. For Chinese text encoded in GB2312, strlen returns twice the number of Chinese characters; for Chinese text encoded in UTF-8, it returns three times the number (in UTF-8, a Chinese character occupies 3 bytes).
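
The byte counts per encoding can be checked with a short script. This is a sketch under assumptions: the test string is taken to be four Chinese characters plus "a" and "1", the file is saved as UTF-8, and mb_convert_encoding is used here only to produce a GB2312 copy of the same text.

```php
<?php
$utf8 = '中文a字1符'; // assumed test string; this file must be saved as UTF-8

// Re-encode the same text as GB2312, where a Chinese character takes 2 bytes.
$gb2312 = mb_convert_encoding($utf8, 'GB2312', 'UTF-8');

// strlen reports bytes, so the result depends on the encoding.
echo strlen($utf8), "\n";   // 14 = 4 * 3 + 2
echo strlen($gb2312), "\n"; // 10 = 4 * 2 + 2

// mb_strlen with the matching encoding name reports characters.
echo mb_strlen($utf8, 'UTF-8'), "\n";    // 6
echo mb_strlen($gb2312, 'GB2312'), "\n"; // 6
```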

The mb_strlen function solves this problem. Its usage is similar to that of strlen, except that it takes an optional second parameter that specifies the character encoding. For example, mb_strlen($str, 'utf-8') returns the length of $str interpreted as UTF-8. If the second parameter is omitted, PHP's internal encoding is used, which can be obtained with the mb_internal_encoding() function.
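
A minimal sketch of how the internal encoding affects a call without the second parameter follows. It assumes the same test string (four Chinese characters plus "a" and "1") in a UTF-8 file, and it sets the internal encoding explicitly so the result does not depend on the server configuration.

```php
<?php
$str = '中文a字1符'; // assumed test string; this file must be saved as UTF-8

// Remember the current internal encoding so it can be restored afterwards.
$previous = mb_internal_encoding();

// With the internal encoding set to UTF-8, the second parameter can be omitted.
mb_internal_encoding('UTF-8');
$length = mb_strlen($str);
echo $length, "\n"; // 6

// Restore whatever encoding was configured before.
mb_internal_encoding($previous);
```

Passing the encoding explicitly on every call is usually more predictable than relying on the internal encoding, because the latter can differ between environments.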

Note that mb_strlen is not a PHP core function; it is provided by the mbstring extension. Before using it, make sure the extension is loaded: in php.ini, the line "extension=php_mbstring.dll" must exist and must not be commented out. Otherwise, calling mb_strlen fails with an undefined function error.
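
Code that may run where mbstring is not loaded can check for the function first. The sketch below is illustrative only: utf8Length is a hypothetical helper, and the fallback assumes the PCRE extension is available with UTF-8 support.

```php
<?php
// Hypothetical helper (not from the original article): counts the characters
// in a UTF-8 string, using mbstring when it is available.
function utf8Length(string $str): int
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }

    // Fallback: with the "u" modifier each match of "." is one code point;
    // the "s" modifier lets "." match newlines as well.
    $count = preg_match_all('/./su', $str);
    if ($count === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }
    return $count;
}

// Reports whether the mbstring extension is loaded in this environment.
var_dump(extension_loaded('mbstring'));

echo utf8Length('中文a字1符'), "\n"; // 6
```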


Does the length measured by strlen include the string terminator?

In C and C++, strlen does not count the terminating '\0', whereas the result of sizeof on a character array does include it:
char str[] = "This is a test string two";
cout << "str sizeof is: " << sizeof(str) << endl;  // array size, including '\0'
cout << "str strlen is: " << strlen(str) << endl;  // characters before '\0'

Output:
str sizeof is: 26
str strlen is: 25

Let's look at the example above to make it clearer!
