The difference between the strlen and mb_strlen functions in PHP

Source: Internet
Author: User

In PHP, strlen and mb_strlen both return the length of a string, but beginners who have not read the manual may not know how they differ.
First, look at an example:


<?php

$str = '中文a字1符';

echo strlen($str) . '<br>'; // 14

echo mb_strlen($str, 'UTF8') . '<br>'; // 6

echo mb_strlen($str, 'GBK') . '<br>'; // 8

echo mb_strlen($str, 'gb2312') . '<br>'; // 10
?>

Analysis: strlen counts bytes, and a Chinese character encoded in UTF-8 takes 3 bytes, so the example string (four Chinese characters plus the ASCII characters "a" and "1") has a length of 3*4+2=14. mb_strlen, when UTF8 is selected as the encoding, counts each Chinese character as one unit of length, so the same string has a length of 6.
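The byte arithmetic above can be checked character by character. The following is a minimal sketch, assuming PHP 7.4 or later (for mb_str_split) with the mbstring extension loaded. The string is written with \u{...} escapes so that its bytes are UTF-8 no matter how the source file is saved.

```php
<?php
// The example string: four Chinese characters plus "a" and "1",
// written as Unicode escapes so the bytes are UTF-8 regardless of file encoding.
$str = "\u{4E2D}\u{6587}a\u{5B57}1\u{7B26}";

// Split into characters (not bytes) and report the byte length of each one.
foreach (mb_str_split($str, 1, 'UTF-8') as $char) {
    echo $char, ' => ', strlen($char), " byte(s)\n";
}

echo 'bytes: ', strlen($str), "\n";                  // 3 * 4 + 2 = 14
echo 'characters: ', mb_strlen($str, 'UTF-8'), "\n"; // 6
```

Each Chinese character reports 3 bytes and each ASCII character reports 1, which is where the total of 14 comes from.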

Used together, these two functions can calculate the number of positions occupied by a string that mixes Chinese and English, where a Chinese character occupies 2 positions and an English character occupies 1.


echo (strlen($str) + mb_strlen($str, 'UTF8')) / 2;

For example, for the string above, strlen($str) is 14 and mb_strlen($str, 'UTF8') is 6, so the string occupies (14 + 6) / 2 = 10 positions.
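The formula works because, in UTF-8, each Chinese character adds 3 to strlen and 1 to mb_strlen (4 in total, halved to 2), while each ASCII character adds 1 to both (2 in total, halved to 1). A minimal sketch follows. The function name displayWidth is a hypothetical helper, not from the original, and it assumes the string contains only ASCII characters and 3-byte UTF-8 characters. The mbstring function mb_strwidth is shown alongside for comparison.

```php
<?php
/**
 * Hypothetical helper: positions occupied by a UTF-8 string,
 * counting 2 per Chinese character and 1 per ASCII character.
 * Assumes every non-ASCII character is exactly 3 bytes in UTF-8.
 */
function displayWidth(string $str): int
{
    return intdiv(strlen($str) + mb_strlen($str, 'UTF-8'), 2);
}

// Four Chinese characters plus "a" and "1", as Unicode escapes.
$str = "\u{4E2D}\u{6587}a\u{5B57}1\u{7B26}";

echo displayWidth($str), "\n";         // (14 + 6) / 2 = 10
echo mb_strwidth($str, 'UTF-8'), "\n"; // width as computed by mbstring
```

For strings that may contain 2-byte or 4-byte UTF-8 characters, the halving trick no longer holds, so mb_strwidth is probably the safer choice there.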

PHP's built-in strlen function does not handle Chinese strings correctly: it returns only the number of bytes in the string. For GB2312-encoded Chinese, strlen returns twice the number of Chinese characters; for UTF-8 encoded Chinese, it returns three times the number (in UTF-8, a Chinese character takes 3 bytes).
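The 2-byte versus 3-byte difference can be seen by converting the same text between encodings. This is a minimal sketch, assuming the mbstring extension is loaded; the variable names are illustrative only.

```php
<?php
// Two Chinese characters, guaranteed to be UTF-8 via Unicode escapes.
$utf8 = "\u{4E2D}\u{6587}";

// Re-encode the same text as GBK (requires the mbstring extension).
$gbk = mb_convert_encoding($utf8, 'GBK', 'UTF-8');

echo strlen($utf8), "\n";             // 6: 3 bytes per character in UTF-8
echo strlen($gbk), "\n";              // 4: 2 bytes per character in GBK
echo mb_strlen($utf8, 'UTF-8'), "\n"; // 2
echo mb_strlen($gbk, 'GBK'), "\n";    // 2
```

The character count stays at 2 as long as the encoding passed to mb_strlen matches the actual encoding of the bytes.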

Addendum:

The code is as follows (the source file is saved as UTF-8):

<?php

$str1 = 'www.111cn.net';
$str2 = 'Misty rain net';
$str3 = 'Misty rain net 111cn.net';

// No encoding argument: the internal encoding is used
echo mb_strlen($str1) . '<br>'; // Result 15
echo mb_strlen($str2) . '<br>'; // Result 6
echo mb_strlen($str3) . '<br>'; // Result 17

echo '--------1-------------<br>';

echo strlen($str1) . '<br>'; // Result 15
echo strlen($str2) . '<br>'; // Result 6
echo strlen($str3) . '<br>'; // Result 17

echo '--------utf-8-------------<br>';

echo mb_strlen($str1, 'utf-8') . '<br>'; // Result 15
echo mb_strlen($str2, 'utf-8') . '<br>'; // Result 3
echo mb_strlen($str3, 'utf-8') . '<br>'; // Result 14

echo '--------gbk-------------<br>';

echo mb_strlen($str1, 'GBK') . '<br>'; // Result 15
echo mb_strlen($str2, 'GBK') . '<br>'; // Result 5
echo mb_strlen($str3, 'GBK') . '<br>'; // Result 15

echo '--------gb2312-------------<br>';

echo mb_strlen($str1, 'gb2312') . '<br>'; // Result 15
echo mb_strlen($str2, 'gb2312') . '<br>'; // Result 5
echo mb_strlen($str3, 'gb2312') . '<br>'; // Result 16

?>
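When mb_strlen is called without a second argument, it uses the internal encoding reported by mb_internal_encoding(). With a single-byte internal encoding it counts bytes, just like strlen. That is one possible explanation for why the first two groups of results above match, though it is an assumption about the test environment rather than something the tests confirm. A minimal sketch, assuming the mbstring extension is loaded:

```php
<?php
$str = "\u{4E2D}\u{6587}"; // two Chinese characters, 6 bytes in UTF-8

// With a single-byte internal encoding, every byte counts as one character.
mb_internal_encoding('ISO-8859-1');
echo mb_strlen($str), "\n"; // 6, same as strlen($str)

// With UTF-8 as the internal encoding, characters are counted instead.
mb_internal_encoding('UTF-8');
echo mb_strlen($str), "\n"; // 2
```

Passing the encoding explicitly, as the later groups of calls do, avoids depending on this setting.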

So far, I have reached only the following conclusions:

1. For English letters, strlen and mb_strlen are interchangeable: whatever the encoding, the two functions return the same result.

2. For Chinese, the encoding affects the reported length; even GBK and GB2312 behave differently.

3. The code above must be saved as UTF-8, otherwise the results will differ from mine; when the file uses another encoding, such as ANSI, the values change. Also note that mb_strlen is not a PHP core function and requires the mbstring extension to be loaded. The results for an ANSI-encoded file are as follows:

(The source file is saved as ANSI)

<?php

$str1 = 'www.111cn.net';
$str2 = 'Misty rain net';
$str3 = 'Misty rain net 111cn.net';

// No encoding argument: the internal encoding is used
echo mb_strlen($str1) . '<br>'; // Result 15
echo mb_strlen($str2) . '<br>'; // Result 6
echo mb_strlen($str3) . '<br>'; // Result 17

echo '--------1-------------<br>';

echo strlen($str1) . '<br>'; // Result 15
echo strlen($str2) . '<br>'; // Result 6
echo strlen($str3) . '<br>'; // Result 17

echo '--------utf-8-------------<br>';

echo mb_strlen($str1, 'utf-8') . '<br>'; // Result 15
echo mb_strlen($str2, 'utf-8') . '<br>'; // Result 3
echo mb_strlen($str3, 'utf-8') . '<br>'; // Result 14

echo '--------gbk-------------<br>';

echo mb_strlen($str1, 'GBK') . '<br>'; // Result 15
echo mb_strlen($str2, 'GBK') . '<br>'; // Result 3
echo mb_strlen($str3, 'GBK') . '<br>'; // Result 14

echo '--------gb2312-------------<br>';

echo mb_strlen($str1, 'gb2312') . '<br>'; // Result 15
echo mb_strlen($str2, 'gb2312') . '<br>'; // Result 3
echo mb_strlen($str3, 'gb2312') . '<br>'; // Result 14

?>
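Because mb_strlen is provided by the mbstring extension rather than the PHP core, code that must run where the extension may be missing can check for it first. The following is a minimal sketch. utf8Length is a hypothetical helper name, and the fallback (counting matches of the PCRE pattern /./su) is one possible substitute, not part of the original tests.

```php
<?php
/**
 * Hypothetical helper: number of characters in a UTF-8 string.
 * Uses mb_strlen when mbstring is loaded, otherwise counts code points with PCRE.
 */
function utf8Length(string $str): int
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }

    // The "u" modifier makes "." match one UTF-8 code point;
    // the "s" modifier lets "." match newlines as well.
    $count = preg_match_all('/./su', $str);
    if ($count === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    return $count;
}

// Four Chinese characters plus "a" and "1", as Unicode escapes.
echo utf8Length("\u{4E2D}\u{6587}a\u{5B57}1\u{7B26}"), "\n"; // 6
```

The helper always treats its input as UTF-8, so a string in GBK or another encoding would need to be converted first.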

In addition, I am still testing how each encoding affects the reported character length. I am posting my test results here; if you know the rule behind them, please tell me. Thank you!
