PHP string length calculation: strlen() function usage

Source: Internet
Author: User

strlen() and mb_strlen() Functions

In PHP, the strlen() function returns the length of a string. The function prototype is as follows:
int strlen(string $string_input);

The string_input parameter is the string to be processed.

The strlen() function returns the length of a string in bytes. Each ASCII letter, digit, and symbol occupies one byte, so each such character has a length of 1. In a double-byte encoding such as GB2312, a Chinese character occupies two bytes, so its length is 2. For example:
<?php
echo strlen("www.sunchis.com");
echo strlen("三知开发网");
?>

Running result of echo strlen("www.sunchis.com"): 15

Running result of echo strlen("三知开发网"): 15

This raises a question: isn't a Chinese character 2 bytes? The string "三知开发网" (Sanzhi Development Network) clearly contains five Chinese characters, so how can the result be 15?

The reason is that strlen() counts bytes, and in UTF-8 each Chinese character occupies 3 bytes, so it is counted as length 3. How, then, can we accurately calculate the length of a string that mixes Chinese and English? This is where another function, mb_strlen(), comes in. Its usage is almost the same as that of strlen(), except that it takes an additional parameter specifying the character encoding. Function prototype:
int mb_strlen(string $string_input, string $encode);
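To make the difference between the two functions concrete, here is a minimal sketch that measures the five Chinese characters 三知开发网 in three ways. It assumes the file is saved as UTF-8 and that the mbstring extension is loaded; the variable names are only illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is available.
$text = "三知开发网"; // five Chinese characters

// Byte count: each of these characters takes 3 bytes in UTF-8.
$utf8Bytes = strlen($text);

// Character count: the encoding is passed explicitly.
$chars = mb_strlen($text, "UTF-8");

// The same text converted to GB2312 takes 2 bytes per character.
$gbBytes = strlen(mb_convert_encoding($text, "GB2312", "UTF-8"));

echo $utf8Bytes, "\n"; // 15
echo $chars, "\n";     // 5
echo $gbBytes, "\n";   // 10
```

The byte count depends on the encoding of the string, while the character count from mb_strlen() does not, provided the encoding argument matches the actual encoding of the data.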

PHP's built-in string length function strlen() does not count Chinese characters as characters; it only returns the number of bytes a string occupies. For GB2312-encoded Chinese, strlen() returns twice the number of Chinese characters, and for UTF-8-encoded Chinese it returns three times the number (in UTF-8, a Chinese character occupies 3 bytes). By combining strlen() with mb_strlen(), the following code calculates the length of a mixed Chinese and English string, counting each Chinese character as 2 and each single-byte character as 1:
Copy codeThe Code is as follows:
<? Php
$ Str = "sanzhi sunchis Development Network ";
Echo strlen ($ str). "<br>"; // result: 22
Echo mb_strlen ($ str, "UTF8"). "<br>"; // result: 12
$ Strlen = (strlen ($ str) + mb_strlen ($ str, "UTF8")/2;
Echo $ strlen; // result: 17
?>

Principle Analysis:

Strlen () calculation, the length of the Chinese characters for UTF-8 is 3, so the length of "Three known sunchis Development Network" is 5 × 3 + 7 × 1 = 22
When mb_strlen is calculated, if the selected inner code is UTF8, a Chinese character will be calculated as the length of 1. Therefore, the length of "sanzhi sunchis Development Network" is 5x1 + 7x1 = 12.

The rest is simple arithmetic: (22 + 12) / 2 = 17, which is the same as counting each Chinese character as 2 and each single-byte character as 1 (5 × 2 + 7 × 1 = 17).

Note: For mb_strlen($str, 'utf-8'), if the second parameter is omitted, PHP's internal encoding is used. The internal encoding can be obtained with the mb_internal_encoding() function. Also note that mb_strlen() is not a PHP core function. Before using it, make sure the mbstring extension is loaded: in php.ini, the line "extension=php_mbstring.dll" must exist and must not be commented out. Otherwise, an undefined function error may occur.
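A script can also check for the extension at run time instead of failing with an undefined function error. The following is a minimal sketch, assuming a UTF-8 source file; the fallback branch is only illustrative and returns a byte count, not a character count.

```php
<?php
// Check that the mbstring extension is available before relying on mb_strlen().
$hasMbstring = extension_loaded("mbstring") && function_exists("mb_strlen");

if ($hasMbstring) {
    // Called with no argument, mb_internal_encoding() returns the current setting.
    echo "Internal encoding: ", mb_internal_encoding(), "\n";
    // Passing the encoding explicitly avoids depending on that setting.
    $length = mb_strlen("三知开发网", "UTF-8");
} else {
    // Fallback: byte count only, which overcounts multibyte characters.
    $length = strlen("三知开发网");
}

echo $length, "\n";
```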
