String length functions: strlen and mb_strlen

Source: Internet
Author: User
Tags: mixed, strlen

The common functions for calculating string length in PHP are strlen and mb_strlen. When every character in the string is an English (ASCII) character, the two return the same result. The main comparison here is between the two results for strings that mix Chinese and English.

In PHP, strlen and mb_strlen both return the length of a string, but beginners who have not read the manual may not be clear about the difference between them.
Here's an example to explain the difference between the two.

First look at the example:

<?php
// Results below assume this file is saved as UTF-8.
$str = '中文a字1符';
echo strlen($str) . '<br>';               // 14
echo mb_strlen($str, 'UTF-8') . '<br>';   // 6
echo mb_strlen($str, 'GBK') . '<br>';     // 8
echo mb_strlen($str, 'gb2312') . '<br>';  // 10
?>

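Result analysis: strlen counts bytes, and each Chinese character in a UTF-8 string takes 3 bytes. The example string $str contains four Chinese characters plus "a" and "1", so its strlen result is 3*4+2=14. With mb_strlen and the encoding set to UTF-8, each Chinese character counts as one, so the length is 6. The GBK and gb2312 results (8 and 10) come from reading UTF-8 bytes under the wrong encoding, so they do not correspond to a meaningful character count.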
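To make the 3*4+2 arithmetic visible, here is a minimal sketch that lists the byte count of each character. The helper name byte_lengths() is hypothetical (not a PHP built-in), and the sketch assumes the source file and the input are UTF-8.

```php
<?php
// Sketch: list how many bytes each character of a UTF-8 string occupies.
// byte_lengths() is a hypothetical helper name, not a PHP built-in.
function byte_lengths(string $str): array
{
    // The "u" modifier makes PCRE treat the subject as UTF-8, so the split
    // happens between characters rather than between bytes.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    $result = [];
    foreach ($chars as $ch) {
        $result[] = [$ch, strlen($ch)]; // strlen() of one character = its byte count
    }
    return $result;
}

foreach (byte_lengths('中文a字1符') as [$ch, $bytes]) {
    echo $ch, ' => ', $bytes, " byte(s)\n";
}
```

The four Chinese characters each report 3 bytes and the two ASCII characters 1 byte, which adds up to the 14 returned by strlen.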

Using these two functions together, you can calculate how many positions a mixed Chinese and English string occupies (a Chinese character occupies 2 positions, an English character 1):
echo (strlen($str) + mb_strlen($str, 'UTF-8')) / 2;

For the example string (four Chinese characters plus "a" and "1"), strlen($str) is 14 and mb_strlen($str, 'UTF-8') is 6, so the string occupies (14+6)/2 = 10 positions.
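The averaging trick can be wrapped in a small function, sketched below. The name display_width() is hypothetical, and the formula only holds under the assumption that every non-ASCII character takes 3 bytes in UTF-8 and should count as 2 positions, which is true for common Chinese characters but not, for example, for 2-byte accented Latin letters. The mbstring function mb_strwidth is shown next to it as a built-in alternative.

```php
<?php
// Sketch of the averaging trick. display_width() is a hypothetical name.
// Assumption: every non-ASCII character is 3 bytes in UTF-8 and 2 positions wide.
function display_width(string $str): int
{
    $bytes = strlen($str);              // 3 per Chinese character, 1 per ASCII character
    $chars = mb_strlen($str, 'UTF-8');  // 1 per character of either kind
    // With c Chinese and a ASCII characters: (3c + a + c + a) / 2 = 2c + a
    return intdiv($bytes + $chars, 2);
}

echo display_width('中文a字1符'), "\n";         // 10
echo mb_strwidth('中文a字1符', 'UTF-8'), "\n";  // 10, using the built-in width function
```

When the input may contain other scripts, mb_strwidth is probably the safer choice, since it does not depend on the 3-byte assumption.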

echo mb_internal_encoding(); // prints the encoding mb_strlen uses when its second argument is omitted

PHP's built-in string length function strlen does not handle Chinese strings correctly: it returns only the number of bytes in the string. For GB2312-encoded Chinese, strlen returns twice the number of Chinese characters, and for UTF-8 encoded Chinese it usually returns three times the number (in UTF-8, most Chinese characters occupy 3 bytes).
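The 2x and 3x factors can be checked by re-encoding the same text, as in this minimal sketch. It assumes the source file is saved as UTF-8 and that the mbstring extension is loaded; the variable names are illustrative.

```php
<?php
// Sketch: the same two characters have different byte lengths per encoding.
$utf8 = '中文';                                         // UTF-8 bytes (assumes a UTF-8 source file)
$gb   = mb_convert_encoding($utf8, 'GB2312', 'UTF-8');  // the same text re-encoded as GB2312

echo strlen($utf8), "\n";             // 6: 3 bytes per character in UTF-8
echo strlen($gb), "\n";               // 4: 2 bytes per character in GB2312
echo mb_strlen($gb, 'GB2312'), "\n";  // 2: the actual character count
```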

The mb_strlen function solves this problem. It is used like strlen, except that it takes an optional second parameter specifying the character encoding. For example, to get the length of the UTF-8 string $str, use mb_strlen($str, 'UTF-8'). If the second argument is omitted, PHP's internal encoding is used. The internal encoding can be obtained with the mb_internal_encoding() function.
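A minimal sketch of the optional second parameter follows. It sets the internal encoding explicitly so that the result does not depend on the server's configuration; it assumes a UTF-8 source file and a loaded mbstring extension.

```php
<?php
// Sketch: omitting the second argument makes mb_strlen use the internal encoding.
$str = '中文a字1符';

mb_internal_encoding('UTF-8');         // set the internal encoding explicitly
echo mb_internal_encoding(), "\n";     // UTF-8
echo mb_strlen($str), "\n";            // 6, because the internal encoding is UTF-8
echo mb_strlen($str, 'UTF-8'), "\n";   // 6, encoding passed explicitly
```

Passing the encoding explicitly is arguably the more robust habit, since the result then does not change if the internal encoding is configured differently elsewhere.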

Note that mb_strlen is not a PHP core function; it is provided by the mbstring extension. Before using it, make sure the extension is loaded in php.ini: on Windows, the line "extension=php_mbstring.dll" must exist and must not be commented out. Otherwise, calling mb_strlen fails with an undefined function error.
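A script can check for the extension at runtime instead of failing. The sketch below uses a hypothetical helper, utf8_length(), which falls back to counting non-continuation bytes when mb_strlen is unavailable. The fallback is an illustration that assumes valid UTF-8 input, not a full replacement for mbstring.

```php
<?php
// Sketch: count UTF-8 characters even when mbstring is not loaded.
// utf8_length() is a hypothetical helper name, not a PHP built-in.
function utf8_length(string $str): int
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }
    // Fallback: count every byte that is not a UTF-8 continuation byte
    // (continuation bytes look like 10xxxxxx, i.e. 0x80 to 0xBF).
    return (int) preg_match_all('/[^\x80-\xBF]/', $str);
}

var_dump(extension_loaded('mbstring'));  // bool(true) when the extension is loaded
echo utf8_length('中文a字1符'), "\n";    // 6
```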
