- <?php
- $str = 'Hello world!';
- echo strlen($str); // prints 12
- ?>
-
However, PHP's built-in strlen measures a string by the number of bytes it occupies, and mb_strlen only counts characters correctly when it is given the right encoding (it also requires the mbstring extension). The number of bytes a Chinese character occupies depends on the encoding: 2 bytes in GBK/GB2312, and typically 3 bytes in UTF-8.
- <?php
- $str = '你好，世界！';
- echo strlen($str); // prints 12 under GBK or GB2312, 18 under UTF-8
- ?>
-
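The byte counts above can be checked directly. The following is a minimal sketch, assuming the script file is saved as UTF-8 and the mbstring extension is loaded; it re-encodes the same string as GBK with mb_convert_encoding and compares the byte lengths. The variable names are chosen only for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is enabled.
$utf8 = '你好，世界！';                              // 6 full-width characters
$gbk  = mb_convert_encoding($utf8, 'GBK', 'UTF-8'); // same text, re-encoded as GBK

echo strlen($utf8), "\n"; // 18: 3 bytes per character in UTF-8
echo strlen($gbk), "\n";  // 12: 2 bytes per character in GBK
```

The text is the same in both variables; only the encoding, and therefore the value strlen reports, differs.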
In practice, though, we usually need the number of characters in a string rather than the number of bytes it occupies. Consider the following PHP code running under UTF-8:
- <?php
- $name = '张耿畅'; // a three-character Chinese name
- $len = strlen($name);
- // Prints 'false', because three Chinese characters occupy 9 bytes in UTF-8
- if ($len >= 3 && $len <= 8) {
-     echo 'true';
- } else {
-     echo 'false';
- }
- ?>
-
So is there a convenient, practical way to get the length of a string that contains Chinese characters? One option is to use a regular expression to measure the bytes taken up by the Chinese characters, divide that by 2 under GBK/GB2312 or by 3 under UTF-8, and then add the length of the non-Chinese characters, but this is rather cumbersome. WordPress has a more elegant piece of code, shown here for reference:
- <?php
- $str = 'Hello,世界！'; // 5 letters, a comma, 2 Chinese characters, 1 full-width '！'
- preg_match_all('/./us', $str, $match);
- echo count($match[0]); // prints 9
- ?>
-
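This uses a regular expression to split the string into individual characters, then uses count to get the number of matched characters. The u modifier makes the pattern treat the string as UTF-8, so each multi-byte character is matched as a single unit, and the s modifier lets the dot match newlines as well.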
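The same idea can be wrapped in a small helper and applied to a name-length check like the one shown earlier. This is only a sketch: the function name utf8_char_count is hypothetical (it is neither a PHP built-in nor WordPress's own function), it assumes PHP 7 or later with input encoded as UTF-8, and returning 0 for invalid input is an illustrative choice.

```php
<?php
/**
 * Count the characters (code points) in a UTF-8 string.
 * Hypothetical helper; the name is chosen for illustration.
 */
function utf8_char_count(string $str): int
{
    // 'u' treats pattern and subject as UTF-8; 's' lets '.' match newlines too.
    $count = preg_match_all('/./us', $str, $match);

    // preg_match_all() returns false on failure, e.g. for invalid UTF-8 input.
    return $count === false ? 0 : $count;
}

$name = '张耿畅';               // three Chinese characters
$len  = utf8_char_count($name); // 3, although strlen() reports 9 bytes

echo ($len >= 3 && $len <= 8) ? 'true' : 'false'; // prints "true"
```

Where the mbstring extension is available, mb_strlen($name, 'UTF-8') should give the same count for valid UTF-8 input; the regular-expression version has the advantage of not depending on that extension.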