PHP mbstring: how to process Chinese strings
When several languages coexist, text is necessarily multi-byte. PHP's built-in string length function strlen cannot handle Chinese strings correctly, because it only returns the number of bytes a string occupies. For GB2312-encoded Chinese, strlen returns twice the number of Chinese characters; for UTF-8-encoded Chinese, where a common Chinese character occupies three bytes, it returns three times the number.
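To make the byte-counting behaviour concrete, here is a minimal sketch. It assumes PHP 7 or later, because it writes the sample text with "\u{...}" escapes so that the bytes are UTF-8 no matter how the file itself is saved; the variable names are only illustrative.

```php
<?php
// "\u{4E2D}\u{6587}" is the two-character string "中文" (assumed sample text).
$str = "\u{4E2D}\u{6587}";

// strlen counts bytes: each of these characters takes 3 bytes in UTF-8.
$byteCount = strlen($str);
echo "strlen: ", $byteCount, "\n";        // 6

// Mixed text: 2 Chinese characters (3 bytes each) plus 3 ASCII bytes.
$mixed = $str . "abc";
$mixedCount = strlen($mixed);
echo "strlen: ", $mixedCount, "\n";       // 9
```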
The mbstring extension solves this problem. mb_strlen is used like strlen, except that it takes an optional second parameter specifying the character encoding. For example, to get the length of the UTF-8 string $str, call mb_strlen($str, 'UTF-8'). If the second parameter is omitted, PHP's internal encoding is used. The internal encoding can be read with the mb_internal_encoding() function. There are two ways to set the internal encoding:
1. Set mbstring.internal_encoding = UTF-8 in php.ini (this setting is deprecated as of PHP 5.6 in favor of default_charset)
2. Call mb_internal_encoding("GBK")
mbstring also provides several functions for cutting strings. mb_substr cuts by characters, while mb_strcut cuts by bytes; neither produces half a character. The length argument also behaves differently: mb_substr returns exactly the requested number of characters (when the string is long enough), whereas mb_strcut returns at most the requested number of bytes, so its result can be shorter than the requested length. See the example below:
<?php
// Save this file as UTF-8. The string reads
// "I am a fairly long string of Chinese" followed by a URL.
$str = '我是一串比较长的中文-www.jefflei.com';
// Length 6 is counted in characters.
echo "mb_substr: " . mb_substr($str, 0, 6, 'utf-8');
echo "\n";
// Length 6 is counted in bytes, i.e. two 3-byte characters.
echo "mb_strcut: " . mb_strcut($str, 0, 6, 'utf-8');
?>

The output is as follows:
mb_substr: 我是一串比较
mb_strcut: 我是
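The difference in how the two functions count length shows most clearly when the byte limit falls in the middle of a character. The sketch below assumes the mbstring extension is loaded and PHP 7 or later; the sample string and variable names are illustrative only.

```php
<?php
// "中文字符串": five Chinese characters, 15 bytes in UTF-8.
$str = "\u{4E2D}\u{6587}\u{5B57}\u{7B26}\u{4E32}";

// mb_substr: the length is in characters, so 4 characters come back.
$bySubstr = mb_substr($str, 0, 4, 'UTF-8');

// mb_strcut: the length is in bytes. Byte 4 falls inside the second
// character, so the cut backs off to the boundary and keeps 3 bytes.
$byStrcut = mb_strcut($str, 0, 4, 'UTF-8');

// Plain substr cuts blindly at byte 4 and leaves a broken character.
$byBytes = substr($str, 0, 4);

echo mb_strlen($bySubstr, 'UTF-8'), " characters, ", strlen($bySubstr), " bytes\n";
echo mb_strlen($byStrcut, 'UTF-8'), " character, ", strlen($byStrcut), " bytes\n";
echo mb_check_encoding($byBytes, 'UTF-8') ? "valid" : "invalid", " UTF-8 from substr\n";
```

This makes mb_strcut a reasonable choice when the limit is a storage size in bytes, and mb_substr when the limit is a visible number of characters.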
Note that mbstring is not part of the PHP core. Before using these functions, make sure mbstring support has been compiled into PHP:
(1) pass --enable-mbstring to configure at compile time
(2) edit /usr/local/lib/php.ini:
default_charset = "zh-cn"
mbstring.language = zh-cn
mbstring.internal_encoding = zh-cn
Here zh-cn is a language code, which suits mbstring.language; default_charset and mbstring.internal_encoding normally expect a character encoding name such as UTF-8 or GBK.
The mbstring extension contains a great deal more. It also includes e-mail functions such as mb_send_mail.