Handling Chinese strings in PHP with the mbstring functions

Source: Internet
Author: User
This article covers how PHP's mbstring functions handle Chinese strings. Supporting several languages side by side means dealing with multi-byte characters. PHP's built-in string length function strlen cannot process Chinese strings correctly: it only returns the number of bytes the string occupies. For GB2312-encoded Chinese, strlen returns twice the number of Chinese characters. In UTF-8, ASCII characters take one byte and common Chinese characters take three, so for typical mixed text strlen returns between one and three times the character count.
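
The byte-versus-character gap can be seen directly. This is a minimal sketch, assuming the file is saved as UTF-8 and mbstring is enabled; the variable names are illustrative, and GBK (a superset of GB2312) stands in for the GB2312 case.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is enabled.
$utf8 = '中文字符串';                                 // five Chinese characters ("Chinese string")
$gbk  = mb_convert_encoding($utf8, 'GBK', 'UTF-8');  // same text re-encoded as GBK

$utf8Bytes = strlen($utf8);   // strlen counts bytes: 3 per character here
$gbkBytes  = strlen($gbk);    // 2 bytes per character in GBK

echo "UTF-8 bytes: $utf8Bytes\n";   // 15
echo "GBK bytes: $gbkBytes\n";      // 10
```

Neither number is the character count (5), which is why strlen alone cannot be used to measure Chinese text.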

The mbstring functions solve this problem. mb_strlen is used like strlen, except that it takes an optional second parameter specifying the character encoding. For example, to get the length of a UTF-8 string $str, call mb_strlen($str, 'UTF-8'). If the second parameter is omitted, PHP's internal encoding is used. The current internal encoding can be read with mb_internal_encoding(). There are two ways to set it:

1. Set mbstring.internal_encoding = UTF-8 in php.ini (deprecated since PHP 5.6 in favor of default_charset)
2. Call mb_internal_encoding("GBK")

Besides mb_strlen, mbstring provides several substring functions. mb_substr cuts by characters, while mb_strcut cuts by bytes but never returns half a character. The two therefore treat the length argument differently: the result of mb_strcut occupies at most that many bytes (its strlen is less than or equal to the length), while the result of mb_substr contains exactly that many characters, provided the string is long enough. See the example below:

 
 
  <?php
  // "I am a fairly long string of Chinese" plus a URL; save this file as UTF-8
  $str = '我是一串比较长的中文-www.jefflei.com';
  echo "mb_substr: " . mb_substr($str, 0, 6, 'UTF-8');   // first 6 characters
  echo "\n";
  echo "mb_strcut: " . mb_strcut($str, 0, 6, 'UTF-8');   // at most 6 bytes
  ?>

The output is as follows:
mb_substr: 我是一串比较
mb_strcut: 我是

Note that mbstring is not part of the PHP core. Before using these functions, make sure mbstring support is built into PHP:
(1) pass --enable-mbstring to configure when compiling PHP
(2) edit /usr/local/lib/php.ini:
default_charset = "UTF-8"
mbstring.language = zh-cn
mbstring.internal_encoding = UTF-8
Here zh-cn is a language code accepted by mbstring.language; the two encoding settings take an encoding name such as UTF-8 or GBK, not a language code.
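
Whether the extension is actually available can also be checked at runtime. A minimal sketch (the variable name and messages are illustrative):

```php
<?php
// Check at runtime whether mbstring is available before relying on it.
$hasMbstring = extension_loaded('mbstring');

if ($hasMbstring) {
    echo "mbstring is loaded; internal encoding: ", mb_internal_encoding(), "\n";
} else {
    echo "mbstring is missing; rebuild PHP with --enable-mbstring\n";
}
```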

The mbstring library covers much more than this, including email functions such as mb_send_mail.

