Truncating mixed Chinese and English strings with PHP's mb_strwidth

Source: Internet
Author: User
For simple string truncation we usually use substr, or mb_substr to cut by characters. Truncating a string that mixes Chinese and English is not that simple, though: a Chinese character takes up two columns of width where an ASCII letter takes up one, so the same number of characters does not give the same width. The usual workarounds involve encoding checks, such as ASCII ranges, hexadecimal byte values, regular expression matching, and counting in a loop. None of that is needed here, because PHP has less commonly used built-in functions for the job: mb_strwidth and mb_strimwidth.
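To make the difference concrete, here is a minimal sketch comparing a byte-based cut, a character-based cut, and the resulting widths. The sample strings and variable names are made up for illustration, and the code assumes the mbstring extension is loaded and the file is saved as UTF-8.

```php
<?php
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.
$english = 'abcdefgh';        // hypothetical sample strings
$chinese = '中文字符串截取';

// substr() counts bytes: 4 bytes is one Chinese character plus a broken fragment.
$byteCut = substr($chinese, 0, 4);
$byteCutIsValid = mb_check_encoding($byteCut, 'UTF-8');

// mb_substr() counts characters, so both cuts are 4 characters long...
$enCut = mb_substr($english, 0, 4, 'UTF-8');
$zhCut = mb_substr($chinese, 0, 4, 'UTF-8');

// ...but they do not take up the same width.
$enWidth = mb_strwidth($enCut, 'UTF-8');
$zhWidth = mb_strwidth($zhCut, 'UTF-8');

var_dump($byteCutIsValid);
printf("%s (%d) vs %s (%d)\n", $enCut, $enWidth, $zhCut, $zhWidth);
```

Both character-based cuts contain four characters, yet the Chinese one is twice as wide, which is why a width-based function is the better fit for mixed text.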

mb_strwidth($str, $encoding) returns the width of a string.

$str: the string to measure.

$encoding: the character encoding to use, such as utf8 or gbk.

mb_strimwidth($str, $start, $width, $tail, $encoding) truncates a string to a given width.

$str: the string to truncate.

$start: where to start the truncation, counted in characters from the beginning of the string. This parameter is required; pass 0 to start at the beginning.

$width: the width to truncate to. The width of $tail counts toward it.

$tail: the string appended to the truncated result; "..." is a common choice.

$encoding: the character encoding to use.
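The parameters above can be tried out with a sketch like the following. The input string and variable names are made up for illustration; the code assumes mbstring is loaded and the file is saved as UTF-8.

```php
<?php
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.
$text = 'Hello, 世界你好'; // hypothetical input: width 7 + 8 = 15

// With a tail: the 3 columns of "..." are part of the 10, leaving 7 for the text.
$withTail = mb_strimwidth($text, 0, 10, '...', 'UTF-8');

// Without a tail: all 10 columns are available, but the next character (width 2)
// would bring the total to 11, so the result stops at width 9.
$noTail = mb_strimwidth($text, 0, 10, '', 'UTF-8');

// A string that already fits is returned unchanged and gets no tail.
$fits = mb_strimwidth('short', 0, 10, '...', 'UTF-8');

echo $withTail, "\n", $noTail, "\n", $fits, "\n";
```

Note that the result can be narrower than the requested width, because a double-width character is never split in half.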

The instance code is as follows:

 10) {// this parameter is set to start from 0 and take 10 append ..., use utf8 encoding // note the append... it will also be calculated to the length of $ str = mb_strimwidth ($ str, 0, 10 ,'... ', 'utf8');} // Finally output aaaa... 4 a, 4, 1, 2, 3, 3, 4, 2, 3, 3, 4, 2, 3, = 9 // Is it easy, some people have said why 9, not 10? // Because "ah" is followed by "ah", two Chinese characters are counted. 9 + 2 = 11 exceeds the set value. Therefore, if one character is removed, the echo $ str value is 9;

All-Chinese text works fine, but a symbol in the middle can cause trouble. When I used mb_strimwidth and mb_strwidth on titles, I found that if a title contains the “ ” symbols, PHP's mb_strwidth counts each of them as width 1. That surprised me: these are Chinese double quotation marks, not one-byte ASCII quotes, so I expected each to count as width 2. Looking them up, “ and ” are U+201C and U+201D, which are outside the range of Chinese characters. The code charts on unicode.org show that U+2000-U+206F is the General Punctuation block. Although characters in this range are displayed as wide characters in Chinese text, PHP's mb_ functions treat them as width 1, so we have no choice but to handle it ourselves. The code is as follows:

<?php
// Relies on the default internal encoding being UTF-8.
function truncString($str, $length)
{
    $countLen = 0;
    for ($i = 0; $i < mb_strlen($str); $i++) {
        $countLen += amb_strwidth(mb_substr($str, $i, 1));
        if ($countLen > $length) {
            return mb_substr($str, 0, $i);
        }
    }
    return $str;
}

function amb_strwidth($str_width)
{
    $count = 0;
    for ($i = 0; $i < mb_strlen($str_width); $i++) {
        $char = mb_substr($str_width, $i, 1);
        // Alternative that checks only the two quotation marks:
        // if ($char == "\xE2\x80\x9C" || $char == "\xE2\x80\x9D")
        // If a character in U+2000-U+206F is encountered, add 2 to the counter.
        if (preg_match('/[\x{2000}-\x{206F}]/u', $char)) {
            $count += 2;
        } else {
            $count += mb_strwidth($char);
        }
    }
    return $count;
}
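As a usage illustration, here is a self-contained variant of the same idea that also accepts a tail string, similar to mb_strimwidth. It is a sketch, not the author's code: the function names, the sample title, and the tail handling are assumptions made for this example, and it assumes mbstring is loaded, the input is valid UTF-8, and PHP 7.0 or later.

```php
<?php
// Sketch only. Function names are hypothetical, not part of PHP or the article.

// Width of one character, counting U+2000-U+206F as 2, as the article does.
function charDisplayWidth(string $char): int
{
    if (preg_match('/^[\x{2000}-\x{206F}]$/u', $char) === 1) {
        return 2;
    }
    return mb_strwidth($char, 'UTF-8');
}

// Sum of the per-character widths of a whole string.
function displayWidth(string $str): int
{
    $total = 0;
    foreach (preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY) ?: [] as $char) {
        $total += charDisplayWidth($char);
    }
    return $total;
}

// Truncate to $maxWidth columns; the tail counts toward the limit.
function truncateByDisplayWidth(string $str, int $maxWidth, string $tail = ''): string
{
    if (displayWidth($str) <= $maxWidth) {
        return $str;
    }
    $budget = $maxWidth - displayWidth($tail);
    $out = '';
    $used = 0;
    foreach (preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY) ?: [] as $char) {
        $w = charDisplayWidth($char);
        if ($used + $w > $budget) {
            break;
        }
        $out .= $char;
        $used += $w;
    }
    return $out . $tail;
}

$title = "\u{201C}PHP\u{201D}中文标题";              // hypothetical title with curly quotes
$builtin = mb_strwidth($title, 'UTF-8');            // quotes counted as 1 each
$adjusted = displayWidth($title);                   // quotes counted as 2 each
$short = truncateByDisplayWidth($title, 10, '...');

echo $builtin, ' ', $adjusted, ' ', $short, "\n";
```

The two measurements differ by exactly 2, one for each quotation mark. The sketch does not guard against a tail that is wider than the limit itself.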

Summary: This feels like going back to square one. We still end up looping over the characters and checking their encoding to count the width.

