Using PHP's mb_strwidth and mb_strimwidth to truncate mixed Chinese and English strings

Source: Internet
Author: User
Tags: mixed, strlen

mb_strwidth($str, $encoding) returns the width of a string.

$str: the string to measure

$encoding: the character encoding to use, such as UTF-8 or GBK

mb_strimwidth($str, $start, $width, $tail, $encoding) truncates a string to a given width.

$str: the string to truncate

$start: the position to start from; 0 means the beginning of the string

$width: the width to truncate to

$tail: the string appended after the truncated text, commonly ...

$encoding: the character encoding to use

The code is as follows Copy Code


. PHP
/**
 * UTF8 encoded format
 * 1 Chinese 3 bytes
 * We want 1 Chinese to occupy 2 bytes,
 * Because the position of 2 letters from the width is equivalent to 1 Chinese
 */

//test string
$str = ' aaaa ah aaaa ah ah aaa ';
Echo strlen ($STR); Output is only strlen to 25 bytes

///must specify encoding, or it will use PHP's inner Code mb_internal_encoding () to see the inner code
//////Use Mb_strwidth output string width of 20 using UTF8 encoding
Echo mb_strwidth ($str, ' utf8 ');

//Only UTF8 width greater than 10
if (Mb_strwidth ($str, ' ') >10) {
   //Here set to intercept from 0, take 10 append ..., using UTF8 encoding
   //Note the additional ... will also be computed to a length of
    $str = mb_strimwidth ($str, 0, ' ... ', ' UTF8 ');

//FINAL output aaaa ah ... 4 A, 4, 1, 2 3, 3 4+2+3=9
//is not very simple ah, some people say why 9 is not 10?
//Because the right "ah" behind or "ah", Chinese 2, 9+2=11 exceeded the set, so remove one is 9
Echo $str;

This works fine when the text is all Chinese characters, but punctuation in the middle of the string causes trouble. While using mb_strimwidth and mb_strwidth, I found that if a title contains the curly double quotes “ and ”, PHP's mb_strwidth counts each of them as width 1. That puzzled me: these are the Chinese double quotes, so logically they should be wide characters with a width of 2. Looking them up, their Unicode code points are U+201C and U+201D, which lie outside the range of Chinese characters. The code charts on unicode.org show that U+2000-U+206F is the General Punctuation block. Characters in this range are displayed as wide characters in Chinese text, but the PHP mb_ functions count them as width 1, so the only option is to handle them yourself.
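The behavior described above can be checked directly. The following is a small sketch, assuming PHP 7.2 or later (for mb_ord and the "\u{...}" escape) with the mbstring extension; the array labels are only for display.

```php
<?php
// Compare the width mbstring reports for a Chinese character
// and for the two curly quotes discussed above.
$samples = [
    'ideograph'   => '啊',
    'left quote'  => "\u{201C}",
    'right quote' => "\u{201D}",
];

foreach ($samples as $label => $char) {
    printf(
        "%-11s U+%04X width=%d\n",
        $label,
        mb_ord($char, 'UTF-8'),
        mb_strwidth($char, 'UTF-8')
    );
}
// The ideograph is reported with width 2, both quotes with width 1.
```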

The code is as follows:

// Return the longest prefix of $str whose adjusted width does not exceed $length.
function truncString($str, $length)
{
    $countLen = 0;
    for ($i = 0; $i < mb_strlen($str); $i++) {
        $countLen += amb_strwidth(mb_substr($str, $i, 1));
        if ($countLen > $length) {
            return mb_substr($str, 0, $i);
        }
    }
    return $str;
}

// Like mb_strwidth, but counts characters in U+2000-U+206F as width 2.
function amb_strwidth($str_width)
{
    $count = 0;
    for ($i = 0; $i < mb_strlen($str_width); $i++) {
        // if (mb_substr($str_width, $i, 1) == "\xe2\x80\x9c" || mb_substr($str_width, $i, 1) == "\xe2\x80\x9d")
        // If the character falls within U+2000-U+206F, add 2 to the counter
        if (preg_match('/[\x{2000}-\x{206F}]/u', mb_substr($str_width, $i, 1))) {
            $count += 2;
        } else {
            $count += mb_strwidth(mb_substr($str_width, $i, 1));
        }
    }
    return $count;
}
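A more compact variant of the same idea is sketched below. The function names display_width and truncate_to_width are hypothetical and not from the original article, and the sketch assumes UTF-8 input with the mbstring extension loaded. Instead of testing each character separately, it lets mb_strwidth measure the string and adds 1 for every General Punctuation character, since each of those is counted as 1 when the article wants 2.

```php
<?php
// Hypothetical helper: mb_strwidth plus one extra unit for every
// character in U+2000-U+206F, which mb_strwidth counts as width 1.
function display_width(string $s): int
{
    $extra = preg_match_all('/[\x{2000}-\x{206F}]/u', $s);
    return mb_strwidth($s, 'UTF-8') + (int) $extra;
}

// Hypothetical helper: longest prefix whose adjusted width fits $maxWidth.
function truncate_to_width(string $s, int $maxWidth): string
{
    $out  = '';
    $used = 0;
    // Split into individual UTF-8 characters.
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    foreach ($chars as $char) {
        $w = display_width($char);
        if ($used + $w > $maxWidth) {
            break;
        }
        $out  .= $char;
        $used += $w;
    }
    return $out;
}

$title = "\u{201C}你好\u{201D}world";
echo display_width($title), "\n";        // 13: quotes 2+2, Chinese 2+2, letters 5
echo truncate_to_width($title, 6), "\n"; // the left quote plus 你好
```

Treating the whole U+2000-U+206F block as wide follows the article's rule; the block also contains spaces and invisible formatting characters, so a narrower pattern may suit some inputs better.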

In summary, this feels like coming back to square one: in the end you still have to loop over the string and work out the width of each character yourself.
