PHP: truncating mixed Chinese and English UTF-8 strings to equal length

Source: Internet
Author: User

I needed to truncate mixed Chinese and English UTF-8 strings to equal length in PHP, but most of the code I found online either produced garbled output or did not give equal lengths. Here one Chinese character counts as a length of 1 and two English letters together count as a length of 1. For example, the two-character Chinese word for 'equal length' has length 2, and 'UTF8' also has length 2.
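The length rule above can be written down as a small function. This is only a sketch: `display_length` is a hypothetical helper name, not part of the original post, and it assumes every non-ASCII character in the string is a full-width Chinese character.

```php
<?php
// Hypothetical helper (not from the original post): length in
// "Chinese character" units. Assumes every non-ASCII character
// is a full-width Chinese character.
function display_length(string $s): float
{
    $chars = mb_strlen($s, 'UTF-8');               // number of characters
    $ascii = preg_match_all('/[\x00-\x7F]/', $s);  // number of ASCII characters
    return ($chars - $ascii) + $ascii / 2;         // 1 per Chinese char, 1/2 per ASCII char
}

var_dump(display_length('中文'));   // float(2)
var_dump(display_length('UTF8'));   // float(2)
var_dump(display_length('中a'));    // float(1.5)
```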

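In UTF-8, a Chinese character takes three bytes and an English letter takes one. substr() counts bytes, so it can cut a Chinese character in the middle and produce garbled output. mb_substr() counts characters, so it never garbles the text, but it gives the unequal-length problem described above.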

I work at the byte level; below is a simple implementation.
It only works with UTF-8 encoded strings.

PHP code
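    <?php
    /* Truncate a mixed Chinese/English UTF-8 string to an equal display length. */
    // A Chinese character (3 bytes) counts as 1; an ASCII letter, digit, or
    // punctuation mark such as [.,"\?!:_'] counts as 1/2.
    function substr_utf8($string, $start, $length)
    {   // by Aiou
        $chars = mb_substr($string, $start, null, 'UTF-8'); // count from $start onward
        $total = strlen($chars); // length in bytes
        $m = 0; // ASCII bytes seen so far
        $n = 0; // non-ASCII bytes seen so far (3 per Chinese character)
        $k = 0; // display length so far
        $l = 0; // number of characters to keep
        for ($i = 0; $i < $total && $k < $length; $i++) {
            if (ord($chars[$i]) < 0x80) { // ASCII byte
                $m++;
            } else { // one byte of a multi-byte character
                $n++;
            }
            $k = $n / 3 + $m / 2;
            $l = intdiv($n, 3) + $m; // the final cut length, complete characters only
        }
        return mb_substr($chars, 0, $l, 'UTF-8'); // character-based cut, so no garbled output
    }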

Test results:

PHP code
    $string  = "first intercept, MB_SUBSTR returns the string width is calculated by 'word'";
    $string1 = "first intercept, return the string width is calculated by 'word'";
    $string2 = 'A A D intercept, the 12345 returned is the string width is calculated by the word';



1.

PHP code
    // Print the first 1 to 20 length units of $string, one result per line.
    for ($len = 1; $len <= 20; $len++) {
        echo substr_utf8($string, 0, $len) . '<br/>';
    }


The
First
First time
First time cut
First time Intercept
First interception,
First time interception, MB
First Intercept, mb_s.
First Intercept, Mb_sub.
First Intercept, Mb_subst.
First Intercept, Mb_substr.
First interception, Mb_substr return
First Intercept, MB_SUBSTR return
First Intercept, MB_SUBSTR return.
For the first intercept, MB_SUBSTR returns
The first intercept, the MB_SUBSTR return is the word
First Intercept, MB_SUBSTR returns the character
The first intercept, MB_SUBSTR returns a string
The first intercept, MB_SUBSTR returns the string width
The first intercept, MB_SUBSTR returns the string width


2.

PHP code
    // Here substr_utf8 is called as a static method of a Utf8helper class
    // (the class definition is not shown).
    $ss = '1234567890abcdefghijklmnopqrst';
    for ($len = 1; $len <= 10; $len++) {
        echo Utf8helper::substr_utf8($ss, 0, $len) . '<br/>';
    }

12
1234
123456
12345678
1234567890
1234567890ab
1234567890abcd
1234567890abcdef
1234567890abcdefgh
1234567890abcdefghij

Length is measured in Chinese characters.
In other words, every two English letters, digits, or English punctuation marks count as one Chinese character. The result looks good.
With some changes, the same approach could be made to work with other encodings.
I have not tested its efficiency.
