Question about GBK string truncation
This is a function that truncates GBK strings to a given number of characters:
PHP code
function gb_substr($str, $len) {
    $count = 0;
    for ($i = 0; $i < strlen($str); $i++) {
        if ($count == $len) break;
        // A byte in 0x80-0xFF starts a two-byte GBK character, so skip its second byte.
        if (preg_match("/[\x80-\xff]/", substr($str, $i, 1))) {
            ++$i;
        }
        ++$count;
    }
    return substr($str, 0, $i);
}
My question is: isn't every GBK or GB2312 character two bytes? Wouldn't it be enough to truncate at length x 2 bytes?
For example, to take 3 characters: 3*2 = 6, so I would just take 6 bytes.
Is that right?
------ Solution --------------------
Use the mb_* functions. In GBK only non-ASCII characters take two bytes; ASCII characters are still 1 byte each.
------ Solution --------------------
No, because the string may contain ASCII digits or letters, which take only one byte each.
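To illustrate this answer, here is a small sketch comparing the "length x 2" idea with a character-aware cut. It assumes mbstring is loaded and the file is saved as UTF-8; the sample strings are hypothetical.

```php
<?php
// Assumption: mbstring is available and this file is UTF-8 encoded.
// Mixed sample: 1 ASCII byte + 3 two-byte GBK characters = 7 bytes.
$gbk = mb_convert_encoding('a中文字', 'GBK', 'UTF-8');

$want = 3; // number of characters to keep

// Naive approach from the question: take $want * 2 bytes.
// That is 'a' + 2 full characters + the first byte of the third one.
$naive = substr($gbk, 0, $want * 2);

// Character-aware approach: 'a' + 2 full characters = 5 bytes.
$correct = mb_substr($gbk, 0, $want, 'GBK');

echo strlen($naive), " bytes vs ", strlen($correct), " bytes\n";

// With pure ASCII input the naive cut returns too many characters instead.
$ascii = substr('abcdef', 0, $want * 2); // 6 characters, not 3
```

The naive result ends with a dangling lead byte, which typically shows up as a garbled character when displayed.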