In UTF-8, a commonly used Chinese character takes three bytes.
<?php
// File encoding: UTF-8
echo strlen('测试文本a测试文本');
echo '-';
echo mb_strlen('测试文本a测试文本', 'utf-8');
?>
Output: 25-9
In GB2312, a Chinese character takes two bytes.
<?php
// File encoding: GB2312
echo strlen('测试文本a测试文本');
echo '-';
echo mb_strlen('测试文本a测试文本', 'gb2312');
?>
Output: 17-9
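Both results can be reproduced from a single UTF-8 source file by converting the string before measuring it. The following is a minimal sketch, assuming the mbstring extension is loaded and the file is saved as UTF-8. `EUC-CN` is the mbstring name used here for the GB2312 byte encoding, and the variable names are illustrative only.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is enabled.
$utf8 = '测试文本a测试文本';   // 8 Chinese characters plus one ASCII letter

// Re-encode the same text as GB2312 (named EUC-CN in mbstring).
$gb = mb_convert_encoding($utf8, 'EUC-CN', 'UTF-8');

// strlen() counts bytes; mb_strlen() counts characters in the named encoding.
$utf8Bytes = strlen($utf8);
$utf8Chars = mb_strlen($utf8, 'UTF-8');
$gbBytes   = strlen($gb);
$gbChars   = mb_strlen($gb, 'EUC-CN');

echo $utf8Bytes, '-', $utf8Chars, "\n";   // 25-9
echo $gbBytes, '-', $gbChars, "\n";       // 17-9
```

The character count is the same in both encodings; only the byte count changes.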
In MySQL (versions later than 5.1), a column declared as VARCHAR(10) can hold 10 characters, not 10 bytes.
Therefore, the byte length of a string depends on its encoding, and the length must be measured with the document's encoding in mind.
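Because the column limit is counted in characters, a length check before inserting should use a character count rather than `strlen()`. A minimal sketch, assuming UTF-8 input and the mbstring extension; `fitsVarchar` is a hypothetical helper name, not a MySQL or PHP API.

```php
<?php
// Hypothetical helper: does $value fit a VARCHAR($maxChars) column?
// Assumes $value is valid UTF-8 and the mbstring extension is enabled.
function fitsVarchar($value, $maxChars) {
    return mb_strlen($value, 'UTF-8') <= $maxChars;
}

$value = '测试文本a测试文本';          // 9 characters, 25 bytes in UTF-8
var_dump(fitsVarchar($value, 10));   // bool(true): 9 characters fit
var_dump(strlen($value) <= 10);      // bool(false): a byte count would wrongly reject it
```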
Simple UTF-8 string truncation (by number of characters)
<?php
/*
 * UTF-8 string truncation (counts characters, not bytes)
 * $str:    the string to truncate (assumed to be valid UTF-8)
 * $start:  zero-based index of the first character to keep
 * $length: number of characters to keep
 */
function cutStr($str, $start, $length) {
    $restr = '';
    $j = 0;                    // index of the current character
    $end = $start + $length;   // index one past the last character to keep
    $plen = strlen($str);      // length in bytes
    for ($i = 0; $i < $plen && $j < $end; $j++) {
        $c = ord($str[$i]);
        // Width of the UTF-8 sequence, read from its lead byte (1 to 4 bytes).
        $w = $c < 0x80 ? 1 : ($c < 0xE0 ? 2 : ($c < 0xF0 ? 3 : 4));
        if ($j >= $start) {
            $restr .= substr($str, $i, $w);
        }
        $i += $w;
    }
    return $restr;
}
$str = 'xinnet, September 11, September 24 (G20) leaders will hold the third financial summit in Pittsburgh, USA Today.';
echo $str;
echo '<br>';
echo cutStr($str, 0, 25);
echo '<br>';
?>
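Where the mbstring extension is available, the built-in `mb_substr()` offers character-based truncation without a hand-written loop. This is a sketch of an alternative, not the original author's method; the sample string and variable names are assumptions chosen to show multi-byte behavior.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is enabled.
$text = '测试文本a测试文本';

// mb_substr(string, start, length, encoding): start and length are in characters.
$head = mb_substr($text, 0, 5, 'UTF-8');
echo $head, "\n";   // 测试文本a

// substr() works on bytes, so it can cut a multi-byte character in half.
$broken = substr($text, 0, 4);
var_dump(mb_check_encoding($broken, 'UTF-8'));   // bool(false)
```

Cutting by bytes leaves a dangling fragment of the second character, which is why truncation of UTF-8 text should count characters.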