However, when English and Chinese characters are mixed, the following problem can occur.
Suppose we have this string (the examples assume GB2312, where each Chinese character occupies two bytes):
$str = "这是一个字符串"; // "this is a string"
To take the first 10 bytes of the string, use
if (strlen($str) > 10) $str = substr($str, 0, 10) . "...";
The output of echo $str is then "这是一个字...".
Now suppose
$str = "这是1个字符串";
This string contains one half-width (single-byte) character. Run the same code:
if (strlen($str) > 10) $str = substr($str, 0, 10);
Bytes 10 and 11 of the original $str together form the Chinese character "符".
The cut therefore splits that character in two, and the truncated string ends in a garbled character.
How can this be solved? In other words, how can a long string be truncated without producing garbled output?
<?php
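The split can be reproduced with a short script. This is only a sketch under one assumption: the file is saved as UTF-8 (three bytes per Chinese character) rather than GB2312, so the cut position is 8 instead of 10. The helper isValidUtf8 is a hypothetical name introduced for the illustration.

```php
<?php
// Assumption: this file is saved as UTF-8, so each Chinese character is 3 bytes.

// Hypothetical helper: true when $s is well-formed UTF-8.
// preg_match() with the /u modifier returns false on malformed input.
function isValidUtf8($s) {
    return preg_match('//u', $s) === 1;
}

$str = "这是1个字符串";       // 6 Chinese characters + 1 ASCII byte = 19 bytes
$cut = substr($str, 0, 8);   // 3 + 3 + 1 bytes, plus only the first byte of the next character

var_dump(strlen($str));        // int(19)
var_dump(isValidUtf8($str));   // bool(true)
var_dump(isValidUtf8($cut));   // bool(false)
```

The byte-based substr() has no notion of character boundaries, which is why the last check fails.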
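// Many versions of this are in circulation; this one is for GB2312.
function substrs($content, $length = 30)
{
    // Only cut when a limit is given and the string is longer than it.
    if ($length && strlen($content) > $length)
    {
        // Count the high bytes (> 127) in the part that would be kept.
        $num = 0;
        for ($i = 0; $i < $length - 3; $i++)
        {
            if (ord($content[$i]) > 127)
            {
                $num++;
            }
        }
        // An odd count means the cut would land inside a double-byte character,
        // so one more byte is dropped.
        $content = ($num % 2 == 1)
            ? substr($content, 0, $length - 4)
            : substr($content, 0, $length - 3);
    }
    return $content;
}
?>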
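function cutstr($string, $length, $dot = '...') {
    // Short strings are returned unchanged; this also keeps the loop below in bounds.
    if (strlen($string) <= $length) {
        return $string;
    }
    $strcut = '';
    for ($i = 0; $i < $length - strlen($dot) - 1; $i++) {
        // A byte above 127 starts a double-byte character: copy it together with the next byte.
        $strcut .= ord($string[$i]) > 127 ? $string[$i] . $string[++$i] : $string[$i];
    }
    return $strcut . $dot;
}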
function cutTitle($str, $len, $tail = "") {
    $length = strlen($str);
    $lentail = strlen($tail);
    $result = "";
    if ($length > $len) {
        // Reserve room for the tail.
        $len = $len - $lentail;
        for ($i = 0; $i < $len; $i++) {
            if (ord($str[$i]) < 127) {
                $result .= $str[$i];
            } else {
                // Double-byte character: copy both bytes.
                $result .= $str[$i];
                ++$i;
                $result .= $str[$i];
            }
        }
        // If the last character overshot the limit, drop it (2 bytes) before adding the tail.
        $result = strlen($result) > $len ? substr($result, 0, -2) . $tail : $result . $tail;
    } else {
        $result = $str;
    }
    return $result;
}
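The three functions above share one idea: in GB2312 a byte above 127 belongs to a two-byte character, so the cut position must never fall between the two bytes. A minimal sketch of that idea on its own follows. The name gbSafeCut is hypothetical, and the sample bytes are written as hex escapes (intended as the GB2312 encoding of "这是1个字符串") so the example does not depend on how the file is saved.

```php
<?php
// Hypothetical helper: return the longest prefix of $bytes that is at most
// $maxBytes long and does not end in the middle of a double-byte character.
function gbSafeCut($bytes, $maxBytes) {
    $n = strlen($bytes);
    $i = 0;
    while ($i < $n) {
        // A high byte is treated as the lead byte of a 2-byte character.
        $width = ord($bytes[$i]) > 0x80 ? 2 : 1;
        if ($i + $width > $maxBytes) {
            break;              // the next character would not fit completely
        }
        $i += $width;
    }
    return substr($bytes, 0, $i);
}

// Four bytes, then ASCII "1", then four double-byte characters (13 bytes total).
$s = "\xd5\xe2\xca\xc7" . "1" . "\xb8\xf6\xd7\xd6\xb7\xfb\xb4\xae";

echo bin2hex(gbSafeCut($s, 10)), "\n";   // d5e2cac731b8f6d7d6 (9 bytes: the 10th byte would split a character)
echo strlen(gbSafeCut($s, 11)), "\n";    // 11
echo gbSafeCut("abc", 10), "\n";         // abc
```

Unlike substrs, this sketch walks the string character by character instead of counting high bytes, which keeps the logic in one place at the cost of a loop over the kept prefix.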
Below are some supplements:
1. Truncating a GB2312 Chinese string
The code is as follows:
<?php
// Truncate a Chinese string ($start and $len are byte offsets)
function mysubstr($str, $start, $len) {
    $tmpstr = "";
    $strlen = $start + $len;
    for ($i = 0; $i < $strlen; $i++) {
        if (ord(substr($str, $i, 1)) > 0xa0) {
            // Lead byte of a double-byte character: take both bytes together.
            if ($i >= $start) $tmpstr .= substr($str, $i, 2);
            $i++;
        } else {
            // Single-byte character; bytes before $start are skipped.
            if ($i >= $start) $tmpstr .= substr($str, $i, 1);
        }
    }
    return $tmpstr;
}
?>
2. Truncating UTF-8 encoded multi-byte strings
The code is as follows:
<?php
// Truncate a UTF-8 string ($from and $len are counted in characters)
function utf8Substr($str, $from, $len)
{
    // One character is either a single ASCII byte or a lead byte (0xC0-0xFF)
    // followed by its continuation bytes (0x80-0xBF). Skip up to $from characters,
    // capture up to $len characters, and discard the rest.
    return preg_replace('#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,' . $from . '}' .
        '((?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,' . $len . '}).*#s',
        '$1', $str);
}
?>
3. UTF-8, GB2312 support Chinese Character truncation Function
The Code is as follows:Copy codeThe Code is as follows: <? Php
/*
Chinese character truncation functions supported by Utf-8 and gb2312
Cut_str (string, truncation length, start length, encoding );
The default encoding format is UTF-8.
The default start length is 0.
*/Function cut_str ($ string, $ sublen, $ start = 0, $ code = 'utf-8 ')
{
If ($ code = 'utf-8 ')
{
$ Pa = "/[\ x01-\ x7f] | [\ xc2-\ xdf] [\ x80-\ xbf] | \ xe0 [\ xa0-\ xbf] [\ x80 -\ xbf] | [\ xe1-\ xef] [\ x80-\ xbf] [\ x80-\ xbf] | \ xf0 [\ x90-\ xbf] [\ x80- \ xbf] [\ x80-\ xbf] | [\ xf1-\ xf7] [\ x80-\ xbf] [\ x80-\ xbf] [\ x80-\ xbf]/ ";
Preg_match_all ($ pa, $ string, $ t_string); if (count ($ t_string [0])-$ start> $ sublen) return join ('', array_slice ($ t_string [0], $ start, $ sublen )). "... ";
Return join ('', array_slice ($ t_string [0], $ start, $ sublen ));
}
Else
{
$ Start = $ start * 2;
$ Sublen = $ sublen * 2;
$ Strlen = strlen ($ string );
$ Tmpstr = ''; for ($ I = 0; $ I <$ strlen; $ I ++)
{
If ($ I >=$ start & $ I <($ start + $ sublen ))
{
If (ord (substr ($ string, $ I, 1)> 129)
{
$ Tmpstr. = substr ($ string, $ I, 2 );
}
Else
{
$ Tmpstr. = substr ($ string, $ I, 1 );
}
}
If (ord (substr ($ string, $ I, 1)> 129) $ I ++;
}
If (strlen ($ tmpstr) <$ strlen) $ tmpstr. = "...";
Return $ tmpstr;
}
} $ Str = "the string to be intercepted by abcd ";
Echo cut_str ($ str, 8, 0, 'gb2312 ');
?>
4. BugFree string truncation function
The code is as follows:
<?php
/**
 * @package BugFree
 * @version $Id: FunctionsMain.inc.php,v 1.32 11:38:37 wwccss Exp $
 *
 *
 * Return part of a string (enhances the function substr())
 *
 * @author Chunsheng Wang
 * @param string  $String the string to cut.
 * @param int     $Length the length of the returned string.
 * @param boolean $Append whether to append "...": false|true
 * @return string the cut string.
 */
function sysSubStr($String, $Length, $Append = false)
{
    if (strlen($String) <= $Length)
    {
        return $String;
    }
    else
    {
        $StringLast = array();
        $I = 0;
        while ($I < $Length)
        {
            $StringTMP = substr($String, $I, 1);
            if (ord($StringTMP) >= 224)
            {
                // Lead byte of a 3-byte UTF-8 character.
                $StringTMP = substr($String, $I, 3);
                $I = $I + 3;
            }
            elseif (ord($StringTMP) >= 192)
            {
                // Lead byte of a 2-byte UTF-8 character.
                $StringTMP = substr($String, $I, 2);
                $I = $I + 2;
            }
            else
            {
                $I = $I + 1;
            }
            $StringLast[] = $StringTMP;
        }
        $StringLast = implode("", $StringLast);
        if ($Append)
        {
            $StringLast .= "...";
        }
        return $StringLast;
    }
}

$String = "www.baidu.com";
$Length = 18;
$Append = false;
echo sysSubStr($String, $Length, $Append);
?>
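Where the mbstring extension is available, the hand-written byte handling above can be avoided, because mb_strlen() and mb_substr() count characters rather than bytes. The following is a minimal sketch under that assumption; truncateChars is a hypothetical name, and the input is assumed to be UTF-8.

```php
<?php
// Assumption: the mbstring extension is loaded and the input is UTF-8.

// Hypothetical helper: keep at most $maxChars characters, then append $tail.
function truncateChars($str, $maxChars, $tail = '...', $encoding = 'UTF-8') {
    if (mb_strlen($str, $encoding) <= $maxChars) {
        return $str;            // short enough: return unchanged, no tail
    }
    return mb_substr($str, 0, $maxChars, $encoding) . $tail;
}

echo truncateChars("这是1个字符串", 5), "\n";   // 这是1个字...
echo truncateChars("abc", 5), "\n";            // abc
```

The limit here is a character count, not a byte count, so mixed Chinese and ASCII text is cut at the same visible position regardless of how many bytes each character uses.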