String truncation is a very common programming task, which is often used for string truncation with Chinese characters. Although it is not difficult, it takes time to write a function. here we will introduce a good string truncation function to meet basic requirements.
The code is as follows:
Function sysSubStr ($ string, $ length, $ append = false)
{
If (strlen ($ string) <= $ length)
{
Return $ string;
}
Else
{
$ I = 0;
While ($ I <$ length)
{
$ StringTMP = substr ($ string, $ I, 1 );
If (ord ($ stringTMP) >=224)
{
$ StringTMP = substr ($ string, $ I, 3 );
$ I = $ I + 3;
}
Elseif (ord ($ stringTMP) >=192)
{
$ StringTMP = substr ($ string, $ I, 2 );
$ I = $ I + 2;
}
Else
{
$ I = $ I + 1;
}
$ StringLast [] = $ stringTMP;
}
$ StringLast = implode ("", $ stringLast );
If ($ append)
{
$ StringLast. = "...";
}
Return $ stringLast;
}
}
$ String = "concise modern magic-focus on mainstream Internet technologies ";
$ Length = "27 ";
$ Append = true;
Echo sysSubStr ($ string, $ length, $ append );
// Output
// Concise modern magic-tutorial...
?>
Intercept the GB2312 Chinese string:
The code is as follows:
// Truncate a Chinese string
Function mysubstr ($ str, $ start, $ len ){
$ Tmpstr = "";
$ Strlen = $ start + $ len;
For ($ I = 0; $ I <$ strlen; $ I ++ ){
If (ord (substr ($ str, $ I, 1)> 0xa0 ){
$ Tmpstr. = substr ($ str, $ I, 2 );
$ I ++;
} Else
$ Tmpstr. = substr ($ str, $ I, 1 );
}
Return $ tmpstr;
}
?>
Truncate the UTF-8 encoded multi-byte string:
The code is as follows:
// Truncate the utf8 string
Function utf8Substr ($ str, $ from, $ len)
{
Return preg_replace ('# ^ (? : [\ X00-\ x7F] | [\ xC0-\ xFF] [\ x80-\ xBF] +) {0, '. $ from .'}'.
'((? : [\ X00-\ x7F] | [\ xC0-\ xFF] [\ x80-\ xBF] +) {0 ,'. $ len. '}). * # s ',
'$ 1', $ str );
}
?>
UTF-8, GB2312 support Chinese character truncation function:
The code is as follows:
/*
Chinese character truncation functions supported by Utf-8 and gb2312
Cut_str (string, truncation length, start length, encoding );
The default encoding format is UTF-8.
The default start length is 0.
*/
Function cut_str ($ string, $ sublen, $ start = 0, $ code = 'utf-8 ')
{
If ($ code = 'utf-8 ')
{
$ Pa = "/[\ x01-\ x7f] | [\ xc2-\ xdf] [\ x80-\ xbf] | \ xe0 [\ xa0-\ xbf] [\ x80 -\ xbf] | [\ xe1-\ xef] [\ x80-\ xbf] [\ x80-\ xbf] | \ xf0 [\ x90-\ xbf] [\ x80- \ xbf] [\ x80-\ xbf] | [\ xf1-\ xf7] [\ x80-\ xbf] [\ x80-\ xbf] [\ x80-\ xbf]/ ";
Preg_match_all ($ pa, $ string, $ t_string );
If (count ($ t_string [0])-$ start> $ sublen) return join ('', array_slice ($ t_string [0], $ start, $ sublen )). "... ";
Return join ('', array_slice ($ t_string [0], $ start, $ sublen ));
}
Else
{
$ Start = $ start * 2;
$ Sublen = $ sublen * 2;
$ Strlen = strlen ($ string );
$ Tmpstr = '';
For ($ I = 0; $ I <$ strlen; $ I ++)
{
If ($ I >=$ start & $ I <($ start + $ sublen ))
{
If (ord (substr ($ string, $ I, 1)> 129)
{
$ Tmpstr. = substr ($ string, $ I, 2 );
}
Else
{
$ Tmpstr. = substr ($ string, $ I, 1 );
}
}
If (ord (substr ($ string, $ I, 1)> 129) $ I ++;
}
If (strlen ($ tmpstr) <$ strlen) $ tmpstr. = "...";
Return $ tmpstr;
}
}
$ Str = "the string to be intercepted by abcd ";
Echo cut_str ($ str, 8, 0, 'gb2312 ');
?>