Displaying web pages almost always involves truncating strings. Smarty's built-in truncate modifier handles this, but it is only suitable for English text. With Chinese text, truncate can produce garbled output. For mixed Chinese and English strings, truncating to the same number of characters yields different display widths, so the page looks ragged and unattractive. This is because one Chinese character is roughly as wide as two English characters. In addition, truncate is not aware of GB2312, UTF-8, or other multi-byte encodings.
The improved modifier, smartTruncate (file name: modifier.smartTruncate.php):
<?php
// Returns 1 if $string is well-formed UTF-8, otherwise 0.
// Results are cached per string (keyed by its MD5 hash).
function smartDetectUTF8($string)
{
    static $result = array();
    if (!array_key_exists($key = md5($string), $result))
    {
        $utf8 = "
            /^(?:
                [\x09\x0A\x0D\x20-\x7E]             # ASCII
                | [\xC2-\xDF][\x80-\xBF]            # non-overlong 2-byte
                | \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
                | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2} # straight 3-byte
                | \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
                | \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
                | [\xF1-\xF3][\x80-\xBF]{3}         # planes 4-15
                | \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
            )+$/xs
        ";
        $result[$key] = preg_match(trim($utf8), $string);
    }
    return $result[$key];
}
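// Display length: a multi-byte character counts 1.0, a single-byte one 0.5.
function smartStrlen($string)
{
    $result = 0;
    // Bytes per non-ASCII character: 3 for UTF-8, 2 for GB2312.
    $number = smartDetectUTF8($string) ? 3 : 2;
    for ($i = 0; $i < strlen($string); $i += $bytes)
    {
        $bytes = ord(substr($string, $i, 1)) > 127 ? $number : 1;
        $result += $bytes > 1 ? 1.0 : 0.5;
    }
    return $result;
}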
// Like substr(), but $start and $length are measured in display units.
function smartSubstr($string, $start, $length = null)
{
    $result = '';
    $number = smartDetectUTF8($string) ? 3 : 2;
    if ($start < 0)
    {
        // A negative start counts back from the end of the string.
        $start = max(smartStrlen($string) + $start, 0);
    }
    // Advance $i to the byte offset that corresponds to $start.
    for ($i = 0; $i < strlen($string); $i += $bytes)
    {
        if ($start <= 0)
        {
            break;
        }
        $bytes = ord(substr($string, $i, 1)) > 127 ? $number : 1;
        $start -= $bytes > 1 ? 1.0 : 0.5;
    }
    if (is_null($length))
    {
        $result = substr($string, $i);
    }
    else
    {
        // Copy whole characters until the length budget is used up.
        for ($j = $i; $j < strlen($string); $j += $bytes)
        {
            if ($length <= 0)
            {
                break;
            }
            if (($bytes = ord(substr($string, $j, 1)) > 127 ? $number : 1) > 1)
            {
                if ($length < 1.0)
                {
                    break;
                }
                $result .= substr($string, $j, $bytes);
                $length -= 1.0;
            }
            else
            {
                $result .= substr($string, $j, 1);
                $length -= 0.5;
            }
        }
    }
    return $result;
}
// The default value of $length was lost from the listing; 80 is assumed
// here because it is the default of Smarty's built-in truncate.
function smarty_modifier_smartTruncate($string, $length = 80, $etc = '...',
                                       $break_words = false, $middle = false)
{
    if ($length == 0)
        return '';
    if (smartStrlen($string) > $length) {
        // Reserve room for the ellipsis.
        $length -= smartStrlen($etc);
        if (!$break_words && !$middle) {
            // Drop a trailing partial word so the cut falls on whitespace.
            $string = preg_replace('/\s+?(\S+)?$/', '', smartSubstr($string, 0, $length + 1));
        }
        if (!$middle) {
            return smartSubstr($string, 0, $length) . $etc;
        } else {
            return smartSubstr($string, 0, $length / 2) . $etc . smartSubstr($string, -$length / 2);
        }
    } else {
        return $string;
    }
}
?>
The code above implements everything the original truncate does and works with both GB2312 and UTF-8 encodings. When measuring length, a Chinese character counts as 1.0 and an English character as 0.5, so truncated strings no longer come out at uneven widths.
There's nothing special about the way plug-ins are used, but here's a simple test:
{$content |smarttruncate:5: "..."} ($content equals "a medium B C people D people e total F and G country H")
Show: a medium B wah c. (The length of the Chinese symbol is 1.0, the English symbol length is 0.5, and the length of the elliptical symbol is considered)
Whether the text is encoded as GB2312 or UTF-8, the result is correct, which is one of the reasons I put "smart" in the plug-in's name.
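The reason the encoding matters is that the same character occupies a different number of bytes in each one. The following sketch illustrates this with mbstring, assuming the extension is loaded and the file is saved as UTF-8; 'EUC-CN' is the name mbstring uses for the usual byte encoding of the GB2312 character set.

```php
<?php
// Assumptions: UTF-8 source file, mbstring extension available.
$utf8 = "中文";
$gb   = mb_convert_encoding($utf8, 'EUC-CN', 'UTF-8');

$utf8Bytes = strlen($utf8);   // 3 bytes per character
$gbBytes   = strlen($gb);     // 2 bytes per character

// For this sample the GB2312 bytes are not valid UTF-8,
// so a UTF-8 validity check tells the two encodings apart.
$gbLooksLikeUtf8 = mb_check_encoding($gb, 'UTF-8');
$utf8IsValid     = mb_check_encoding($utf8, 'UTF-8');

var_dump($utf8Bytes, $gbBytes, $gbLooksLikeUtf8, $utf8IsValid);
```

Detection by validity is a heuristic: some short GB2312 byte sequences happen to be well-formed UTF-8 as well, so very short strings can occasionally be misclassified.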