Truncating mixed Chinese and English strings in PHP, with UTF-8 and GBK support
Today I ran into a problem truncating strings that mix Chinese and English. In GBK each Chinese character takes two bytes, so if a string is entirely Chinese, substr() does the job as long as the cut length is even. With mixed Chinese and English text it gets troublesome, because a byte-based cut can land in the middle of a character. In my collection of saved code I found the following function, which handles the truncation well.
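The byte-splitting problem is easy to reproduce. The following is a minimal sketch, assuming the string is stored as GBK bytes (written with hex escapes so it does not depend on the encoding of the source file); the variable names are only illustrative.

```php
<?php
// "a中文" encoded as GBK: 'a' is 1 byte, each Chinese character is 2 bytes.
$gbk = "a\xD6\xD0\xCE\xC4";

// substr() counts bytes, so a 2-byte cut keeps 'a' plus only
// the first half of "中", leaving a dangling lead byte.
$cut = substr($gbk, 0, 2);

echo strlen($gbk), "\n";   // 5
echo bin2hex($cut), "\n";  // 61d6
```

The trailing `d6` is half of a character, which is what usually shows up as a garbled symbol at the end of the truncated text.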
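function get_word($string, $length, $dot = '...', $charset = 'GBK') {
    // Short enough already: return the string unchanged, without the suffix.
    if (strlen($string) <= $length) {
        return $string;
    }

    // Decode common HTML entities so that each one counts as a single character.
    $string = str_replace(array('&nbsp;', '&amp;', '&quot;', '&lt;', '&gt;'), array(' ', '&', '"', '<', '>'), $string);
    $strcut = '';

    if (strtolower($charset) == 'utf-8') {
        // $n: byte offset, $tn: byte length of the last character read,
        // $noc: width counted so far (1 for ASCII, 2 for any multi-byte character).
        $n = $tn = $noc = 0;
        while ($n < strlen($string)) {
            $t = ord($string[$n]);
            if ($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
                $tn = 1; $n++; $noc++;
            } elseif (194 <= $t && $t <= 223) {
                $tn = 2; $n += 2; $noc += 2;
            } elseif (224 <= $t && $t <= 239) {
                // 239 is included so that full-width forms (lead byte 0xEF) are counted too.
                $tn = 3; $n += 3; $noc += 2;
            } elseif (240 <= $t && $t <= 247) {
                $tn = 4; $n += 4; $noc += 2;
            } elseif (248 <= $t && $t <= 251) {
                $tn = 5; $n += 5; $noc += 2;
            } elseif ($t == 252 || $t == 253) {
                $tn = 6; $n += 6; $noc += 2;
            } else {
                // Control or stray byte: skip it without counting.
                $n++;
            }
            if ($noc >= $length) {
                break;
            }
        }
        // The last character overshot the limit, so step back over it.
        if ($noc > $length) {
            $n -= $tn;
        }
        $strcut = substr($string, 0, $n);
    } else {
        // GBK: a byte above 127 starts a two-byte character, so copy both bytes together.
        for ($i = 0; $i < $length; $i++) {
            $strcut .= ord($string[$i]) > 127 ? $string[$i] . $string[++$i] : $string[$i];
        }
    }

    return $strcut . $dot;
}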