PHP: truncating mixed Chinese and English strings (supports UTF-8 and GBK)
Today I ran into a problem truncating strings that mix Chinese and English. In GBK, each Chinese character occupies two bytes, while an ASCII character occupies one. If the string is entirely Chinese, substr() with an even byte count works, but with mixed Chinese and English text a byte offset can land in the middle of a character and produce garbled output. I found a handy function among my saved code snippets that handles truncation well and supports both UTF-8 and GBK.
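function get_word($string, $length, $dot = '..', $charset = 'gbk') {
    // Short enough already: return unchanged.
    if (strlen($string) <= $length) {
        return $string;
    }
    // Strip spaces and &nbsp;, and decode common HTML entities so each counts as one character.
    $string = str_replace(
        array(' ', '&nbsp;', '&amp;', '&quot;', '&lt;', '&gt;'),
        array('', '', '&', '"', '<', '>'),
        $string
    );
    $strcut = '';
    if (strtolower($charset) == 'utf-8') {
        // $n: byte offset, $tn: byte length of the last character, $noc: display width so far.
        // ASCII counts as width 1, every multi-byte character as width 2.
        $n = $tn = $noc = 0;
        while ($n < strlen($string)) {
            $t = ord($string[$n]);
            if ($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
                $tn = 1; $n++; $noc++;
            } elseif (194 <= $t && $t <= 223) {
                $tn = 2; $n += 2; $noc += 2;
            } elseif (224 <= $t && $t <= 239) {
                // Upper bound is 239 (0xEF) inclusive, so fullwidth punctuation is counted too.
                $tn = 3; $n += 3; $noc += 2;
            } elseif (240 <= $t && $t <= 247) {
                $tn = 4; $n += 4; $noc += 2;
            } elseif (248 <= $t && $t <= 251) {
                $tn = 5; $n += 5; $noc += 2;
            } elseif ($t == 252 || $t == 253) {
                $tn = 6; $n += 6; $noc += 2;
            } else {
                $n++;
            }
            if ($noc >= $length) {
                break;
            }
        }
        // The last character overshot the limit: drop it.
        if ($noc > $length) {
            $n -= $tn;
        }
        $strcut = substr($string, 0, $n);
    } else {
        // GBK: a byte above 127 starts a two-byte character, so copy both bytes together.
        $len = strlen($string);
        for ($i = 0; $i < $length && $i < $len; $i++) {
            if (ord($string[$i]) > 127) {
                $strcut .= substr($string, $i, 2);
                $i++;
            } else {
                $strcut .= $string[$i];
            }
        }
    }
    return $strcut . $dot;
}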