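1. String length
The following PHP code computes the length of a string that mixes Chinese and English text. One Chinese character counts as 1 unit and two English (single-byte) characters count as 1 unit; adjust the weights as needed.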
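/**
 * Get the length of a mixed Chinese/English string.
 *
 * @param string $str     Input string
 * @param string $charset Encoding of $str ('utf-8' or 'gb2312')
 * @return int Length, where 1 Chinese character = 1 unit and 2 English characters = 1 unit
 */
function strLength($str, $charset = 'utf-8') {
    // Work in GB2312, where every Chinese character is exactly 2 bytes.
    // Note: iconv() fails on characters that GB2312 cannot represent.
    if ($charset == 'utf-8') $str = iconv('utf-8', 'gb2312', $str);
    $num = strlen($str);                 // total number of bytes
    $cnNum = 0;                          // number of double-byte (Chinese) characters
    for ($i = 0; $i < $num; $i++) {
        // A byte above 127 is the lead byte of a double-byte character.
        if (ord(substr($str, $i, 1)) > 127) {
            $cnNum++;
            $i++;                        // skip the trail byte
        }
    }
    $enNum = $num - ($cnNum * 2);        // remaining single-byte characters
    $number = ($enNum / 2) + $cnNum;
    return (int) ceil($number);
}

// Test: each call prints 15, assuming this source file is saved in a
// double-byte Chinese encoding (the strings are passed as 'gb2312', so no conversion happens)
$str1 = '測試測試測試測試測試測試測試測';
$str2 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
$str3 = 'aa測試aa測試aa測試aa測試aaaaaa';
echo strLength($str1, 'gb2312');
echo strLength($str2, 'gb2312');
echo strLength($str3, 'gb2312');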
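The same weighting can also be computed directly on UTF-8 input, without converting to GB2312 first. The following is only a minimal sketch under that assumption: `mixedLength` is a hypothetical name that is not part of the original code, and it treats every multi-byte UTF-8 character as "Chinese" (1 unit) and every single-byte character as "English" (half a unit).

```php
<?php
// Hypothetical helper (not from the original post): weighted length of a UTF-8 string.
function mixedLength(string $str): int
{
    // With the /u modifier, an empty pattern splits the string into UTF-8 characters.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return 0; // assumption: treat invalid UTF-8 as length 0
    }
    $units = 0.0;
    foreach ($chars as $ch) {
        // Multi-byte character = 1 unit, single-byte character = 0.5 unit.
        $units += (strlen($ch) > 1) ? 1.0 : 0.5;
    }
    return (int) ceil($units);
}

echo mixedLength(str_repeat('測試', 7) . '測'), "\n"; // 15 Chinese characters
echo mixedLength(str_repeat('a', 30)), "\n";          // 30 English characters
```

This assumes the source file and the input are UTF-8. Because it relies only on PCRE, it needs neither the iconv nor the mbstring extension.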
2. Substring functions
UTF-8 encoding: in UTF-8, one Chinese character occupies 3 bytes.
function msubstr($str, $start, $len) {
    // $start and $len are byte offsets; $start must fall on a character boundary.
    // Every non-ASCII character is assumed to be 3 bytes long.
    $tmpstr = "";
    $strlen = $start + $len;
    for ($i = $start; $i < $strlen; $i++) {
        if (ord(substr($str, $i, 1)) > 127) {
            $tmpstr .= substr($str, $i, 3);   // copy the whole 3-byte character
            $i += 2;
        } else {
            $tmpstr .= substr($str, $i, 1);
        }
    }
    return $tmpstr;
}
echo msubstr("一二三天下致公english", 0, 10);   // prints 一二三天
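The 3-byte rule holds for common Chinese characters, but UTF-8 sequences can be 1 to 4 bytes long. As a hedged sketch of how the same byte-walking idea could handle that, the variant below reads the sequence length from each lead byte and counts `$start` and `$len` in characters rather than bytes. The names `utf8SeqLen` and `utf8Substr` are hypothetical, and the code assumes valid UTF-8 input with non-negative arguments.

```php
<?php
// Hypothetical helper: number of bytes in the UTF-8 sequence that starts with $lead.
function utf8SeqLen(int $lead): int
{
    if ($lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if (($lead & 0xE0) === 0xC0) return 2; // 110xxxxx
    if (($lead & 0xF0) === 0xE0) return 3; // 1110xxxx (most Chinese characters)
    if (($lead & 0xF8) === 0xF0) return 4; // 11110xxx
    return 1;                              // stray byte: step over it one byte at a time
}

// Hypothetical helper: take $len characters starting at character index $start.
function utf8Substr(string $str, int $start, int $len): string
{
    $out = '';
    $pos = 0;                 // current byte offset
    $index = 0;               // current character index
    $n = strlen($str);
    while ($pos < $n && $index < $start + $len) {
        $size = utf8SeqLen(ord($str[$pos]));
        if ($index >= $start) {
            $out .= substr($str, $pos, $size);
        }
        $pos += $size;
        $index++;
    }
    return $out;
}

echo utf8Substr("一二三天下致公english", 2, 6), "\n"; // prints 三天下致公e
```

Counting in characters avoids the situation where a byte offset lands in the middle of a character.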
GB2312 encoding: in GB2312, one Chinese character occupies 2 bytes.
<?php
function msubstr($str, $start, $len) {
    // $start and $len are byte offsets; $start must fall on a character boundary.
    $tmpstr = "";
    $strlen = $start + $len;
    // If the string contains a run of two or more digits or whitespace characters,
    // shorten the byte limit by 2.
    if (preg_match('/[\d\s]{2,}/', $str)) { $strlen = $strlen - 2; }
    for ($i = $start; $i < $strlen; $i++) {
        if (ord(substr($str, $i, 1)) > 0xa0) {
            $tmpstr .= substr($str, $i, 2);   // copy the whole 2-byte character
            $i++;
        } else {
            $tmpstr .= substr($str, $i, 1);
        }
    }
    return $tmpstr;
}
?>
A function with good encoding compatibility
function cc_msubstr($str, $start, $length, $charset = "utf-8", $suffix = true)
{
    // Prefer the multibyte-aware extensions when they are available.
    // Note: $suffix is only applied in the regex fallback below.
    if (function_exists("mb_substr")) {
        return mb_substr($str, $start, $length, $charset);
    } elseif (function_exists('iconv_substr')) {
        return iconv_substr($str, $start, $length, $charset);
    }
    // Fallback: each pattern matches one character in the given encoding.
    $re['utf-8']  = "/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/";
    $re['gb2312'] = "/[\x01-\x7f]|[\xb0-\xf7][\xa0-\xfe]/";
    $re['gbk']    = "/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/";
    $re['big5']   = "/[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])/";
    preg_match_all($re[$charset], $str, $match);
    $slice = implode("", array_slice($match[0], $start, $length));
    if ($suffix) return $slice . "…";
    return $slice;
}
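In the function above, the "…" suffix is appended in the fallback branch even when nothing was cut off. One possible refinement, shown here only as a sketch for UTF-8 input, is to append the suffix only when the string was actually shortened. `truncateUtf8` is a hypothetical name, and the code assumes the source file and the input are UTF-8.

```php
<?php
// Hypothetical helper: keep the first $length characters, add $suffix only if truncated.
function truncateUtf8(string $str, int $length, string $suffix = '…'): string
{
    // /u makes "." match one UTF-8 character; /s lets it match newlines as well.
    if (preg_match_all('/./us', $str, $match) === false) {
        return $str; // assumption: leave invalid UTF-8 untouched
    }
    $chars = $match[0];
    if (count($chars) <= $length) {
        return $str; // nothing was cut, so no suffix
    }
    return implode('', array_slice($chars, 0, $length)) . $suffix;
}

echo truncateUtf8('一二三天下致公english', 4), "\n"; // prints 一二三天…
echo truncateUtf8('abc', 5), "\n";                   // prints abc
```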