strlen()
PHP strlen() function
Definition and usage
The strlen() function returns the length of a string in bytes.
Syntax
strlen(string)
Parameter: string
Description: Required. Specifies the string to measure.
The code is as follows:

<?php
$str = '中文a字1符';
echo strlen($str);              // outputs 14
echo '<br/>';
echo mb_strlen($str, 'UTF-8');  // outputs 6
?>
Result analysis: strlen counts bytes, and each Chinese character in UTF-8 occupies 3 bytes, so the length of "中文a字1符" is 3*4+2=14.
When mb_strlen is used with the encoding set to UTF-8, each Chinese character counts as 1, so the length of "中文a字1符" is 6.
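To make the 3*4+2 arithmetic visible, here is a minimal sketch that lists how many bytes each character occupies. The helper name byte_breakdown is an assumption made for illustration and does not come from the original article; the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper (not from the original article): pairs each character
// of a UTF-8 string with the number of bytes it occupies.
function byte_breakdown(string $str): array
{
    // The /u modifier makes PCRE split on UTF-8 characters rather than bytes.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    $result = [];
    foreach ($chars as $ch) {
        $result[] = [$ch, strlen($ch)];
    }
    return $result;
}

$str = '中文a字1符';
foreach (byte_breakdown($str) as [$ch, $bytes]) {
    echo $ch, ' => ', $bytes, PHP_EOL;
}
```

The four Chinese characters each report 3 bytes and the two ASCII characters 1 byte, which sums to the 14 that strlen returns.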
mb_strlen() function
Note that mb_strlen is not a PHP core function: it is provided by the mbstring extension, which must be loaded before you use it. On Windows, make sure the line "extension=php_mbstring.dll" exists in php.ini and is not commented out; otherwise the call fails with an undefined function error.
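If you cannot be sure the extension is loaded, one option is to check at run time and fall back to PCRE. The following is only a sketch: the wrapper name utf8_char_count is an assumption, and the fallback assumes the input is valid UTF-8.

```php
<?php
// Hypothetical wrapper (the name is an assumption): counts UTF-8 characters,
// using mb_strlen() when mbstring is loaded and PCRE otherwise.
function utf8_char_count(string $str): int
{
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'UTF-8');
    }
    // Fallback: with /u each "." matches one UTF-8 character;
    // /s lets "." match newlines as well.
    $count = preg_match_all('/./us', $str);
    return $count === false ? 0 : $count;
}

echo extension_loaded('mbstring') ? 'mbstring is loaded' : 'mbstring is missing', PHP_EOL;
echo utf8_char_count('中文a字1符'), PHP_EOL; // 6
```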
The code is as follows:

<?php
$str = '中文a字1符';
// Calculated as follows
echo (strlen($str) + mb_strlen($str, 'UTF-8')) / 2;
// Output: 10
?>
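For "中文a字1符", strlen($str) is 14 and mb_strlen($str, 'UTF-8') is 6, so the string occupies (14+6)/2 = 10 display positions, with each Chinese character taking 2 positions and each ASCII character taking 1. This works because a string with c Chinese characters and a ASCII characters has 3c+a bytes and c+a characters, and the average of the two is 2c+a.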
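The mbstring extension also provides mb_strwidth(), which counts full-width characters as 2 and half-width characters as 1. The sketch below compares it with the averaging trick; note that the averaging trick is only valid when every non-ASCII character is exactly 3 bytes in UTF-8, which is an assumption about the input.

```php
<?php
$str = '中文a字1符';

// The averaging trick from the text: (bytes + characters) / 2.
$byAverage = (strlen($str) + mb_strlen($str, 'UTF-8')) / 2;

// mb_strwidth() counts full-width characters as 2 and half-width as 1.
$byWidth = mb_strwidth($str, 'UTF-8');

echo $byAverage, PHP_EOL; // 10
echo $byWidth, PHP_EOL;   // 10
```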
The difference between the two
The code is as follows:

<?php
// The file is encoded as UTF-8 when tested
$str = '中文a字1符';
echo strlen($str) . '<br>';               // 14
echo mb_strlen($str, 'UTF-8') . '<br>';   // 6
echo mb_strlen($str, 'GBK') . '<br>';     // 8
echo mb_strlen($str, 'gb2312') . '<br>';  // 10
?>
Result analysis: as before, strlen counts 14 bytes (3*4+2), and mb_strlen with UTF-8 counts each Chinese character as 1, giving 6.
The GBK and gb2312 results (8 and 10) come from decoding UTF-8 bytes under the wrong encoding, so they do not correspond to a real character count. The encoding passed to mb_strlen must match the actual encoding of the string.
Although the functions above handle some simple cases of mixed Chinese and English text, they are often not enough in practice, so here is a better method.
The PHP code for getting the length of a mixed Chinese/English string is as follows. It counts 1 Chinese character as 1 unit and 2 English characters as 1 unit, and can be modified as needed.
The code is as follows:

/**
 * Get the length of a mixed Chinese/English string in PHP
 * @param string $str     the string
 * @param string $charset the encoding of $str
 * @return float length, where 1 Chinese = 1 unit and 2 English = 1 unit
 */
function strLength($str, $charset = 'utf-8')
{
    // Work in GB2312, where every Chinese character is exactly 2 bytes
    if ($charset == 'utf-8') {
        $str = iconv('utf-8', 'gb2312', $str);
    }
    $num = strlen($str);
    $cnNum = 0;
    for ($i = 0; $i < $num; $i++) {
        // A byte above 127 is the lead byte of a 2-byte Chinese character
        if (ord(substr($str, $i, 1)) > 127) {
            $cnNum++;
            $i++; // skip the trail byte
        }
    }
    $enNum = $num - ($cnNum * 2);
    $number = ($enNum / 2) + $cnNum;
    return ceil($number);
}

// Test: each call outputs 15 (sample strings chosen so that each yields 15).
// 'gb2312' is passed on the assumption that this file is saved as GB2312;
// pass 'utf-8' instead if the file is saved as UTF-8.
$str1 = '测试测试测试测试测试测试测试测';
$str2 = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
$str3 = 'aa测试aa测试aa测试aa测试aaaaaa';
echo strLength($str1, 'gb2312');
echo strLength($str2, 'gb2312');
echo strLength($str3, 'gb2312');
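The same counting rule can also be applied to a UTF-8 string directly, without converting to GB2312 first. The following is a sketch rather than the original author's function: the name mixed_length is hypothetical, and it assumes that every non-ASCII character should count as one "Chinese" unit.

```php
<?php
// Hypothetical UTF-8 variant (not the original author's function).
// Rule kept from the text: 1 non-ASCII character = 1 unit, 2 ASCII = 1 unit.
function mixed_length(string $str): int
{
    // Count characters outside the ASCII range; /u enables UTF-8 matching.
    $wide = preg_match_all('/[^\x00-\x7F]/u', $str);
    // Count every character, then derive the ASCII count.
    $total = preg_match_all('/./us', $str);
    $ascii = $total - $wide;
    // ceil() returns a float, so cast the result to int.
    return (int) ceil($ascii / 2 + $wide);
}

echo mixed_length(str_repeat('a', 30)), PHP_EOL; // 15
echo mixed_length('a测'), PHP_EOL;               // 2
```

Because it relies only on PCRE, this version needs neither iconv nor mbstring.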
String truncation functions
UTF-8 encoding: in UTF-8, a Chinese character occupies 3 bytes.
The code is as follows:

function msubstr($str, $start, $len)
{
    // $start and $len are byte counts; $start must fall on a character boundary
    $tmpstr = "";
    $strlen = $start + $len;
    for ($i = $start; $i < $strlen; $i++) {
        if (ord(substr($str, $i, 1)) > 127) {
            // Treat any non-ASCII lead byte as the start of a 3-byte character
            $tmpstr .= substr($str, $i, 3);
            $i += 2;
        } else {
            $tmpstr .= substr($str, $i, 1);
        }
    }
    return $tmpstr;
}

echo msubstr("123 Days to Public 中文版", 0, 10);
GB2312 encoding: in GB2312, a Chinese character occupies 2 bytes.
The code is as follows:

<?php
function msubstr($str, $start, $len)
{
    // $start and $len are byte counts; $start must fall on a character boundary
    $tmpstr = "";
    $strlen = $start + $len;
    // Shorten the cut by 2 bytes when the string contains a run of digits or whitespace
    if (preg_match('/[\d\s]{2,}/', $str)) {
        $strlen = $strlen - 2;
    }
    for ($i = $start; $i < $strlen; $i++) {
        if (ord(substr($str, $i, 1)) > 0xa0) {
            // GB2312 lead byte: copy the 2-byte character whole
            $tmpstr .= substr($str, $i, 2);
            $i++;
        } else {
            $tmpstr .= substr($str, $i, 1);
        }
    }
    return $tmpstr;
}
?>
A function with good encoding compatibility
The code is as follows:

function cc_msubstr($str, $start, $length, $charset = "utf-8", $suffix = true)
{
    // Prefer the multibyte-aware built-ins when they are available
    if (function_exists("mb_substr")) {
        return mb_substr($str, $start, $length, $charset);
    } elseif (function_exists('iconv_substr')) {
        return iconv_substr($str, $start, $length, $charset);
    }
    // Fallback: one regular expression per encoding, each matching a single character
    $re['utf-8'] = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/';
    $re['gb2312'] = '/[\x01-\x7f]|[\xb0-\xf7][\xa0-\xfe]/';
    $re['gbk'] = '/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/';
    $re['big5'] = '/[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])/';
    preg_match_all($re[$charset], $str, $match);
    $slice = join("", array_slice($match[0], $start, $length));
    // The suffix is only appended on this fallback path
    if ($suffix) {
        return $slice . "...";
    }
    return $slice;
}
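When mbstring is available, character-based truncation can be written directly with mb_strlen and mb_substr. The sketch below assumes UTF-8 input and appends the suffix only when something was actually cut off; the function name truncate_chars and its parameters are hypothetical and not part of the original article.

```php
<?php
// Hypothetical helper (the name is an assumption): keeps the first $length
// characters and appends $suffix only if the string was shortened.
function truncate_chars(string $str, int $length, string $suffix = '...', string $charset = 'UTF-8'): string
{
    if (mb_strlen($str, $charset) <= $length) {
        return $str; // nothing to cut, so no suffix
    }
    return mb_substr($str, 0, $length, $charset) . $suffix;
}

echo truncate_chars('中文a字1符', 3), PHP_EOL; // 中文a...
echo truncate_chars('abc', 5), PHP_EOL;        // abc
```

Counting in characters rather than bytes means a multibyte character is never split in the middle.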