This article covers substring extraction in PHP, starting with the built-in substr function and ending with methods that handle Chinese, English, and mixed Chinese-English strings. It is offered as a reference for anyone who needs it.
substr: takes part of a string.
Syntax: string substr(string string, int start, int [length]);
Return value: string
Function type: Data processing
Description
This function returns the part of string that begins at position start and is length characters long. Positions are counted in bytes, starting from 0. If start is negative, the position is counted from the end of the string. If the optional parameter length is given and is negative, the returned part stops that many characters before the end of the string.
Usage examples
The code is as follows:
<?php
echo substr("abcdef", 1, 3);  // returns "bcd"
echo substr("abcdef", -2);    // returns "ef"
echo substr("abcdef", -3, 1); // returns "d"
echo substr("abcdef", 1, -1); // returns "bcde"
?>
The approach above works only for English (single-byte) text; it does not support Chinese.
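To see why a byte-based cut is unsafe for Chinese text, the following minimal sketch cuts a UTF-8 string in the middle of a character. The helper name is_valid_utf8 is hypothetical (it is not part of the original article), and the sample text is written as explicit bytes so that the result does not depend on how the file itself is saved.

```php
<?php
// Hypothetical helper: preg_match with the /u modifier fails on invalid UTF-8,
// so a return value of 1 means the string is well-formed.
function is_valid_utf8($s)
{
    return preg_match('//u', $s) === 1;
}

// "中文" as explicit UTF-8 bytes: two characters, three bytes each.
$text = "\xE4\xB8\xAD\xE6\x96\x87";

$whole  = substr($text, 0, 3); // exactly the first character
$broken = substr($text, 0, 4); // first character plus one stray byte

var_dump(strlen($text));          // byte length, not character count
var_dump(is_valid_utf8($whole));  // the 3-byte cut is still valid
var_dump(is_valid_utf8($broken)); // the 4-byte cut is not
```

The 4-byte cut ends on the lead byte of the second character, which is what shows up as a garbled character on the page.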
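Extracting from a GB2312 Chinese string
The code is as follows:
<?php
// Extract part of a GB2312 string without splitting a double-byte character.
// $start and $len are byte offsets.
function Mysubstr($str, $start, $len) {
    $tmpstr = "";
    $end = min($start + $len, strlen($str));
    // Scan from the beginning so that $i always lands on a character boundary.
    for ($i = 0; $i < $end; $i++) {
        $pos = $i; // offset at which the current character begins
        if (ord($str[$i]) > 0xa0) {
            // Lead byte of a double-byte character: take both bytes.
            $char = substr($str, $i, 2);
            $i++;
        } else {
            $char = $str[$i];
        }
        // Keep only characters that begin at or after $start.
        if ($pos >= $start) {
            $tmpstr .= $char;
        }
    }
    return $tmpstr;
}
?>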
Extracting from UTF-8 encoded multibyte strings
The code is as follows:
<?php
// Extract part of a UTF-8 string; $from and $len are counted in characters.
function Utf8substr($str, $from, $len) {
    return preg_replace(
        '#^(?:[\x00-\x7f]|[\xc0-\xff][\x80-\xbf]+){0,' . $from . '}' .
        '((?:[\x00-\x7f]|[\xc0-\xff][\x80-\xbf]+){0,' . $len . '}).*#s',
        '$1',
        $str
    );
}
?>
The code is as follows:
/*
 * Function: works like substr, except that it does not produce garbled characters
 * Parameters:
 * Returns:
 */
function Utf8_substr($str, $start, $length = null) {
    $strlen = strlen($str);
    // Turn a negative start into an offset from the beginning of the string.
    if ($start < 0) {
        $start = max($strlen + $start, 0);
    }
    // An omitted length means "to the end"; a negative one stops short of the end.
    if ($length === null) {
        $length = $strlen - $start;
    } elseif ($length < 0) {
        $length = max($strlen - $start + $length, 0);
    }
    // First do an ordinary byte-based cut.
    $res = (string) substr($str, $start, $length);
    if ($res === '') {
        return $res;
    }

    /* Then check up to 6 bytes on either side to see whether a character was cut in half. */
    // Up to 6 bytes that follow the cut.
    $next_segm = (string) substr($str, $start + $length, 6);
    // Up to 6 bytes that precede the cut.
    $prev_start = max($start - 6, 0);
    $prev_segm = (string) substr($str, $prev_start, $start - $prev_start);

    // If the following bytes begin with continuation bytes, the last character
    // was cut short: append them to the result.
    if (preg_match('@^([\x80-\xbf]{0,5})[\xc0-\xfd]?@', $next_segm, $bytes)) {
        if (!empty($bytes[1])) {
            $res .= $bytes[1];
        }
    }
    // If the result begins with a continuation byte (128 to 191), the first
    // character lost its leading bytes: take them from the preceding bytes
    // and put them in front of the result.
    $ord0 = ord($res[0]);
    if (128 <= $ord0 && 191 >= $ord0) {
        if (preg_match('@[\xc0-\xfd][\x80-\xbf]{0,5}$@', $prev_segm, $bytes)) {
            if (!empty($bytes[0])) {
                $res = $bytes[0] . $res;
            }
        }
    }
    return $res;
}
Test data:
$str = 'dfjdjf 13f test 65&2 Data FD J (1 on Mfe& Just ';
var_dump(Utf8_substr($str, +)); echo ' ';
var_dump(Utf8_substr($str, 6)); echo ' ';
var_dump(Utf8_substr($str, 9,)); echo ' ';
var_dump(Utf8_substr($str, +)); echo ' ';
var_dump(Utf8_substr($str, 6)); echo ' ';
Output (the cuts produce no garbled characters; you are welcome to test this and report bugs):
string(12) "according to FDJ"
string(26) "According to FDJ (1 on Mfe& ..."
string "13f test 65&2 number"
string(12) "Data FD"
string "DJ (1 Mfe& ..."
Here is one I have used in the past.
Now let's look at this Chinese truncation function.
The code is as follows:
function Moocutstr($string, $length, $dot = ' ...') {
    global $charset;
    if (strlen($string) <= $length) {
        return $string;
    }
    // Decode HTML entities first so that an entity is never cut in half.
    $string = str_replace(array('&amp;', '&quot;', '&lt;', '&gt;'), array('&', '"', '<', '>'), $string);
    $strcut = '';
    if (strtolower($charset) == 'utf-8') {
        // $n: byte offset, $tn: byte length of the last character, $noc: width counted so far.
        $n = $tn = $noc = 0;
        while ($n < strlen($string)) {
            $t = ord($string[$n]);
            if ($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
                $tn = 1; $n++; $noc++;
            } elseif (194 <= $t && $t <= 223) {
                $tn = 2; $n += 2; $noc += 2;
            } elseif (224 <= $t && $t <= 239) {
                $tn = 3; $n += 3; $noc += 2;
            } elseif (240 <= $t && $t <= 247) {
                $tn = 4; $n += 4; $noc += 2;
            } elseif (248 <= $t && $t <= 251) {
                $tn = 5; $n += 5; $noc += 2;
            } elseif ($t == 252 || $t == 253) {
                $tn = 6; $n += 6; $noc += 2;
            } else {
                $n++;
            }
            if ($noc >= $length) {
                break;
            }
        }
        // If the last character went past the limit, drop it.
        if ($noc > $length) {
            $n -= $tn;
        }
        $strcut = substr($string, 0, $n);
    } else {
        // Double-byte encodings such as GB2312: keep both bytes of a Chinese character together.
        for ($i = 0; $i < $length; $i++) {
            $strcut .= ord($string[$i]) > 127 ? $string[$i] . $string[++$i] : $string[$i];
        }
    }
    // Encode the special characters again before returning.
    $strcut = str_replace(array('&', '"', '<', '>'), array('&amp;', '&quot;', '&lt;', '&gt;'), $strcut);
    return $strcut . $dot;
}