Truncating strings in PHP (mixed Chinese and English strings)

Source: Internet
Author: User
This article covers string truncation in PHP, starting with the built-in substr function and ending with functions that handle mixed Chinese and English strings. Refer to it if you need it.

substr returns part of a string.

Syntax: string substr(string string, int start, int [length]);

Return value: string

Function type: data processing

Description

This function returns length characters of string, beginning at position start (positions are counted from 0). If start is negative, the position is counted back from the end of the string. If length is omitted, the rest of the string is returned. If length is given but negative, that many characters are dropped from the end of the string.

Usage examples

The code is as follows:

<?php
echo substr("abcdef", 1, 3);  // returns "bcd"
echo substr("abcdef", -2);    // returns "ef"
echo substr("abcdef", -3, 1); // returns "d"
echo substr("abcdef", 1, -1); // returns "bcde"
?>

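substr counts bytes, so the examples above work for single-byte English text but not for Chinese: a cut can land in the middle of a multibyte character and produce garbled output.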


Truncating GB2312 Chinese strings

The code is as follows:

<?php
// Truncate a Chinese (GB2312) string; $start and $len are byte offsets
function Mysubstr($str, $start, $len) {
    $tmpstr = "";
    $strlen = $start + $len;
    for ($i = 0; $i < $strlen; $i++) {
        if (ord(substr($str, $i, 1)) > 0xa0) {
            // Lead byte of a two-byte character: keep both bytes together
            if ($i >= $start) {
                $tmpstr .= substr($str, $i, 2);
            }
            $i++;
        } elseif ($i >= $start) {
            // Single-byte character
            $tmpstr .= substr($str, $i, 1);
        }
    }
    return $tmpstr;
}
?>

Truncating UTF-8 encoded multibyte strings

The code is as follows:

<?php
// Truncate a UTF-8 string; $from and $len count characters
function Utf8substr($str, $from, $len)
{
    // Skip up to $from characters, capture up to $len characters, discard the rest
    return preg_replace('#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,' . $from . '}' .
        '((?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,' . $len . '}).*#s',
        '$1', $str);
}
?>

The code is as follows:

/*
 * Function: works like substr, except that it does not produce garbled characters
 * Parameters:
 * Returns:
 */
function Utf8_substr($str, $start, $length = null) {

    $strlen = strlen($str);
    if ($length === null) {
        $length = $strlen; // no length given: cut through to the end of the string
    }

    // First, a normal byte-based cut.
    $res = (string) substr($str, $start, $length);

    /* Then check whether the (up to) 6 bytes on either side of the cut leave a character incomplete. */

    // start is zero or positive
    if ($start >= 0) {
        // Take up to 6 bytes following the cut.
        $next_start = $start + $length; // first byte after the cut
        $next_len = $next_start + 6 <= $strlen ? 6 : $strlen - $next_start;
        $next_segm = (string) substr($str, $next_start, $next_len);

        // Take up to 6 bytes preceding the cut, in case its first byte is not the lead byte of a character.
        $prev_start = $start - 6 > 0 ? $start - 6 : 0;
        $prev_segm = (string) substr($str, $prev_start, $start - $prev_start);
    }
    // start is negative
    else {
        // Take up to 6 bytes following the cut.
        $next_start = $strlen + $start + $length; // first byte after the cut
        $next_len = $next_start + 6 <= $strlen ? 6 : $strlen - $next_start;
        $next_segm = (string) substr($str, $next_start, $next_len);

        // Take up to 6 bytes preceding the cut, in case its first byte is not the lead byte of a character.
        $start = $strlen + $start;
        $prev_start = $start - 6 > 0 ? $start - 6 : 0;
        $prev_segm = (string) substr($str, $prev_start, $start - $prev_start);
    }

    // If the following segment begins with continuation bytes, they belong to the last character: append them.
    if (preg_match('@^([\x80-\xBF]{0,5})[\xC0-\xFD]?@', $next_segm, $bytes)) {
        if (!empty($bytes[1])) {
            $bytes = $bytes[1];
            $res .= $bytes;
        }
    }

    // If the result begins with a continuation byte (128 to 191), its lead byte was cut off.
    $ord0 = $res === '' ? 0 : ord($res[0]);
    if (128 <= $ord0 && 191 >= $ord0) {
        // Take the missing bytes from the preceding segment and prepend them to $res.
        if (preg_match('@[\xC0-\xFD][\x80-\xBF]{0,5}$@', $prev_segm, $bytes)) {
            if (!empty($bytes[0])) {
                $bytes = $bytes[0];
                $res = $bytes . $res;
            }
        }
    }

    return $res;
}

Test data:

The code is as follows:

$str = ' dfjdjf 13f test 65&2 Data FD J (1 on Mfe& Just ';
var_dump(Utf8_substr($str, ...)); echo "\n";
var_dump(Utf8_substr($str, 6)); echo "\n";
var_dump(Utf8_substr($str, 9, ...)); echo "\n";
var_dump(Utf8_substr($str, ...)); echo "\n";
var_dump(Utf8_substr($str, 6)); echo "\n";

Output (truncation without garbled characters; testing and bug reports are welcome):
string(12) "according to FDJ"
string(26) "According to FDJ (1 on Mfe& ..."
string(...) "13f test 65&2 number"
string(12) "Data FD"
string(...) "DJ (1 Mfe& ..."

Finally, here is a Chinese truncation function that I have used in the past.

The code is as follows:

function Moocutstr($string, $length, $dot = ' ...') {
    global $charset;

    if (strlen($string) <= $length) {
        return $string;
    }
    // Decode HTML entities first so that an entity is never cut in half
    $string = str_replace(array('&amp;', '&quot;', '&lt;', '&gt;'), array('&', '"', '<', '>'), $string);
    $strcut = '';
    if (strtolower($charset) == 'utf-8') {
        // $n: byte position, $tn: byte size of the last character, $noc: display width so far
        $n = $tn = $noc = 0;
        while ($n < strlen($string)) {
            $t = ord($string[$n]);
            if ($t == 9 || $t == 10 || (32 <= $t && $t <= 126)) {
                $tn = 1; $n++; $noc++;
            } elseif (194 <= $t && $t <= 223) {
                $tn = 2; $n += 2; $noc += 2;
            } elseif (224 <= $t && $t <= 239) {
                // 3-byte lead bytes (0xE0 to 0xEF)
                $tn = 3; $n += 3; $noc += 2;
            } elseif (240 <= $t && $t <= 247) {
                $tn = 4; $n += 4; $noc += 2;
            } elseif (248 <= $t && $t <= 251) {
                $tn = 5; $n += 5; $noc += 2;
            } elseif ($t == 252 || $t == 253) {
                $tn = 6; $n += 6; $noc += 2;
            } else {
                $n++;
            }
            if ($noc >= $length) {
                break;
            }
        }
        // The last character overshot the limit: step back over it
        if ($noc > $length) {
            $n -= $tn;
        }
        $strcut = substr($string, 0, $n);
    } else {
        // Double-byte encodings: keep both bytes of a character together
        for ($i = 0; $i < $length; $i++) {
            $strcut .= ord($string[$i]) > 127 ? $string[$i] . $string[++$i] : $string[$i];
        }
    }
    // Re-encode the HTML entities
    $strcut = str_replace(array('&', '"', '<', '>'), array('&amp;', '&quot;', '&lt;', '&gt;'), $strcut);

    return $strcut . $dot;
}
