Part 1 of "Learning PHP & MySQL: Character Encoding" introduced the conversion relationship between Unicode and UTF-8 and summarized the UTF-8 encoding rule. Based on that rule we can write a UTF-8 parser. The PHP implementation is given below.
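As a reminder of the rule itself, here is a minimal sketch of the Unicode-to-UTF-8 direction. The function name `codepoint_to_utf8` is hypothetical and not part of the original article, and the sketch assumes the modern 1-to-4-byte form of UTF-8 (RFC 3629) rather than the older 6-byte scheme that the parser below also accepts.

```php
<?php
// Hypothetical helper (an assumption, not from the original article):
// encode one Unicode code point as UTF-8 using the 1-to-4-byte rule.
function codepoint_to_utf8(int $cp): string
{
    // Reject values outside the Unicode range and UTF-16 surrogates.
    if ($cp < 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        throw new InvalidArgumentException('Not a valid Unicode scalar value');
    }
    if ($cp <= 0x7F) {
        // 0xxxxxxx
        return chr($cp);
    }
    if ($cp <= 0x7FF) {
        // 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
            . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp <= 0xFFFF) {
        // 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
            . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }
    // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($cp >> 18))
        . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F))
        . chr(0x80 | ($cp & 0x3F));
}

// U+4E2D is a three-byte character in UTF-8.
echo bin2hex(codepoint_to_utf8(0x4E2D)), "\n"; // prints "e4b8ad"
```

The number of leading 1 bits in the first byte equals the length of the sequence, which is exactly what the parser tests when it shifts the first byte to the right.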
<?php
/*
 Purpose: $str is a UTF-8 encoded string that may mix English and Chinese.
 The program decodes it according to the UTF-8 encoding rule and displays it.
*/
$str = 'Today is very happy, all decided to go to KFC to eat cola chicken wings!!!';
/*
 $str is the string to take characters from
 $len is the number of characters to take
*/
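function utf8sub($str, $len) {
    if ($len <= 0) {
        return '';
    }
    $offset = 0; // byte offset of the next character in $str
    $chars = 0;  // number of characters taken so far
    $res = '';   // result string
    while ($chars < $len) {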
        // Stop once every byte of the string has been consumed.
        if ($offset >= strlen($str)) {
            break;
        }
        // Read the first byte of the next character as an integer (0-255)
        // so that its leading bits can be examined.
        $high = ord($str[$offset]);
        // echo '$high = ' . $high . '<br/>';
        // The number of leading 1 bits in the first byte gives the sequence length.
        // The 5- and 6-byte forms come from the original UTF-8 design;
        // RFC 3629 limits valid UTF-8 to 4 bytes.
        if (($high >> 2) == 0x3F) { // top six bits are 111111: 6-byte sequence
            $count = 6;
        } else if (($high >> 3) == 0x1F) { // top five bits are 11111: 5-byte sequence
            $count = 5;
        } else if (($high >> 4) == 0xF) { // top four bits are 1111: 4-byte sequence
            $count = 4;
        } else if (($high >> 5) == 0x7) { // top three bits are 111: 3-byte sequence
            $count = 3;
        } else if (($high >> 6) == 0x3) { // top two bits are 11: 2-byte sequence
            $count = 2;
        } else if (($high >> 7) == 0x0) { // top bit is 0: single-byte (ASCII) character
            $count = 1;
        } else { // stray continuation byte (10xxxxxx): not a valid first byte, take it alone
            $count = 1;
        }
        // echo '$count = ' . $count . '<br/>';
        $res .= substr($str, $offset, $count); // append this character to $res
        $chars += 1; // one more character taken
        $offset += $count; // advance the byte offset past this character
    }
    return $res;
}
echo utf8sub($str, 100);