PHP correct parsing UTF-8 string skills application _ php Basics-php Manual

Source: Internet
Author: User
Tags php basics
According to this coding rule, write a UTF-8 code parsing program. below is the PHP implementation, A friend can refer to the "learning PHP & MYSQL -- character encoding (a)" introduced the conversion relationship between Unicode and UTF-8, summed up the coding rules of a UTF-8, according to this encoding rules, write a UTF-8 encoding parsing program, the following is the implementation of PHP:

The code is as follows:


/*
Program function, $ str is a UTF-8 encoded string mixed with Chinese and English,
Decodes and displays this string correctly according to the encoding rules of the UTF-8.
*/


$ Str = 'Today is very Happy. all decided to go to KFC to eat cool chicken wings !!! ';

/*
$ Str is the string to be truncated.
$ Len is the number of characters intercepted
*/
Function utf8sub ($ str, $ len ){
If ($ len <= 0 ){
Return '';
}

$ Offset = 0; // The offset when a high byte is intercepted.
$ Chars = 0; // Number of characters intercepted
$ Res = ''; // stores the intercepted result string

While ($ chars <$ len ){
// First take the first byte of the string
// Convert it to decimal
// Convert to binary
$ High = ord (substr ($ str, $ offset, 1 ));

// Echo '$ high ='. $ high .'
';

If ($ high = null) {// if the high position is null, it indicates that the result has been obtained to the end.
Break;
}
If ($ high> 2) ===0x3f) {// shifts the high position to the right two places. if it is the same as binary 111111, it takes 6 bytes.
// Intercept 2 bytes
$ Count = 6;
} Else if ($ high> 3) ===0x1f) {// shifts the high value to the right two places. if the value is the same as binary 11111, the value is 5 bytes.
// Truncate 3 bytes
$ Count = 5;
} Else if ($ high> 4) ===0xf) {// shifts the high value to the right two places, and compares it with binary 1111. if it is the same, it takes 4 bytes.

// Intercept 4 bytes
$ Count = 4;
} Else if ($ high> 5) === 0x7) {// shifts the high position to the right two places, which is compared with binary 111. if it is the same, it takes 3 bytes.

// Truncate 5 bytes
$ Count = 3;
} Else if ($ high> 6) === 0x3) {// shifts the high value to the right two places, which is equal to binary 11 and takes 2 bytes.
// Capture 6 bytes
$ Count = 2;
} Else if ($ high> 7) = 0x0) {// shifts the high value to the right by two places. if it is the same as binary 0, 1 byte is used.
$ Count = 1;
}
// Echo '$ count ='. $ count .'
';

$ Res. = substr ($ str, $ offset, $ count); // retrieves a character and connects it to the $ res string.
$ Chars + = 1; // Number of characters intercepted + 1
$ Offset + = $ count; // truncate the high offset and move it back to $ count bytes.
}
Return $ res;
}

Echo UTF-8 sub ($ str, 100 );

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.