PHP detection string is a common method of UTF8 encoding, UTF8 encoding
This paper summarizes whether PHP detection string is a common method of UTF8 encoding. Share to everyone for your reference. The implementation method is as follows:
There are many ways to detect string encoding, such as the use of Ord to get the character into the judgment, or use the mb_detect_encoding function to deal with, the following four kinds of common methods for everyone to refer to.
Example 1
Copy CodeThe code is as follows:/**
* Detects if the string is UTF8 encoded
* @param string $str the detected strings
* @return Boolean
*/
function Is_utf8 ($STR) {
$len = strlen ($STR);
for ($i = 0; $i < $len; $i + +) {
$c = Ord ($str [$i]);
if ($c > 128) {
if (($c > 247)) return false;
ElseIf ($c > 239) $bytes = 4;
ElseIf ($c > 223) $bytes = 3;
ElseIf ($c > 191) $bytes = 2;
else return false;
if (($i + $bytes) > $len) return false;
while ($bytes > 1) {
$i + +;
$b = Ord ($str [$i]);
if ($b < | | $b > 191) return false;
$bytes--;
}
}
}
return true;
}
Example 2
Copy CodeThe code is as follows: function Is_utf8 ($string) {
Return Preg_match ('%^ (?:
[\x09\x0a\x0d\x20-\x7e] # ASCII
| [\XC2-\XDF] [\X80-\XBF] # Non-overlong 2-byte
| \XE0[\XA0-\XBF][\X80-\XBF] # excluding overlongs
| [\xe1-\xec\xee\xef] [\X80-\XBF] {2} # straight 3-byte
| \XED[\X80-\X9F][\X80-\XBF] # excluding surrogates
| \XF0[\X90-\XBF][\X80-\XBF]{2} # Planes 1-3
| [\xf1-\xf3] [\X80-\XBF] {3} # planes 4-15
| \XF4[\X80-\X8F][\X80-\XBF]{2} # Plane 16
) *$%xs ', $string);
}
Accuracy is basically the same as mb_detect_encoding (), to be right together, to be wrong together.
Code detection is not possible 100% accurate, this thing has been able to basically meet the requirements.
Example 3
Copy CodeThe code is as follows: function Mb_is_utf8 ($string)
{
Return mb_detect_encoding ($string, ' UTF-8 ') = = = ' UTF-8 ';//New discoveries
}
Example 4
Copy the code as follows://Returns True if $string is valid UTF-8 and False otherwise.
function Is_utf8 ($word)
{
if (Preg_match ("/^ ([". chr (228). " -". Chr (233)." {1} [". chr (128)." -". Chr (191)." {1} [". chr (128)." -". Chr (191)." {1}) {1}/", $word) = = True | | Preg_match ("/[". chr (228). " -". Chr (233)." {1} [". chr (128)." -". Chr (191)." {1} [". chr (128)." -". Chr (191)." {1}) {1}$/", $word) = = True | | Preg_match ("/[". chr (228). " -". Chr (233)." {1} [". chr (128)." -". Chr (191)." {1} [". chr (128)." -". Chr (191)." {1}) {2,}/", $word) = = True)
{
return true;
}
Else
{
return false;
}
}//Function Is_utf8
I hope this article is helpful to everyone's PHP programming.
http://www.bkjia.com/PHPjc/915431.html www.bkjia.com true http://www.bkjia.com/PHPjc/915431.html techarticle PHP detection string is a common method of UTF8 encoding, UTF8 coding This article summarizes the PHP detection string is UTF8 encoding is a common method. Share to everyone for your reference. Concrete Reality ...