In PHP, character encoding conversion is usually done with iconv or mb_convert_encoding, although mb_convert_encoding is considerably slower than iconv.
string iconv(string in_charset, string out_charset, string str)
Note: besides naming the target encoding, the second parameter accepts two suffixes, //TRANSLIT and //IGNORE. With //TRANSLIT, characters that cannot be converted directly are replaced by one or more approximate characters; with //IGNORE, characters that cannot be converted are skipped. With neither suffix, the string is truncated at the first illegal character (recent PHP versions instead raise an E_NOTICE and return FALSE).
Returns the converted string or FALSE on failure.
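As a rough illustration of the two suffixes, the sketch below converts the same UTF-8 string to ASCII three ways. The helper name try_iconv and the sample string are assumptions made for this example, not part of the original text. The exact output depends on the iconv implementation (glibc or libiconv) and on the current locale, so no output is shown.

```php
<?php
// Hypothetical helper: run iconv() and report failure instead of emitting a notice.
function try_iconv(string $from, string $to, string $str): string
{
    // @ suppresses the E_NOTICE that iconv() raises on an illegal character.
    $out = @iconv($from, $to, $str);
    return $out === false ? '(conversion failed)' : $out;
}

// "café 中文" in UTF-8, written with escapes so the file encoding does not matter.
$utf8 = "caf\u{00E9} \u{4E2D}\u{6587}";

foreach (['ASCII', 'ASCII//TRANSLIT', 'ASCII//IGNORE'] as $target) {
    printf("%-16s => %s\n", $target, try_iconv('UTF-8', $target, $utf8));
}
```

Comparing the three lines on your own system is the quickest way to see how your iconv build treats characters that have no ASCII equivalent.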
string mb_convert_encoding(string str, string to_encoding [, mixed from_encoding])
The mbstring extension must be enabled: in php.ini, remove the leading ";" from the line ;extension=php_mbstring.dll.
mb_convert_encoding accepts several candidate input encodings and picks one automatically based on the content, but it is much less efficient than iconv.
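The performance claim is easy to check on your own setup. The following is only a rough micro-benchmark sketch: the helper name time_conversion, the sample data and the round count are assumptions for illustration, and the numbers will vary with the PHP version, the operating system and the iconv library in use.

```php
<?php
// Hypothetical helper: time how long $rounds calls of $convert take, in seconds.
function time_conversion(callable $convert, string $input, int $rounds): float
{
    $start = microtime(true);
    for ($i = 0; $i < $rounds; $i++) {
        $convert($input);
    }
    return microtime(true) - $start;
}

// "中文" encoded as GBK, repeated to get a string of a few kilobytes.
$gbk = str_repeat("\xD6\xD0\xCE\xC4", 1000);

$iconvTime = time_conversion(fn($s) => iconv('GBK', 'UTF-8', $s), $gbk, 1000);
$mbTime    = time_conversion(fn($s) => mb_convert_encoding($s, 'UTF-8', 'GBK'), $gbk, 1000);

printf("iconv:               %.4f s\n", $iconvTime);
printf("mb_convert_encoding: %.4f s\n", $mbTime);
```

Treat the result as an indication only; a single run on one machine does not settle which function is faster in general.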
Use:
iconv fails when converting a dash character (shown here as "-") to gb2312: without the //IGNORE suffix, everything after that character is lost. Either way, this "-" cannot be converted and will not appear in the output. mb_convert_encoding does not have this bug.
In general, use iconv, and fall back to mb_convert_encoding only when the source encoding cannot be determined or when the text does not display correctly after an iconv conversion.
/**
 * Convert a GBK or GB2312 encoded string to UTF-8, detecting the source encoding automatically.
 * If the input is already UTF-8 it is returned unchanged; otherwise it is converted to UTF-8.
 * Supported encodings: UTF-8, GBK, GB2312
 * @param string $str the string to convert
 */
function yang_gbk2utf8($str) {
    $charset = mb_detect_encoding($str, array('UTF-8', 'GBK', 'GB2312'));
    $charset = strtolower($charset);
    // mbstring reports GBK under the name CP936
    if ('cp936' == $charset) {
        $charset = 'GBK';
    }
    if ('utf-8' != $charset) {
        $str = iconv($charset, 'UTF-8//IGNORE', $str);
    }
    return $str;
}
Now let's look at some problems that come up when converting character encodings.
Using the mb_detect_encoding($str) function requires the PHP mbstring extension (extension=php_mbstring.dll) to be enabled.
<?php
// In the original article this string presumably contained Chinese (GB2312) text.
$str = "Test ing";
$cha = mb_detect_encoding($str);
$s = iconv($cha, "UTF-8", $str);
var_dump($s);
?>
The result is:
string(0) ""
It is not obvious why this happens.
<?php
$str = "Test ing";
$cha = mb_detect_encoding($str);
$s = iconv("GB2312", "UTF-8", $str);
var_dump($s);
?>
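This returns the correct result, so the encoding reported by mb_detect_encoding($str) was not accurate. I do not know the exact reason; a likely cause is that, when called without a list of candidate encodings, mb_detect_encoding only checks its default detection order (typically ASCII and UTF-8), which does not include GB2312.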
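One thing worth trying when detection looks wrong is to pass the candidate encodings explicitly and turn on strict mode. The sketch below is an illustration under assumptions: the sample bytes and the candidate list are chosen for this example, and detection remains a guess for short or ambiguous input.

```php
<?php
// "中文" encoded as GBK/GB2312 (two double-byte characters).
$gbk = "\xD6\xD0\xCE\xC4";

// Pass the candidates explicitly and enable strict mode (third argument),
// so an encoding is rejected when the bytes are not valid in it.
$detected = mb_detect_encoding($gbk, ['ASCII', 'UTF-8', 'GBK'], true);

// mbstring may report GBK under its canonical name "CP936".
var_dump($detected);
```

The order of the list matters: put the stricter encodings (ASCII, UTF-8) before legacy double-byte encodings, which accept almost any byte pair.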
The function string mb_convert_encoding(string $str, string $to_encoding [, mixed $from_encoding])
converts a string to the specified encoding. Here is an example I wrote:
<?php
// In the original article this string presumably contained Chinese (GB2312) text.
$a = "I'm fine";
echo mb_convert_encoding($a, 'UTF-8');
?>
But the result is:
?? Lu Lu?
The question now is this: when converting strings in different encodings to UTF-8, I can use iconv if I know the source encoding in advance, but what if I do not know it?
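One possible answer, sketched here under assumptions, is to validate the string against a short list of candidates with mb_check_encoding instead of relying on detection. The helper name to_utf8_checked and the default candidate list are hypothetical; the approach only works when the input is known to be UTF-8 or one of the listed encodings.

```php
<?php
// Hypothetical helper: convert to UTF-8 by validating candidates in order.
// Assumes the input is either UTF-8 already or one of the listed legacy encodings.
function to_utf8_checked(string $str, array $candidates = ['GBK', 'BIG5']): string
{
    if (mb_check_encoding($str, 'UTF-8')) {
        return $str; // already valid UTF-8 (pure ASCII also lands here)
    }
    foreach ($candidates as $encoding) {
        if (mb_check_encoding($str, $encoding)) {
            return mb_convert_encoding($str, 'UTF-8', $encoding);
        }
    }
    return $str; // nothing matched: return the input unchanged
}

$fromGbk = to_utf8_checked("\xD6\xD0\xCE\xC4"); // GBK bytes for "中文"
var_dump($fromGbk === "\u{4E2D}\u{6587}");
```

Many byte sequences are valid in more than one legacy encoding, so the first candidate that validates is not necessarily the right one; order the list by how likely each encoding is for your data.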
Problem 3: an iconv problem. If the first byte of the string being converted is above a certain value, iconv returns an empty string.
For example:
<?php
$str = chr(254) . "Test ing" . chr(254);
$s = iconv("GB2312", "UTF-8", $str);
var_dump($s);
?>
This returns:
string(0) ""
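A defensive pattern for this case, given as a sketch rather than a definitive fix, is to check the result of iconv and fall back to mb_convert_encoding when the conversion fails. The wrapper name convert_or_fallback is hypothetical, and the exact characters produced by the fallback depend on how mbstring handles the invalid bytes.

```php
<?php
// Hypothetical wrapper: try iconv() first, fall back to mb_convert_encoding().
function convert_or_fallback(string $str, string $from, string $to): string
{
    // @ hides the E_NOTICE that iconv() raises on an illegal byte sequence.
    $out = @iconv($from, $to, $str);

    // Treat both false and an unexpectedly empty result as a failed conversion.
    if ($out === false || ($out === '' && $str !== '')) {
        // mb_convert_encoding() replaces invalid bytes with a substitute
        // character ("?" by default) instead of giving up.
        $out = mb_convert_encoding($str, $to, $from);
    }

    // If the fallback also fails, return the input unchanged.
    return $out === false ? $str : $out;
}

$str = chr(254) . "Test ing" . chr(254);
var_dump(convert_or_fallback($str, 'GB2312', 'UTF-8'));
```

The empty-string check covers older PHP versions, where iconv cut the string at the first illegal byte instead of returning false.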
For the usage of mb_convert_encoding, see the official documentation:
http://cn.php.net/manual/en/function.mb-convert-encoding.php
iconv, another PHP function, also converts string encodings and works much like the function above.
Here is a summary of the two functions, with a few more examples:
iconv - Convert string to requested character encoding
(PHP 4 >= 4.0.5, PHP 5)
mb_convert_encoding - Convert character encoding
(PHP 4 >= 4.0.6, PHP 5)
Usage:
string mb_convert_encoding(string str, string to_encoding [, mixed from_encoding])
from_encoding names the character encoding of the string before conversion. It can be an array or a comma-separated string listing candidate encodings. If it is not specified, the internal encoding is used.
/* Auto detect encoding from JIS, eucjp-win, sjis-win, then convert str to UCS-2LE */
$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");
/* "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
$str = mb_convert_encoding($str, "EUC-JP", "auto");
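To make the different forms of from_encoding concrete, here is a small sketch using GBK input. The sample bytes and variable names are assumptions for illustration; the point is only that a single encoding, an array and a comma-separated string are all accepted.

```php
<?php
$gbk = "\xD6\xD0\xCE\xC4"; // "中文" encoded as GBK

// Source encoding given explicitly.
$explicit = mb_convert_encoding($gbk, 'UTF-8', 'GBK');

// Source encoding chosen by mbstring from a list of candidates;
// the list may be an array or a comma-separated string.
$fromArray  = mb_convert_encoding($gbk, 'UTF-8', ['UTF-8', 'GBK']);
$fromString = mb_convert_encoding($gbk, 'UTF-8', 'UTF-8, GBK');

var_dump($explicit === "\u{4E2D}\u{6587}");
var_dump($fromArray === $fromString);
```

When the source encoding is known, naming it explicitly avoids the guessing step altogether.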
Example:
<?php
$content = iconv("GBK", "UTF-8", $content);
$content = mb_convert_encoding($content, "UTF-8", "GBK");
?>
The following function converts a string, or an array of strings, to the target encoding based on the detected input encoding:
<?php
function phpcharset($data, $to) {
    if (is_array($data)) {
        // Convert each element recursively
        foreach ($data as $key => $val) {
            $data[$key] = phpcharset($val, $to);
        }
    } else {
        $encode_array = array('ASCII', 'UTF-8', 'GBK', 'GB2312', 'BIG5');
        $encoded = mb_detect_encoding($data, $encode_array);
        $to = strtoupper($to);
        // Skip the conversion if detection failed or the encoding already matches
        if ($encoded !== false && $encoded != $to) {
            $data = mb_convert_encoding($data, $to, $encoded);
        }
    }
    return $data;
}
?>
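When the source encoding is already known, a variant built on array_walk_recursive avoids the detection step. This is a different technique from the function above, shown only as a sketch; the name convert_array_encoding and the sample data are hypothetical.

```php
<?php
// Hypothetical variant: convert every string in a nested array, with the
// source encoding supplied by the caller instead of being detected.
function convert_array_encoding(array $data, string $to, string $from): array
{
    array_walk_recursive($data, function (&$value) use ($to, $from) {
        if (is_string($value)) {
            $value = mb_convert_encoding($value, $to, $from);
        }
    });
    return $data;
}

// GBK bytes for "中" and "文", plus an integer that must be left untouched.
$row = ['title' => "\xD6\xD0", 'tags' => ["\xCE\xC4", 42]];
$utf8Row = convert_array_encoding($row, 'UTF-8', 'GBK');
var_dump($utf8Row);
```

Non-string values are passed through unchanged, and the array is passed by value, so the caller's original data is not modified.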