PHP string encoding detection function mb_detect_encoding: a summary
iconv: convert a string to the requested character encoding (PHP 4 >= 4.0.5, PHP 5)
mb_convert_encoding: convert character encoding (PHP 4 >= 4.0.6, PHP 5)
The two functions are similar: both convert a string from one character encoding to another.
Usage:
string mb_convert_encoding ( string $str , string $to_encoding [, mixed $from_encoding ] )
Note: The mbstring extension must be enabled first. In php.ini, remove the semicolon (;) in front of extension=php_mbstring.dll.
Parameters: str is the string to convert; to_encoding is the encoding that str is converted to; from_encoding names the character encoding of str before conversion. from_encoding can be an array or a comma-separated list. If from_encoding is not provided, the internal encoding is used. See the supported encodings listed next.
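As a minimal sketch of the three parameters, the snippet below converts a UTF-8 string to ISO-8859-1 and back, passing from_encoding once as a single name and once as an array. The sample string and variable names are illustrative assumptions, not taken from the text above, and the mbstring extension is assumed to be loaded.

```php
<?php
// Assumes the mbstring extension is loaded.
// "caf\xC3\xA9" is "café" written as explicit UTF-8 bytes (illustrative sample).
$utf8 = "caf\xC3\xA9";

// from_encoding as a single name: UTF-8 -> ISO-8859-1.
$latin1 = mb_convert_encoding($utf8, "ISO-8859-1", "UTF-8");

// from_encoding as an array holding one candidate: ISO-8859-1 -> UTF-8.
$back = mb_convert_encoding($latin1, "UTF-8", ["ISO-8859-1"]);

echo bin2hex($latin1), PHP_EOL; // 636166e9
echo bin2hex($back), PHP_EOL;   // 636166c3a9
```

Printing the bytes with bin2hex avoids depending on the terminal's own encoding when checking the result.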
Supported character encodings
The current mbstring module supports the following character encodings. Any of them can be passed as the encoding parameter of an mbstring function.
This PHP extension supports the following character encodings:
UCS-4*
UCS-4BE
UCS-4LE*
UCS-2
UCS-2BE
UCS-2LE
UTF-32*
UTF-32BE*
UTF-32LE*
UTF-16*
UTF-16BE*
UTF-16LE*
UTF-7
UTF7-IMAP
UTF-8*
ASCII*
EUC-JP*
SJIS*
eucJP-win*
SJIS-win*
ISO-2022-JP
ISO-2022-JP-MS
CP932
CP51932
SJIS-mac** (alias: MacJapanese)
SJIS-Mobile#DOCOMO** (alias: SJIS-DOCOMO)
SJIS-Mobile#KDDI** (alias: SJIS-KDDI)
SJIS-Mobile#SOFTBANK** (alias: SJIS-SOFTBANK)
UTF-8-Mobile#DOCOMO** (alias: UTF-8-DOCOMO)
UTF-8-Mobile#KDDI-A**
UTF-8-Mobile#KDDI-B** (alias: UTF-8-KDDI)
UTF-8-Mobile#SOFTBANK** (alias: UTF-8-SOFTBANK)
ISO-2022-JP-MOBILE#KDDI** (alias: ISO-2022-JP-KDDI)
JIS
JIS-ms
CP50220
CP50220raw
CP50221
CP50222
ISO-8859-1*
ISO-8859-2*
ISO-8859-3*
ISO-8859-4*
ISO-8859-5*
ISO-8859-6*
ISO-8859-7*
ISO-8859-8*
ISO-8859-9*
ISO-8859-10*
ISO-8859-13*
ISO-8859-14*
ISO-8859-15*
byte2be
byte2le
byte4be
byte4le
BASE64
HTML-ENTITIES
7bit
8bit
EUC-CN*
CP936
GB18030**
HZ
EUC-TW*
CP950
BIG-5*
EUC-KR*
UHC (CP949)
ISO-2022-KR
Windows-1251 (CP1251)
Windows-1252 (CP1252)
CP866 (IBM866)
KOI8-R*
* indicates that the encoding can also be used in regular expressions.
** indicates that the encoding is available from PHP 5.4.0.
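Which encodings are actually available depends on the PHP build, so it can be useful to check at runtime. The sketch below does this with mb_list_encodings(); the helper name is_supported_encoding is hypothetical and introduced here only for illustration. It compares names case-insensitively and does not look at aliases.

```php
<?php
// Hypothetical helper (not part of mbstring): reports whether the running
// PHP build lists $name among its mbstring encodings. Aliases are not checked.
function is_supported_encoding(string $name): bool
{
    foreach (mb_list_encodings() as $encoding) {
        if (strcasecmp($encoding, $name) === 0) {
            return true;
        }
    }
    return false;
}

var_dump(is_supported_encoding("utf-8"));        // bool(true)
var_dump(is_supported_encoding("NO-SUCH-CODE")); // bool(false)
```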
Any php.ini entry that accepts an encoding name can also take the values "auto" and "pass". mbstring functions that accept an encoding name can also take the value "auto".
If "pass" is set, no character encoding conversion is performed.
If "auto" is set, it is expanded to the list of character encodings defined for the NLS. For example, if the NLS is set to Japanese, the value is treated as "ASCII,JIS,UTF-8,EUC-JP,SJIS".
NLS: National Language Support
string iconv ( string $in_charset , string $out_charset , string $str )
Note:
Besides naming the target encoding, the second parameter also accepts two suffixes, //TRANSLIT and //IGNORE:
//TRANSLIT transliterates a character that cannot be represented in the target encoding into one or more similar-looking characters.
//IGNORE silently discards characters that cannot be represented in the target encoding. Without either suffix, the string is cut off at the first illegal character.
Returns the converted string, or FALSE on failure.
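Because iconv() returns FALSE on failure, it is worth checking the result before using it. The following is a minimal sketch of that check; the helper name to_latin1 and the sample byte strings are assumptions made for illustration. How //TRANSLIT and //IGNORE behave depends on the underlying iconv library, so the sketch only dumps that result instead of stating it.

```php
<?php
// Hypothetical helper: converts UTF-8 to ISO-8859-1 and returns null
// when iconv() reports a failure by returning false.
function to_latin1(string $utf8, string $suffix = ""): ?string
{
    // @ suppresses the notice iconv() raises for unconvertible input.
    $result = @iconv("UTF-8", "ISO-8859-1" . $suffix, $utf8);
    return $result === false ? null : $result;
}

// "\xC3\xA9" is "é" in UTF-8; ISO-8859-1 stores it as the single byte 0xE9.
echo bin2hex(to_latin1("\xC3\xA9")), PHP_EOL; // e9

// The euro sign ("\xE2\x82\xAC") has no ISO-8859-1 equivalent. What comes
// back with //TRANSLIT depends on the iconv implementation in use.
var_dump(to_latin1("\xE2\x82\xAC", "//TRANSLIT"));
```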
Usage notes:
1. iconv was found to fail when converting the character "-" to GB2312: without the //IGNORE suffix, everything after that character is lost. Either way, the "-" itself cannot be converted and is not output. mb_convert_encoding does not have this bug.
2. mb_convert_encoding accepts several candidate input encodings and picks one automatically based on the content, but it runs much slower than iconv. For example:
$str = mb_convert_encoding($str, "EUC-JP", "ASCII,JIS,EUC-JP,SJIS,UTF-8");
The order of the names in "ASCII,JIS,EUC-JP,SJIS,UTF-8" matters: a different order can produce a different result.
3. In general, use iconv. Fall back to mb_convert_encoding only when the original encoding cannot be determined, or when the iconv result does not display properly.
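Point 3 can be sketched as a small wrapper that tries iconv first and falls back to mb_convert_encoding only when iconv fails. The function name convert_encoding_with_fallback is hypothetical, and treating a FALSE return as the only failure signal is a simplifying assumption: it does not catch output that converts without error but still displays incorrectly.

```php
<?php
// Hypothetical wrapper illustrating "iconv first, mbstring as fallback".
// Assumes both the iconv and mbstring extensions are loaded.
function convert_encoding_with_fallback(string $str, string $to, string $from): string
{
    // @ suppresses the notice raised on illegal input; failure shows up as false.
    $converted = @iconv($from, $to, $str);
    if ($converted !== false) {
        return $converted;
    }
    // Fallback path, reached only when iconv() could not convert the input.
    return mb_convert_encoding($str, $to, $from);
}

// "\xD6\xD0\xCE\xC4" is "中文" in GBK; its UTF-8 form is e4b8ade69687.
echo bin2hex(convert_encoding_with_fallback("\xD6\xD0\xCE\xC4", "UTF-8", "GBK")), PHP_EOL;
```

The source encoding is still passed in explicitly here; guessing it from a candidate list, as in point 2, would be a separate step.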
from_encoding names the character encoding before conversion. It can be an array or a comma-separated string. If it is not specified, the internal encoding is used.
$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");
$str = mb_convert_encoding($str, "EUC-JP", "auto");
Example:
$content = iconv("GBK", "UTF-8", $content);
$content = mb_convert_encoding($content, "UTF-8", "GBK");
/* Convert the internal encoding to SJIS */
$str = mb_convert_encoding($str, "SJIS");
/* Convert EUC-JP to UTF-7 */
$str = mb_convert_encoding($str, "UTF-7", "EUC-JP");
/* Auto-detect the encoding from JIS, eucjp-win, sjis-win, then convert $str to UCS-2LE */
$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");
/* "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
$str = mb_convert_encoding($str, "EUC-JP", "auto");
$text = "This is the Euro symbol '€'.";
echo 'Original : ', $text, PHP_EOL;
echo 'TRANSLIT : ', iconv("UTF-8", "ISO-8859-1//TRANSLIT", $text), PHP_EOL;
echo 'IGNORE   : ', iconv("UTF-8", "ISO-8859-1//IGNORE", $text), PHP_EOL;
echo 'Plain    : ', iconv("UTF-8", "ISO-8859-1", $text), PHP_EOL;
Output:
Original : This is the Euro symbol '€'.
TRANSLIT : This is the Euro symbol 'EUR'.
IGNORE   : This is the Euro symbol ''.
Plain    :
Notice: iconv(): Detected an illegal character in input string in .\iconv-example.php on line 7
This is the Euro symbol '