Prototype: string iconv(string $in_charset, string $out_charset, string $str)
In particular, the second parameter description:
The output charset.
When iconv() is asked to convert a character that the output encoding does not support, for example iconv('utf-8', 'gb2312', 'www.111cn.net'), the following notice is displayed:
Notice: iconv() [function.iconv]: Detected an illegal character in input string...
Because gb2312 covers only simplified Chinese, it does not support rarer or more complex Chinese characters, nor some special characters, so the conversion reports an error. There are two solutions:
1. Widen the range of the output encoding, for example iconv('utf-8', 'gbk', 'www.111cn.net'). The output is then correct because gbk supports a wider range of characters.
2. Append "//IGNORE" to the output encoding string, for example iconv('utf-8', 'gb2312//IGNORE', 'www.111cn.net'). Characters that cannot be converted are then skipped, which avoids the error, but those characters are not output correctly (they are simply missing from the result).
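The two solutions above can be compared side by side with a small sketch. The helper name tryConvert() and the sample character U+9F8D (a traditional Chinese character that is in gbk but not in gb2312) are chosen here for illustration and are not from the original article. The exact behaviour of the failing cases depends on the iconv implementation PHP was built against, so no output is promised for them.

```php
<?php
// Hypothetical helper: returns the converted string, or null when iconv() fails.
function tryConvert(string $outCharset, string $text): ?string
{
    // @ hides the "Detected an illegal character" notice for this demo only.
    $out = @iconv('UTF-8', $outCharset, $text);
    return $out === false ? null : $out;
}

// Assumed sample input: U+9F8D exists in GBK but not in GB2312.
$text = "A\u{9F8D}B";

// Plain gb2312: expected to fail on common iconv builds.
var_dump(tryConvert('GB2312', $text));

// Solution 1: a wider output charset. GBK encodes U+9F8D in two bytes.
var_dump(strlen((string) tryConvert('GBK', $text)));

// Solution 2: //IGNORE drops what cannot be converted.
// Some glibc-based builds are known to still report a failure here.
var_dump(tryConvert('GB2312//IGNORE', $text));
```

Returning null instead of false keeps the failure case distinct from an empty string, which is easy to confuse in a loose comparison.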
Next, let's look at how to handle the "iconv(): Detected an illegal character in input string" notice in practice.
$str = iconv('utf-8', 'gbk//IGNORE', unescape(isset($_GET['str']) ? $_GET['str'] : ''));
In local testing, //IGNORE skipped the characters it could not convert and converted the rest without an error, whereas //TRANSLIT cut off the unrecognized character together with everything after it and reported an error. //IGNORE is what I need.
Now I will wait until it goes live to see the results (not a good practice; I should keep studying the manual and searching online), haha...
I then found the following article online. It shows that mb_convert_encoding also works, although it is less efficient than iconv.
Differences between iconv and mb_convert_encoding
iconv: Convert string to requested character encoding (PHP 4 >= 4.0.5, PHP 5)
mb_convert_encoding: Convert character encoding (PHP 4 >= 4.0.6, PHP 5)
Usage:
string mb_convert_encoding(string str, string to_encoding [, mixed from_encoding])
The mbstring extension must be enabled first. In php.ini, remove the semicolon (;) in front of extension=php_mbstring.dll.
string iconv(string in_charset, string out_charset, string str)
Note:
Besides naming the target encoding, the second parameter can also carry one of two suffixes: //TRANSLIT and //IGNORE,
Where:
//TRANSLIT automatically replaces a character that cannot be converted directly with one or more similar-looking characters,
//IGNORE discards the characters that cannot be converted. Without a suffix, the default behavior is to cut the string at the first illegal character.
Returns the converted string or false on failure.
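As a minimal sketch of the suffixes and the false return value described above, the following loop runs the same input through iconv() with no suffix, with //TRANSLIT, and with //IGNORE. The wrapper convertWithSuffix() and the sample text are assumptions made for illustration. The euro sign (U+20AC) is used because it does not exist in ISO-8859-1. What each variant prints depends on the platform's iconv library and locale, so the sketch only shows how to observe the difference.

```php
<?php
// Hypothetical wrapper: converts UTF-8 text and turns a false result into an exception.
function convertWithSuffix(string $text, string $charset, string $suffix = ''): string
{
    // @ hides the notice; the return value is still checked explicitly.
    $out = @iconv('UTF-8', $charset . $suffix, $text);
    if ($out === false) {
        throw new RuntimeException("iconv failed for {$charset}{$suffix}");
    }
    return $out;
}

// Assumed sample input: U+20AC (euro sign) has no ISO-8859-1 equivalent.
$text = "Price: 5\u{20AC}";

foreach (['', '//TRANSLIT', '//IGNORE'] as $suffix) {
    $label = $suffix === '' ? '(none)' : $suffix;
    try {
        echo $label, ' => ', convertWithSuffix($text, 'ISO-8859-1', $suffix), "\n";
    } catch (RuntimeException $e) {
        echo $label, ' => ', $e->getMessage(), "\n";
    }
}
```

Comparing with === false matters here, because a successful conversion of an empty string returns '' and a loose check would treat that as a failure.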
Usage:
1. iconv was found to fail when converting the character "-" to gb2312. Without the //IGNORE suffix, everything after this character is lost. Either way, the "-" itself can be neither converted nor output. mb_convert_encoding does not have this bug.
2. mb_convert_encoding accepts several candidate input encodings and detects the right one from the content, but it runs much slower than iconv. For example: $str = mb_convert_encoding($str, "euc-jp", "ascii, jis, euc-jp, sjis, utf-8"); the result depends on the order in which "ascii, jis, euc-jp, sjis, utf-8" are listed.
3. In general, use iconv. Use mb_convert_encoding only when the source encoding cannot be determined, or when the output of iconv does not display correctly.
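The recommendation in point 3 could be sketched as a helper that tries iconv first and falls back to mb_convert_encoding only when iconv reports a failure. The function name toEncoding() is hypothetical, and the sketch assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper: prefer iconv, fall back to mb_convert_encoding on failure.
function toEncoding(string $text, string $to, string $from): string
{
    if (function_exists('iconv')) {
        // @ hides the notice; a false result triggers the fallback below.
        $out = @iconv($from, $to, $text);
        if ($out !== false) {
            return $out;
        }
    }
    // mbstring substitutes unconvertible characters instead of failing
    // (the substitute is configurable via mb_substitute_character()).
    return mb_convert_encoding($text, $to, $from);
}

// Usage: U+9F8D is representable in GBK, so either path yields two bytes.
$gbk = toEncoding("\u{9F8D}", 'GBK', 'UTF-8');
var_dump(strlen($gbk));
```

Keeping iconv as the first choice preserves its speed for the common case, while the fallback covers the inputs on which it gives up.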
from_encoding names the character encoding before conversion. It can be an array or a comma-separated string of candidates. If it is not specified, the internal encoding is used.
$str = mb_convert_encoding($str, "ucs-2le", "jis, eucjp-win, sjis-win");
$str = mb_convert_encoding($str, "euc-jp", "auto");
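To see why the order of the candidate list matters, it can help to ask mbstring which encoding it actually picked. The sketch below is illustrative only. The sample bytes are the EUC-JP encoding of U+3042 (hiragana "a"), and the candidate list is the one used earlier in this article. The detected encoding may differ between PHP versions, so only the explicit conversion has a guaranteed result.

```php
<?php
// Assumed sample input: U+3042 encoded in EUC-JP is the byte pair A4 A2.
$eucJp = "\xA4\xA2";

// Explicit source encoding: no detection is involved.
$utf8 = mb_convert_encoding($eucJp, 'UTF-8', 'EUC-JP');
var_dump(bin2hex($utf8));

// Candidate list: mbstring has to guess, and the same bytes are also
// valid in other encodings, so the guess can depend on list order.
$candidates = ['ASCII', 'JIS', 'EUC-JP', 'SJIS', 'UTF-8'];
$detected = mb_detect_encoding($eucJp, $candidates, true); // true = strict mode
var_dump($detected);

// Converting with the list uses the detected encoding internally.
$guessed = mb_convert_encoding($eucJp, 'UTF-8', $candidates);
var_dump($guessed === $utf8);
```

When the source encoding is known, passing it explicitly avoids both the detection cost and the risk of a wrong guess.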