PHP iconv() encoding conversion error: "Detected an illegal character" (PHP tutorial)

Source: Internet
Author: User
Function prototype: string iconv(string $in_charset, string $out_charset, string $str)
Note in particular the description of the second parameter:
The output charset.

When iconv() is asked to convert a character that the output character encoding does not support, for example iconv('UTF-8', 'GB2312', $str) where $str contains such a character, the following error occurs:

Notice: iconv() [function.iconv]: Detected an illegal character in input string...

Because GB2312 covers only Simplified Chinese, it does not support more complex Chinese characters or certain special characters, so converting them reports an error. There are two solutions:

1. Widen the range of the output character encoding, for example iconv('UTF-8', 'GBK', $str). The output is then correct, because GBK supports a wider range of characters.

2. Append "//IGNORE" to the output character encoding string, for example iconv('UTF-8', 'GB2312//IGNORE', $str). This skips the characters that cannot be converted and avoids the error, but those characters are not output correctly (they are simply missing from the result).
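The two solutions above can be combined into one helper. The following is a minimal sketch, not code from the original article: the function name to_gb_family() and the sample character are illustrative assumptions, and the behavior of //IGNORE in particular can differ between PHP versions and iconv implementations, so the return value is checked at every step.

```php
<?php
// Hypothetical helper (not from the article): convert UTF-8 text to a
// GB-family charset, widening the target before dropping any characters.
function to_gb_family(string $utf8): string
{
    // Try GB2312 first, then the wider GBK. The @ operator suppresses the
    // "Detected an illegal character" notice so the return value can be
    // checked instead.
    foreach (['GB2312', 'GBK'] as $charset) {
        $out = @iconv('UTF-8', $charset, $utf8);
        if ($out !== false) {
            return $out;
        }
    }

    // Last resort: ask iconv to drop unconvertible characters. Some
    // PHP/iconv combinations still return false here, so check again.
    $out = @iconv('UTF-8', 'GBK//IGNORE', $utf8);
    return $out === false ? '' : $out;
}

// Sample input: the traditional character below is assumed to be outside
// GB2312 but inside GBK, so the GBK attempt is expected to succeed.
$gbk = to_gb_family('體');
echo bin2hex($gbk), PHP_EOL;
```

Trying the wider charset before falling back to //IGNORE keeps as much of the input as possible; characters are dropped only when neither charset can represent them.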


Next, let's look at how to handle the "iconv(): Detected an illegal character in input string" error in practice.

$str = iconv('UTF-8', 'GBK//IGNORE', unescape(isset($_GET['str']) ? $_GET['str'] : '')); // unescape() is a user-defined helper, not a PHP built-in
In local testing, //IGNORE skips the characters it does not recognize and converts the rest without raising an error, whereas //TRANSLIT cuts off the unrecognized character together with everything after it and reports an error. //IGNORE is what I need.
Now I will wait until it goes live to see the result (this is not good practice; I will keep studying the manual and searching online), haha...

I found the following article online, which shows that mb_convert_encoding also works, although it is less efficient than iconv.


Differences between iconv and mb_convert_encoding

iconv - Convert string to requested character encoding (PHP 4 >= 4.0.5, PHP 5)
mb_convert_encoding - Convert character encoding (PHP 4 >= 4.0.6, PHP 5)

Usage:
string mb_convert_encoding(string str, string to_encoding [, mixed from_encoding])
The mbstring extension must be enabled first: in php.ini, remove the ";" in front of extension=php_mbstring.dll.

string iconv(string in_charset, string out_charset, string str)
Note:
In addition to naming the target encoding, the second parameter accepts one of two suffixes, //TRANSLIT and //IGNORE,
where:
//TRANSLIT automatically converts a character that cannot be converted directly into one or more similar-looking characters;
//IGNORE skips the characters that cannot be converted. Without either suffix, the string is truncated at the first illegal character.
Returns the converted string, or false on failure.
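To see what each suffix does on a particular system, the same conversion can be run three ways. This is a minimal sketch under stated assumptions: compare_suffixes() is a hypothetical helper name, the sample word is illustrative, and the actual results depend on the iconv implementation and the locale, so no expected output is shown.

```php
<?php
// Hypothetical helper: run one conversion with no suffix, with //TRANSLIT
// and with //IGNORE, and collect whatever iconv() returns for each.
function compare_suffixes(string $utf8, string $charset): array
{
    $results = [];
    foreach (['', '//TRANSLIT', '//IGNORE'] as $suffix) {
        $label = $suffix === '' ? '(none)' : $suffix;
        // @ hides the notice; a failed conversion shows up as false.
        $results[$label] = @iconv('UTF-8', $charset . $suffix, $utf8);
    }
    return $results;
}

// The accented letter cannot be represented in ASCII, so each suffix
// has to deal with it in its own way.
foreach (compare_suffixes('café', 'ASCII') as $label => $out) {
    echo str_pad($label, 12), var_export($out, true), PHP_EOL;
}
```

Running this on both the development machine and the production server is a quick way to check whether the two environments treat the suffixes the same way before relying on either one.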

Usage notes:
1. iconv was found to fail when converting the dash character "-" to GB2312. Without the //IGNORE suffix, everything after this character is lost. Either way, the "-" itself cannot be converted or output. mb_convert_encoding does not have this bug.
2. mb_convert_encoding accepts several candidate input encodings and identifies the right one automatically from the content, but it runs much slower than iconv. For example: $str = mb_convert_encoding($str, "EUC-JP", "ASCII,JIS,EUC-JP,SJIS,UTF-8"); the order of the entries in "ASCII,JIS,EUC-JP,SJIS,UTF-8" affects the result.
3. In general, use iconv. Use mb_convert_encoding only when the source encoding cannot be determined, or when the output of iconv does not display correctly.
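The rule of thumb in point 3 can be sketched as a small wrapper. This is an illustration only: convert_with_fallback() is a hypothetical name, and it assumes both the iconv and mbstring extensions are available and that both recognize the charset names passed in.

```php
<?php
// Hypothetical helper: prefer iconv(), and fall back to
// mb_convert_encoding() only when iconv() reports failure.
function convert_with_fallback(string $str, string $to, string $from): string
{
    // @ suppresses the notice so failure is detected via the return value.
    $out = @iconv($from, $to, $str);
    if ($out !== false) {
        return $out;
    }

    // Note the argument order: mb_convert_encoding() takes the target
    // encoding first and the source encoding last.
    $out = mb_convert_encoding($str, $to, $from);
    return $out === false ? '' : $out;
}

$gbk = convert_with_fallback('中文', 'GBK', 'UTF-8');
echo bin2hex($gbk), PHP_EOL;
```

Keeping iconv as the first attempt preserves its speed advantage for the common case, while the slower mbstring path runs only for strings that iconv rejects.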

from_encoding names the character encoding(s) of the string before conversion. It can be an array or a comma-separated list in a string. If it is not specified, the internal encoding is used.

$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");
$str = mb_convert_encoding($str, "EUC-JP", "auto");
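The array form and the comma-separated form of from_encoding can be compared directly. The following is a minimal sketch, assuming the mbstring extension is loaded and that this source file is saved as UTF-8; the sample text and variable names are illustrative.

```php
<?php
// Build an EUC-JP byte string from a UTF-8 literal to use as test input.
$eucjp = mb_convert_encoding('日本語', 'EUC-JP', 'UTF-8');

// Candidate source encodings passed as an array ...
$fromArray = mb_convert_encoding($eucjp, 'UTF-8', ['ASCII', 'UTF-8', 'EUC-JP']);

// ... and the same candidates as a comma-separated string.
$fromList = mb_convert_encoding($eucjp, 'UTF-8', 'ASCII,UTF-8,EUC-JP');

echo $fromArray, PHP_EOL, $fromList, PHP_EOL;
```

In this sample the bytes are valid only as EUC-JP, so detection is unambiguous. With input that is valid in several of the candidates, the order of the list can change which encoding is chosen, so the most likely encodings are best listed deliberately.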
