PHP iconv(): Detected an illegal character in input string

Source: Internet
Author: User
A string is passed from JavaScript to PHP: it is encoded with escape() in JS, appended to the URL, and received by PHP, where an unescape function found on the Internet decodes it. The decoded string is UTF-8, but what I need is GB2312, so iconv is used for the conversion.
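The article does not show the unescape function it found, so the following is only a minimal sketch of what such a helper might look like. The name js_unescape is hypothetical, and the sketch assumes the iconv extension is available and that the input uses the %uXXXX and %XX forms produced by JavaScript's escape().

```php
<?php
// Hypothetical helper (not the function used in the article): decode a
// JavaScript escape()-style string into UTF-8.
function js_unescape(string $s): string
{
    $decoded = preg_replace_callback(
        '/%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})/',
        function (array $m): string {
            if ($m[1] !== '') {
                // %uXXXX is one UTF-16 code unit; convert it to UTF-8.
                // Limitation: surrogate pairs (characters outside the BMP)
                // are not reassembled by this sketch.
                $out = @iconv('UTF-16BE', 'UTF-8', pack('n', hexdec($m[1])));
            } else {
                // %XX is a single Latin-1 code point.
                $out = @iconv('ISO-8859-1', 'UTF-8', chr(hexdec($m[2])));
            }
            // Leave the sequence untouched if it cannot be converted.
            return $out === false ? $m[0] : $out;
        },
        $s
    );
    // preg_replace_callback returns null only on a PCRE error.
    return $decoded ?? $s;
}

echo js_unescape('%u4E2D%u6587'), "\n";
```

Because the result is UTF-8, it can be passed straight to iconv as the input string.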

At first I used:
$str = iconv('utf-8', 'gb2312', unescape(isset($_GET['str']) ? $_GET['str'] : ''));
After it went live, a lot of errors like this were reported: iconv(): Detected an illegal character in input string

Considering that the GB2312 character set is relatively small, I changed the target to the larger GBK:
$str = iconv('utf-8', 'gbk', unescape(isset($_GET['str']) ? $_GET['str'] : ''));
The same error was reported after it went live!

Reading the manual carefully, I found this paragraph:
If you append the string //TRANSLIT to out_charset transliteration is activated. This means that when a character can't be represented in the target charset, it can be approximated through one or several similarly looking characters. If you append the string //IGNORE, characters that cannot be represented in the target charset are silently discarded. Otherwise, str is cut from the first illegal character.
So I changed it to:
$str = iconv('utf-8', 'gbk//IGNORE', unescape(isset($_GET['str']) ? $_GET['str'] : ''));
In local tests, //IGNORE skips the characters it does not recognize and converts the rest without raising an error. //TRANSLIT cuts off the unrecognized character and everything after it, and reports an error. //IGNORE is what I need.
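The //IGNORE approach above can be wrapped in a small helper that also checks the return value. This is a sketch only: the name utf8_to_gbk is hypothetical, the unescape step is left out so the snippet is self-contained, and the behaviour of //IGNORE is known to differ between iconv implementations (some builds still return false and raise a notice), so the failure branch is kept.

```php
<?php
// Hypothetical helper: convert UTF-8 to GBK, asking iconv to drop
// characters that GBK cannot represent.
function utf8_to_gbk(string $utf8): string
{
    // '@' hides the notice; the return value is checked instead.
    $gbk = @iconv('UTF-8', 'GBK//IGNORE', $utf8);
    if ($gbk === false) {
        // Assumption: an empty string is an acceptable result on failure.
        // Some iconv builds report failure even with //IGNORE.
        return '';
    }
    return $gbk;
}

// Same input handling as in the article, minus the unescape step.
$str = utf8_to_gbk(isset($_GET['str']) ? $_GET['str'] : '');
```

Checking for false matters because iconv reports failure through its return value; without the check, a failed conversion would silently be used as an empty value.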
Now I will wait until it goes live to see the results (this is not good practice; I should keep studying the manual and searching the Internet). It is a Hong Kong virtual host, haha...

I found the following article on the Internet. According to it, mb_convert_encoding also works, but it is less efficient than iconv.


Differences between iconv and mb_convert_encoding

iconv - Convert string to requested character encoding (PHP 4 >= 4.0.5, PHP 5)
mb_convert_encoding - Convert character encoding (PHP 4 >= 4.0.6, PHP 5)

Usage:
string mb_convert_encoding(string str, string to_encoding [, mixed from_encoding])
The mbstring extension must be enabled first: in php.ini, remove the semicolon in front of extension=php_mbstring.dll.

string iconv(string in_charset, string out_charset, string str)
Note:
Besides the target encoding, the second parameter can also carry one of two suffixes, //TRANSLIT or //IGNORE, where:
//TRANSLIT automatically replaces a character that cannot be converted directly with one or more similar-looking characters;
//IGNORE skips characters that cannot be converted. Without either suffix, the string is cut at the first illegal character.
Returns the converted string, or FALSE on failure.

Usage:
1. iconv was found to fail when converting the dash character "—" (U+2014) to gb2312 (on a rented Hong Kong server). Without the //IGNORE suffix, everything after this character is lost. Either way, the "—" itself cannot be converted or output. mb_convert_encoding does not have this bug.
2. mb_convert_encoding can take several candidate input encodings and picks one automatically based on the content, but it is much slower than iconv. For example: $str = mb_convert_encoding($str, "euc-jp", "ASCII,JIS,EUC-JP,SJIS,UTF-8"); the order of the candidates in "ASCII,JIS,EUC-JP,SJIS,UTF-8" affects the result.
3. In general, use iconv. Use mb_convert_encoding only when the source encoding cannot be determined, or when the output of iconv does not display correctly.
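Point 3 can be sketched as a helper that tries iconv first and falls back to mb_convert_encoding only when iconv fails. The function name convert_with_fallback is hypothetical, and treating a false return from iconv as the signal to fall back is an assumption of this sketch, not something the article specifies.

```php
<?php
// Hypothetical helper: prefer iconv, fall back to mbstring when iconv fails.
function convert_with_fallback(string $s, string $to, string $from): string
{
    // '@' hides the "illegal character" notice; the result is checked below.
    $out = @iconv($from, $to, $s);
    if ($out !== false) {
        return $out;
    }

    // mbstring is optional, so check that it is loaded before using it.
    if (function_exists('mb_convert_encoding')) {
        $out = mb_convert_encoding($s, $to, $from);
        if (is_string($out)) {
            return $out;
        }
    }

    throw new RuntimeException("Cannot convert from $from to $to");
}

$gbk = convert_with_fallback('中文', 'GBK', 'UTF-8');
echo bin2hex($gbk), "\n";
```

Throwing when both paths fail keeps the caller from storing a truncated or empty string by accident.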

from_encoding gives the character encoding name(s) of the string before conversion. It can be an array or a comma-separated string. If it is not specified, the internal encoding is used.

$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");
$str = mb_convert_encoding($str, "EUC-JP", "auto");
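As a small illustration of the two forms of from_encoding, the sketch below passes the same candidate list once as a comma-separated string and once as an array. It assumes the mbstring extension is loaded and that this file is saved as UTF-8. The candidate list "ASCII,UTF-8" is chosen only because the sample text is unambiguous between those two.

```php
<?php
// Sample input; assumes this source file is saved as UTF-8.
$utf8 = "日本語";

// The same candidates, given as a comma-separated string and as an array.
$fromString = "ASCII,UTF-8";
$fromArray  = ["ASCII", "UTF-8"];

$a = mb_convert_encoding($utf8, "EUC-JP", $fromString);
$b = mb_convert_encoding($utf8, "EUC-JP", $fromArray);

// Both calls detect the input as UTF-8, so the results match.
var_dump($a === $b);
```

With candidates that overlap more (for example several Japanese or Chinese encodings), detection can become ambiguous, which is why the order of the list matters.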

Example:
$content = iconv("GBK", "UTF-8", $content);
$content = mb_convert_encoding($content, "UTF-8", "GBK");
