My QQ group also has many technical documents, which I hope will be of some help (please do not join for non-technical purposes).
QQ Group: 281442983 (click the link to join the group: http://jq.qq.com/?_wv=1027&k=29LoD19)
PHP's function library includes iconv(). The iconv library converts between many character sets and is an indispensable basic library in PHP programming.
Recently, while writing a scraper, I needed iconv to convert fetched UTF-8 pages to GB2312, and found that the converted data sometimes came out shorter for no apparent reason. This puzzled me for a while, until I searched online and learned that it is a known problem with the iconv function: iconv fails when converting the character "-" to GB2312.
Let's look at how the function is used.
The simplest use, converting GB2312 to UTF-8:
$text = iconv("GB2312", "UTF-8", $text);
When using $text = iconv("UTF-8", "GB2312", $text), the conversion breaks off if certain special characters are encountered, such as "-" or the "." in English names. The text after such a character is not converted.
The following code works around this problem:
$text = iconv("UTF-8", "GBK", $text);
You read that correctly, it is that simple: specify GBK instead of GB2312 and the conversion works. GBK is a superset of GB2312, so it can represent characters that GB2312 cannot.
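The GBK workaround can be checked with a round trip. This is a minimal sketch, not the original scraper code: the helper name utf8_to_gbk and the sample string are assumptions made for illustration, and the sample relies on "镕" being a character that is commonly cited as missing from GB2312 but present in GBK.

```php
<?php
// Hypothetical helper: convert UTF-8 text to GBK, or return null on failure.
function utf8_to_gbk(string $text): ?string
{
    // "@" hides the notice iconv raises when it meets a character it cannot convert.
    $converted = @iconv("UTF-8", "GBK", $text);
    return $converted === false ? null : $converted;
}

// Sample input (an assumption): "镕" is commonly cited as absent from GB2312.
$original = "朱镕基";
$gbk = utf8_to_gbk($original);

if ($gbk === null) {
    echo "Conversion failed\n";
} else {
    // Converting back shows whether anything was lost along the way.
    $roundTrip = iconv("GBK", "UTF-8", $gbk);
    echo $roundTrip === $original ? "Round trip OK\n" : "Characters were lost\n";
}
```

Comparing the round-tripped text with the input is a cheap way to notice silent data loss, which is the symptom described above.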
There is another way: append //IGNORE to the second argument so that conversion errors are ignored, as follows:
iconv("UTF-8", "GB2312//IGNORE", $data);
I have not compared the two methods in detail, but the first (GBK instead of GB2312) feels better, since //IGNORE silently discards the characters it cannot convert.
The description of iconv() in the PHP manual:
(PHP 4 >= 4.0.5, PHP 5)
iconv – Convert string to requested character encoding
string iconv ( string in_charset, string out_charset, string str )
Performs a character set conversion on the string str from in_charset to out_charset. Returns the converted string or FALSE on failure.
If you append the string //TRANSLIT to out_charset transliteration is activated. This means that when a character can't be represented in the target charset, it can be approximated through one or several similarly looking characters. If you append the string //IGNORE, characters that cannot be represented in the target charset are silently discarded. Otherwise, str is cut from the first illegal character.
When using this function to convert string encodings, note that converting UTF-8 to GB2312 can truncate the string. This can be resolved as follows:
$str = iconv('utf-8', "gb2312//TRANSLIT", file_get_contents($filepath));
That is, append //TRANSLIT to the second parameter. It means that when a character in the source has no match in the target encoding, a similar-looking character is substituted. //IGNORE can also be used here, which means that characters that cannot be converted are skipped.
//IGNORE ignores errors during conversion; without it, everything after the offending character is lost.
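Because iconv() returns FALSE on failure, it is worth checking the result of each attempt. The following is a rough sketch under stated assumptions: the helper name convert_with_fallback and its return shape are made up for illustration, and the order strict, then //TRANSLIT, then //IGNORE is just one reasonable choice. Some iconv implementations are reported to signal an error even when //IGNORE is given, so every result is tested.

```php
<?php
// Hypothetical helper: try a strict conversion first, then the two suffixes.
// Returns [converted string or null, name of the strategy that worked].
function convert_with_fallback(string $text, string $from, string $to): array
{
    $strategies = ["" => "strict", "//TRANSLIT" => "translit", "//IGNORE" => "ignore"];
    foreach ($strategies as $suffix => $name) {
        // "@" hides the notice raised for unconvertible characters.
        $result = @iconv($from, $to . $suffix, $text);
        if ($result !== false) {
            return [$result, $name];
        }
    }
    return [null, "failed"];
}

// Both characters exist in GB2312, so the strict attempt is enough here.
[$gb, $strategy] = convert_with_fallback("你好", "UTF-8", "GB2312");
echo $strategy, ": ", bin2hex((string) $gb), "\n";
```

Returning the strategy name lets the caller log when a lossy conversion was needed instead of discovering missing text later.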
iconv is not a built-in PHP function, nor is it a module installed by default here; it must be enabled before it can be used.
On Windows 2000 with PHP, edit php.ini and remove the ";" in front of extension=php_iconv.dll, then copy iconv.dll from your PHP installation directory to winnt/system32 (if that is the directory your DLLs are loaded from). On Linux with a static build, add --with-iconv when running configure; the iconv entry should then appear in phpinfo(). (linux7.3+apache4.06+php4.3.2).
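Whether the extension is actually loaded can also be checked from a script rather than by reading phpinfo(). A small sketch, where the helper name missing_extensions is hypothetical and only the standard extension_loaded() call is relied on:

```php
<?php
// Hypothetical helper: return the names from $names that are not loaded.
function missing_extensions(array $names): array
{
    return array_values(array_filter($names, function (string $name): bool {
        return !extension_loaded($name);
    }));
}

$missing = missing_extensions(["iconv"]);
if ($missing === []) {
    echo "iconv is available\n";
} else {
    echo "Missing extensions: ", implode(", ", $missing), "\n";
}
```

Running this check at startup gives a clear message instead of a fatal "undefined function" error in the middle of a conversion.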
Introduction to the mb_convert_encoding and iconv functions
mb_convert_encoding converts text from one encoding to another. I used to not understand the concept of encoding in programs, but it is clearer to me now. English text generally has no encoding problems; only Chinese data does. For example, if you write a program in Zend Studio or EditPlus using GBK encoding and the data must go into a database whose encoding is UTF8, the data has to be converted first, or it will be garbled in the database.
A GBK to UTF-8 conversion:
// Assumes this source file itself is saved in GBK.
header("Content-Type: text/html; charset=utf-8");
echo mb_convert_encoding("妳係我的友仔", "UTF-8", "GBK");
And a GB2312 to Big5 conversion:
// Assumes this source file itself is saved in GB2312.
header("Content-Type: text/html; charset=big5");
echo mb_convert_encoding("你是我的朋友", "big5", "GB2312");
However, the functions above require the mbstring extension to be enabled first.
string mb_convert_encoding ( string str, string to_encoding [, mixed from_encoding] )
The mbstring extension must be enabled first: in php.ini, remove the ";" in front of extension=php_mbstring.dll. mb_convert_encoding can be given several candidate input encodings and will identify the right one from the content, but it is much less efficient than iconv.
string iconv ( string in_charset, string out_charset, string str )
Note: besides the target encoding, the second parameter can also take two suffixes, //TRANSLIT and //IGNORE. //TRANSLIT automatically converts characters that cannot be converted directly into one or more approximate characters; //IGNORE skips characters that cannot be converted; the default is to truncate at the first illegal character.
In general, use iconv. Fall back to mb_convert_encoding only when you cannot determine the source encoding, or when the result of the iconv conversion does not display properly.
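That rule of thumb could be wrapped in a single helper. The sketch below is only an illustration: the function name to_utf8, its parameters, and the default candidate list are assumptions, not part of either extension. Detection among several candidates is a guess and can pick the wrong encoding when the bytes are valid in more than one of them.

```php
<?php
// Hypothetical helper: convert text to UTF-8.
// $knownEncoding is the source encoding if the caller knows it, otherwise null.
// $candidates is an assumed list of encodings to try when it is unknown.
function to_utf8(string $text, ?string $knownEncoding = null, array $candidates = ["GBK", "BIG-5"]): ?string
{
    if ($knownEncoding !== null) {
        // Known source encoding: iconv is the first choice.
        $converted = @iconv($knownEncoding, "UTF-8", $text);
        if ($converted !== false) {
            return $converted;
        }
        // iconv failed, so retry with mbstring using the same encoding.
        $candidates = [$knownEncoding];
    } elseif (mb_check_encoding($text, "UTF-8")) {
        return $text; // already valid UTF-8, nothing to do
    }

    try {
        // With several candidates, mbstring chooses one based on the content.
        $converted = @mb_convert_encoding($text, "UTF-8", $candidates);
    } catch (\ValueError $e) {
        return null; // PHP 8 throws for encoding names mbstring does not know
    }
    return $converted === false ? null : $converted;
}

$gbkBytes = "\xC4\xE3\xBA\xC3";          // "你好" encoded as GBK
echo to_utf8($gbkBytes, "GBK"), "\n";    // source encoding known: iconv path
echo to_utf8($gbkBytes, null, ["GBK"]), "\n"; // source unknown: mbstring path
```

Checking for valid UTF-8 first avoids re-encoding text that is already in the target encoding.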
$content = iconv("GBK", "UTF-8", $content);
$content = mb_convert_encoding($content, "UTF-8", "GBK");
PHP iconv function Conversion error problem