Techniques for using the iconv functions in PHP

Source: Internet
Author: User
Tags: ini, php, programming, translit
The iconv library converts text between character sets and is an indispensable basic library in PHP programming.
1. Download the libiconv library: http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.9.2.tar.gz
2. Unpack it: tar -zxvf libiconv-1.9.2.tar.gz
3. Install libiconv:
# ./configure --prefix=/usr/local/iconv
# make
# make install
4. Recompile PHP, adding the configure parameter --with-iconv=/usr/local/iconv

under Windows

Recently, while writing a scraper, I needed iconv to convert crawled UTF-8 pages to GB2312, and found that the converted output was sometimes missing part of the data. After puzzling over it for a while, I searched online and learned that this is a known iconv problem: iconv fails when converting the character "-" to GB2312.
The solution is simple: append "//IGNORE" to the target encoding, which is the second parameter of the iconv function.

For example:

iconv("UTF-8", "GB2312//IGNORE", $data)

IGNORE tells iconv to skip conversion errors. Without it, the string is cut off at the first character that cannot be converted, and everything after that character is lost.
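As a minimal sketch of the //IGNORE suffix described above: the helper name utf8_to_gb2312_lossy is hypothetical (it is not part of PHP), and the exact handling of unconvertible characters depends on the iconv implementation your PHP was built against, so treat this as an illustration rather than a guaranteed behavior.

```php
<?php
// Hypothetical helper (not part of PHP): convert UTF-8 text to GB2312,
// dropping any characters that GB2312 cannot represent.
function utf8_to_gb2312_lossy(string $data)
{
    // The @ hides the notice that some iconv builds raise for such characters.
    $converted = @iconv('UTF-8', 'GB2312//IGNORE', $data);
    return $converted; // string on success, false on failure
}

$result = utf8_to_gb2312_lossy('abc');
var_dump($result === 'abc'); // plain ASCII is unchanged by the conversion
```

Because iconv can still return false, callers should check the return value before storing or printing it.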

<?php
echo $str = 'Hello, here is coffee!';
echo '<br/>';
echo iconv('GB2312', 'UTF-8', $str); // convert the string encoding from GB2312 to UTF-8
echo '<br/>';
echo iconv_substr($str, 1, 1, 'UTF-8'); // extract by number of characters rather than bytes
print_r(iconv_get_encoding()); // get the current iconv encoding settings
echo iconv_strlen($str, 'UTF-8'); // get the string length in the given encoding
// Typical usage ($content is assumed to hold UTF-8 text):
$content = iconv("UTF-8", "GBK//TRANSLIT", $content);
?>
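The difference between counting bytes and counting characters is easier to see with a multibyte string. The following sketch uses an illustrative sample word ("café", not from the original article) written with explicit byte escapes so it does not depend on the encoding of the source file.

```php
<?php
// "café" in UTF-8: the "é" is stored as the two bytes 0xC3 0xA9.
$word = "caf\xC3\xA9";

echo strlen($word), PHP_EOL;                 // counts bytes: 5
echo iconv_strlen($word, 'UTF-8'), PHP_EOL;  // counts characters: 4

// Take one character starting at character offset 3 (the "é").
$last = iconv_substr($word, 3, 1, 'UTF-8');
echo bin2hex($last), PHP_EOL;                // c3a9
```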

iconv is not a built-in PHP function, nor is it a module installed by default; it must be installed before it can be used.
On Windows 2000 + PHP, edit php.ini and remove the ";" in front of extension=php_iconv.dll, then copy iconv.dll from your original PHP installation to winnt/system32 (if that is the directory your DLLs are loaded from).
On Linux, with a static build, just add --with-iconv when running configure; afterwards phpinfo() should show an iconv section. (linux7.3+apache4.06+php4.3.2)
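Besides reading the phpinfo() output, availability can also be checked from code. This is a small sketch; the function name iconv_available is hypothetical, while extension_loaded, function_exists, ICONV_IMPL and ICONV_VERSION are provided by PHP and the iconv extension.

```php
<?php
// Hypothetical helper: report whether the iconv extension is loaded in this PHP build.
function iconv_available(): bool
{
    return extension_loaded('iconv') && function_exists('iconv');
}

if (iconv_available()) {
    // ICONV_IMPL and ICONV_VERSION are constants defined by the extension.
    echo 'iconv is available: ', ICONV_IMPL, ' ', ICONV_VERSION, PHP_EOL;
} else {
    echo 'iconv is not available; check php.ini or the configure options.', PHP_EOL;
}
```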

Download: ftp://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.8.tar.gz
Installation:
# cp libiconv-1.8.tar.gz /usr/local/src
# cd /usr/local/src
# tar zxvf lib*
# cd libiconv-1.8
# ./configure --prefix=/usr/local/libiconv
# make
# make install
Compiling PHP:
# ./configure --prefix=/usr/local/php4.3.2 --with-iconv=/usr/local/libiconv/
A simple usage example:

<?php
echo iconv("gb2312", "iso-8859-1", "we");
?>

Introduction to the mb_convert_encoding and iconv functions in PHP

mb_convert_encoding converts a string from one encoding to another. I never really understood the concept of encodings in programs, but it is finally starting to make sense.
English text usually has no encoding problems; only Chinese data does. For example, if you write a program in Zend Studio or EditPlus using GBK, and the data is to be stored in a database that uses UTF8, the data must be converted first, or it will be garbled in the database.

For the usage of mb_convert_encoding, see the official manual:
http://cn.php.net/manual/zh/function.mb-convert-encoding.php

Converting GBK to UTF-8:
<?php
header("content-type:text/html; charset=utf-8");
echo mb_convert_encoding("You Are my Friend", "UTF-8", "GBK");
?>

And GB2312 to Big5:
<?php
header("content-type:text/html; charset=big5");
echo mb_convert_encoding("You Are my Friend", "Big5", "GB2312");
?>
However, to use the function above you must first enable the mbstring extension.

Another PHP function, iconv, also converts string encodings and works much like the function above.

Here are a few more details:
iconv - Convert string to requested character encoding
(PHP 4 >= 4.0.5, PHP 5)
mb_convert_encoding - Convert character encoding
(PHP 4 >= 4.0.6, PHP 5)

Usage:
string mb_convert_encoding(string str, string to_encoding [, mixed from_encoding])
This requires the mbstring extension: in php.ini, remove the ";" in front of ;extension=php_mbstring.dll.
mb_convert_encoding accepts several candidate input encodings and picks the right one automatically based on the content, but it is much less efficient than iconv.


string iconv(string in_charset, string out_charset, string str)
Note: besides naming the target encoding, the second parameter can carry one of two suffixes, //TRANSLIT and //IGNORE. //TRANSLIT automatically converts characters that cannot be converted directly into one or more approximate characters; //IGNORE skips characters that cannot be converted. The default behavior is to truncate the string at the first illegal character.
Returns the converted string or FALSE on failure.


Use:

iconv fails when converting the character "-" to gb2312; without the IGNORE suffix, everything after that character is lost. Either way, the "-" itself cannot be converted and is not output. mb_convert_encoding does not have this bug.

In general, use iconv; use mb_convert_encoding only when you cannot determine the original encoding, or when iconv's output does not display properly after conversion.
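The "iconv first, mb_convert_encoding as a fallback" advice can be sketched as follows. The helper name convert_with_fallback is hypothetical, and the sample bytes (the word "中文" encoded as GBK) are chosen here purely for illustration; this is one possible arrangement, not the only way to combine the two functions.

```php
<?php
// Hypothetical helper: try iconv first, then fall back to mb_convert_encoding.
function convert_with_fallback(string $text, string $from, string $to)
{
    if (function_exists('iconv')) {
        // The @ hides the notice iconv raises when it hits an unconvertible character.
        $out = @iconv($from, $to, $text);
        if ($out !== false) {
            return $out;
        }
    }
    if (function_exists('mb_convert_encoding')) {
        // Note the different argument order: target encoding first, then source.
        return mb_convert_encoding($text, $to, $from);
    }
    return false; // neither extension is available
}

// "中文" encoded as GBK is the four bytes D6 D0 CE C4.
$utf8 = convert_with_fallback("\xD6\xD0\xCE\xC4", 'GBK', 'UTF-8');
echo bin2hex($utf8), PHP_EOL; // e4b8ade69687, the UTF-8 bytes of "中文"
```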

from_encoding is specified by character code name before conversion. It can be an array or a comma-separated string list. If it is not specified, the internal encoding will be used.
/* Auto detect encoding from JIS, eucjp-win, sjis-win, then convert str to UCS-2LE */
$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");
/* "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
$str = mb_convert_encoding($str, "EUC-JP", "auto");

Example:
$content = iconv("GBK", "UTF-8", $content);
$content = mb_convert_encoding($content, "UTF-8", "GBK");

Parameters that are easy to overlook when using the iconv function in PHP
Today, while processing crawled content, I used iconv for encoding conversion and found that the result was being cut off. I guessed it was a character set problem and looked for a way to skip characters that do not exist in the target character set. The manual showed that iconv takes only three parameters, none of which seemed to do this. Some people online said it was possible, but it was not clear how. Eventually I found that the English description says "TRANSLIT" can be appended to the target encoding, but how exactly? It turns out you prefix it with "//", which is a surprising design.
Prototype: $txtContent = iconv("UTF-8", 'GBK', $txtContent);

Special parameters: iconv("UTF-8", "GB2312//IGNORE", $data)


There are two optional auxiliary parameters, TRANSLIT and IGNORE; IGNORE skips characters that cannot be converted.

Description

string iconv(string in_charset, string out_charset, string str)

Performs a character set conversion on the string str from in_charset to out_charset. Returns the converted string or FALSE on failure.

If you append the string //TRANSLIT to out_charset, transliteration is activated. This means that when a character can't be represented in the target charset, it can be approximated through one or several similarly looking characters. If you append the string //IGNORE, characters that cannot be represented in the target charset are silently discarded.
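To compare the two suffixes side by side, here is a small sketch. The sample string (a euro sign followed by "5") and the ISO-8859-1 target are illustrative choices, not from the original text; the approximation chosen by //TRANSLIT, and whether a given build accepts each suffix at all, depends on the underlying iconv implementation, so no fixed output is shown.

```php
<?php
// "€5" in UTF-8; ISO-8859-1 has no euro sign.
$price = "\xE2\x82\xAC5";

// //TRANSLIT: approximate unrepresentable characters (result depends on the iconv build).
$approx = @iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $price);

// //IGNORE: drop unrepresentable characters instead.
$dropped = @iconv('UTF-8', 'ISO-8859-1//IGNORE', $price);

// iconv returns false on failure, so check before using the result.
foreach (['TRANSLIT' => $approx, 'IGNORE' => $dropped] as $mode => $out) {
    echo $mode, ': ', $out === false ? 'conversion failed' : $out, PHP_EOL;
}
```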
