Transcoding with the libiconv Library on Linux

Source: Internet
Author: User

The iconv command converts between character set encodings on Linux.

Files copied from Windows to Linux often appear garbled, because files on Windows are commonly encoded in GBK while the default file encoding on Linux is UTF-8. Transcoding them requires the libiconv library.

1. The iconv command

Usage: iconv [options...] [file...]

Input/output format specification:
  -f, --from-code=NAME    encoding of the original text
  -t, --to-code=NAME      encoding of the output

Information:
  -l, --list              list all known character sets

Output control:
  -c                      omit invalid characters from the output
  -o, --output=FILE       output file
  -s, --silent            suppress warnings
  --verbose               print progress information

For example, to convert the GBK-encoded file hosts_gbk into the UTF-8-encoded file hosts_utf8:

  iconv -f gbk -t UTF-8 hosts_gbk > hosts_utf8

2. The iconv functions

(1) iconv_t iconv_open (const char *tocode, const char *fromcode);
This function specifies which two encodings to convert between: tocode is the target encoding and fromcode is the source encoding. It returns a conversion descriptor that is passed to the following two functions.

Detailed description: http://linux.die.net/man/3/iconv_open
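As a minimal sketch of the call described above, the helper below opens a conversion descriptor and checks for failure. The name open_converter and its return convention are hypothetical, introduced only for illustration; the comparison against (iconv_t)-1 and the EINVAL value of errno follow the iconv_open man page.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdio.h>

/* Hypothetical helper (not part of libiconv): opens a descriptor that
 * converts fromcode -> tocode. Returns 0 and stores the descriptor in
 * *cd_out on success, -1 on failure. */
int open_converter(const char *tocode, const char *fromcode, iconv_t *cd_out)
{
    assert(cd_out != NULL);

    iconv_t cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t)-1) {
        if (errno == EINVAL)   /* this pair of encodings is not supported */
            fprintf(stderr, "conversion %s -> %s is not supported\n",
                    fromcode, tocode);
        else
            perror("iconv_open");
        return -1;
    }
    *cd_out = cd;
    return 0;
}
```

It might be called as open_converter("UTF-8", "GBK", &cd), assuming the installed iconv implementation supports GBK. Note the argument order: the target encoding comes first.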

(2) size_t iconv (iconv_t cd, char **inbuf, size_t *inbytesleft, char **outbuf, size_t *outbytesleft);
This function reads bytes from the input buffer at *inbuf, converts them, and writes the result to the output buffer at *outbuf. *inbytesleft holds the number of input bytes that have not yet been converted, and *outbytesleft holds the number of bytes still free in the output buffer.

Detailed description: http://pubs.opengroup.org/onlinepubs/009695399/functions/iconv.html

In most cases, inbuf is not NULL and *inbuf is not NULL. In this case, the iconv function converts the multibyte sequence starting at *inbuf into a multibyte sequence starting at *outbuf. At most *inbytesleft bytes are read, starting at *inbuf, and at most *outbytesleft bytes are written, starting at *outbuf.

The iconv function converts one multibyte character at a time. For each character converted, it increments *inbuf and decrements *inbytesleft by the number of input bytes converted, increments *outbuf and decrements *outbytesleft by the number of output bytes written, and updates the conversion state held in cd.
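The bookkeeping described above can be observed directly. The following sketch wraps a single iconv call and reports how far the two pointers moved; convert_once and its parameters are hypothetical names chosen for this illustration, not part of the library.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: performs one iconv() call and reports how many
 * input bytes were consumed and how many output bytes were produced.
 * Returns 0 on success, -1 if iconv() reported an error. */
int convert_once(iconv_t cd, char *in, size_t inlen,
                 char *out, size_t outcap,
                 size_t *consumed, size_t *produced)
{
    assert(consumed != NULL && produced != NULL);

    char *inbuf = in;         /* iconv() advances this pointer ...      */
    size_t inleft = inlen;    /* ... and decrements this count          */
    char *outbuf = out;       /* likewise for the output side           */
    size_t outleft = outcap;

    size_t rc = iconv(cd, &inbuf, &inleft, &outbuf, &outleft);

    *consumed = (size_t)(inbuf - in);    /* same as inlen - inleft   */
    *produced = (size_t)(outbuf - out);  /* same as outcap - outleft */
    return rc == (size_t)-1 ? -1 : 0;
}
```

Because iconv moves the caller's pointers, code that needs the start of a buffer afterwards (to print or free it) has to keep its own copy, as this helper does with in and out.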

The conversion can stop for one of the following four reasons:

1. The input contains an invalid multibyte sequence. In this case, iconv sets errno to EILSEQ and returns (size_t)(-1). *inbuf is left pointing to the beginning of the invalid sequence.

2. The input byte sequence has been entirely converted, that is, *inbytesleft has dropped to 0. In this case, iconv returns the number of non-reversible conversions performed during this call (reversible conversions are not counted).

3. The input ends with an incomplete multibyte sequence. In this case, iconv sets errno to EINVAL and returns (size_t)(-1). *inbuf is left pointing to the beginning of the incomplete multibyte sequence.

4. The output buffer does not have enough space to store the next character. In this case, iconv sets errno to E2BIG and returns (size_t)(-1).
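One way to handle these four outcomes is a loop that grows the output buffer on E2BIG and gives up on EILSEQ or EINVAL. The sketch below is written under several assumptions: convert_all is a hypothetical name, the whole input is available at once, the target encoding is byte-oriented (so a single '\0' terminates it), and the final shift sequence of stateful encodings is not flushed.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts in[0..inlen) and returns a malloc'd,
 * NUL-terminated result that the caller frees. Returns NULL on failure
 * with errno describing the cause. */
char *convert_all(iconv_t cd, const char *in, size_t inlen, size_t *outlen)
{
    assert(in != NULL && outlen != NULL);

    size_t cap = 16, used = 0;     /* deliberately small to exercise E2BIG */
    char *out = malloc(cap + 1);
    char *inbuf = (char *)in;      /* iconv() does not modify the input bytes */
    size_t inleft = inlen;

    if (out == NULL)
        return NULL;
    while (inleft > 0) {
        char *outbuf = out + used;
        size_t outleft = cap - used;
        size_t rc = iconv(cd, &inbuf, &inleft, &outbuf, &outleft);
        used = (size_t)(outbuf - out);
        if (rc != (size_t)-1)
            break;                 /* case 2: all input was converted */
        if (errno == E2BIG) {      /* case 4: enlarge the buffer, try again */
            char *bigger = realloc(out, cap * 2 + 1);
            if (bigger == NULL) {
                free(out);
                return NULL;
            }
            out = bigger;
            cap *= 2;
            continue;
        }
        /* case 1 (EILSEQ) or case 3 (EINVAL): inbuf points at the bad bytes */
        int saved = errno;
        free(out);
        errno = saved;
        return NULL;
    }
    out[used] = '\0';
    *outlen = used;
    return out;
}
```

A program that reads its input in chunks would treat EINVAL differently: instead of failing, it would keep the unconverted tail and retry once more bytes arrive.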

Another scenario is that inbuf is NULL or *inbuf is NULL, but outbuf is not NULL and *outbuf is not NULL. In this case, the iconv function attempts to set the conversion state of cd to the initial state and stores the corresponding shift sequence at *outbuf. At most *outbytesleft bytes are written, starting at *outbuf. If the output buffer does not have enough space for this reset sequence, it sets errno to E2BIG and returns (size_t)(-1). Otherwise, it increments *outbuf and decrements *outbytesleft by the number of bytes written.

The third scenario is that inbuf is NULL or *inbuf is NULL, and outbuf is NULL or *outbuf is NULL. In this case, the iconv function attempts to set the conversion state of cd to its initial state.
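The two reset scenarios above can be sketched as a pair of small wrappers. Both function names are hypothetical. For stateless encodings such as UTF-8 the flush is expected to write nothing; it matters for stateful encodings, where a closing shift sequence may be required.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: ends a conversion by asking iconv() to write any
 * pending shift sequence to out. Returns the number of bytes written, or
 * -1 on error (for example E2BIG when outcap is too small). */
long flush_shift_sequence(iconv_t cd, char *out, size_t outcap)
{
    assert(out != NULL);

    char *outbuf = out;
    size_t outleft = outcap;

    /* inbuf == NULL, outbuf != NULL: reset the state and emit the sequence */
    if (iconv(cd, NULL, NULL, &outbuf, &outleft) == (size_t)-1)
        return -1;
    return (long)(outbuf - out);
}

/* Hypothetical helper: returns cd to its initial state, producing no output. */
int reset_converter(iconv_t cd)
{
    /* inbuf == NULL and outbuf == NULL: reset only */
    return iconv(cd, NULL, NULL, NULL, NULL) == (size_t)-1 ? -1 : 0;
}
```

A typical use would call flush_shift_sequence once after the last chunk of input, and reset_converter before reusing the same descriptor for an unrelated piece of text.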

Return value:

The iconv function returns the number of characters converted in a non-reversible way during this call; reversible conversions are not counted. When an error occurs, it sets errno and returns (size_t)(-1).

Errors:

E2BIG: There is not enough room at *outbuf.

EILSEQ: The input contains an invalid multibyte sequence.

EINVAL: The input contains an incomplete multibyte sequence.

(3) int iconv_close (iconv_t cd);
This function closes the conversion descriptor and frees its resources.

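C code example
// Convert UTF-8 to GB2312; code for reference only.
#include <iconv.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char src_utf8[] = "UTF8 Encoding";
    char *inbuf = src_utf8;
    size_t inlen = strlen(inbuf);
    size_t outlen = 255;
    char *out = malloc(outlen + 1);  // keep the start: iconv advances outbuf
    char *outbuf = out;
    iconv_t cd = iconv_open("GB2312", "UTF-8");

    if (out == NULL || cd == (iconv_t)-1)
        return 1;
    if (iconv(cd, &inbuf, &inlen, &outbuf, &outlen) == (size_t)-1)
        perror("iconv");
    *outbuf = '\0';                  // iconv does not NUL-terminate the output
    printf("%s\n", out);
    iconv_close(cd);
    free(out);
    return 0;
}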

