From: http://blog.csdn.net/qianguozheng/article/details/46429245
    #include <iconv.h>
    #include <cstring>
    #include <iostream>

    #define OUTLEN 255   // output buffer size; 255 is an assumed value

    using namespace std;

    // Code conversion class
    class CodeConverter {
    private:
        iconv_t cd;
    public:
        // Constructor (iconv_open takes the target charset first)
        CodeConverter(const char *from_charset, const char *to_charset) {
            cd = iconv_open(to_charset, from_charset);
        }
        // Destructor
        ~CodeConverter() {
            if (cd != (iconv_t)-1) iconv_close(cd);
        }
        // Convert inbuf into outbuf; returns 0 on success, -1 on failure
        int convert(char *inbuf, size_t inlen, char *outbuf, size_t outlen) {
            if (cd == (iconv_t)-1 || outlen == 0) return -1;
            memset(outbuf, 0, outlen);
            size_t outleft = outlen - 1;   // keep room for the terminating NUL
            return iconv(cd, &inbuf, &inlen, &outbuf, &outleft) == (size_t)-1 ? -1 : 0;
        }
    };

    int main(int argc, char **argv)
    {
        // "正在安装" ("Installing") written as raw bytes, so the result does
        // not depend on the encoding of this source file
        char in_utf8[] = "\xE6\xAD\xA3\xE5\x9C\xA8\xE5\xAE\x89\xE8\xA3\x85";
        char in_gb2312[] = "\xD5\xFD\xD4\xDA\xB0\xB2\xD7\xB0";
        char out[OUTLEN];

        // utf-8 --> gb2312
        CodeConverter cc("utf-8", "gb2312");
        cc.convert(in_utf8, strlen(in_utf8), out, OUTLEN);
        cout << "utf-8-->gb2312 in=" << in_utf8 << ",out=" << out << endl;

        // gb2312 --> utf-8
        CodeConverter cc2("gb2312", "utf-8");
        cc2.convert(in_gb2312, strlen(in_gb2312), out, OUTLEN);
        cout << "gb2312-->utf-8 in=" << in_gb2312 << ",out=" << out << endl;
        return 0;
    }

2. Encoding conversion with the iconv command

To convert encodings on Linux you can either program against the iconv function family or use the iconv command. The difference is that the command works on files: it converts a specified file from one encoding to another.

The iconv command converts the encoding of the given files and writes the result to standard output by default, or to a named output file.

Usage: iconv [OPTION...] [FILE...]

The following options are available:

    Input/output format specification:
      -f, --from-code=NAME    encoding of the original text
      -t, --to-code=NAME      encoding for output

    Information:
      -l, --list              list all known character sets

    Output control:
      -c                      omit invalid characters from output
      -o, --output=FILE       output file
      -s, --silent            suppress warnings
          --verbose           print progress information

      -?, --help              give this help list
          --usage             give a short usage message
      -V, --version           print program version

Example:

    iconv -f utf-8 -t gb2312 aaa.txt > bbb.txt

This command reads aaa.txt, converts it from UTF-8 to GB2312, and redirects the output to bbb.txt.

Summary: Linux provides a powerful encoding conversion tool that makes this work convenient.

glibc ships with a set of transcoding functions, iconv. They are easy to use and recognize many encodings, so if your program needs to convert between encodings, consider using them.

Using the iconv command:

    $ iconv --list                            # list the recognized encoding names
    $ iconv -f GB2312 -t UTF-8 a.html > b.html  # convert the GB2312 file a.html to UTF-8, save as b.html
    $ iconv -f GB2312 -t BIG5 a.html > b.html   # convert the GB2312 file a.html to BIG5, save as b.html

Programming with iconv involves the following glibc calls:

    #include <iconv.h>
    iconv_t iconv_open(const char *tocode, const char *fromcode);
    int iconv_close(iconv_t cd);
    size_t iconv(iconv_t cd, char **inbuf, size_t *inbytesleft, char **outbuf, size_t *outbytesleft);

To transcode, first call iconv_open to obtain a conversion descriptor, then call iconv to convert, and finally call iconv_close to release the descriptor.

The iconv function: cd is the conversion descriptor returned by iconv_open; inbuf points to the address of the buffer to be transcoded; inbytesleft points to the number of bytes in that buffer still to be converted; outbuf points to the address of the buffer that receives the result; outbytesleft points to the number of bytes still free in the output buffer. On success, iconv returns the number of characters converted in a non-reversible way (reversible conversions are not counted). On failure it returns (size_t)-1 and sets errno.

iconv scans the input step by step. For each character converted it advances *inbuf, decreases *inbytesleft, stores the result at *outbuf, advances *outbuf, and decreases *outbytesleft. The scan stops and iconv returns in the following cases:

1. The input contains an invalid multibyte sequence: errno is EILSEQ and *inbuf points to the start of the invalid sequence;
2. The input ends with an incomplete multibyte sequence: errno is EINVAL and *inbuf points to the start of the incomplete sequence;
3. The output buffer has no more room: errno is E2BIG;
4. The input has been converted completely.

iconv can also be called in two other ways:

1. inbuf or *inbuf is NULL, while outbuf and *outbuf are not NULL: iconv sets the conversion state to the initial state and stores the corresponding shift sequence at *outbuf. If the output buffer is too small, errno is set to E2BIG and (size_t)(-1) is returned;
2. inbuf or *inbuf is NULL, and outbuf or *outbuf is also NULL: iconv sets the conversion state to the initial state.

The iconv command is convenient, but it stops converting as soon as it runs into a problem, and sometimes we want to skip the byte sequences that cannot be converted. The following program does that.

    /**
     * siconv.cpp - a simple demonstration of the usage of the iconv calls
     *
     * Report bugs to [email protected]
     * July 15th, 2006
     */
    #include <iconv.h>
    #include <stdio.h>
    #include <string.h>
    #include <string>
    #include <stdarg.h>
    #include <errno.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <sys/mman.h>

    #ifdef DEBUG
    #define TRACE(fmt, args...) fprintf(stderr, "%s:%s:%d:" fmt, \
        __FILE__, __FUNCTION__, __LINE__, ##args)
    #else
    #define TRACE(fmt, args...)
    #endif

    #define CONVBUF_SIZE 32767

    void print_err(const char *fmt, ...)
    {
        va_list ap;
        va_start(ap, fmt);
        vfprintf(stderr, fmt, ap);
        va_end(ap);
    }

    int print_out(const char *buf, size_t num)
    {
        if (num != fwrite(buf, 1, num, stdout)) {
            return -1;
        }
        return 0;
    }

    void print_usage()
    {
        print_err("Usage: siconv -f encoding -t encoding [-c] "
                  "input-file\n");
    }

    // Convert len bytes at src from encoding `from` to encoding `to`,
    // appending the output to result. If c is non-zero, bytes that start an
    // illegal sequence are skipped instead of aborting the conversion.
    int iconv_string(const char *from, const char *to,
                     const char *src, size_t len,
                     ::std::string &result,
                     int c = 0, size_t buf_size = 512)
    {
        iconv_t cd;
        char *pinbuf = const_cast<char *>(src);
        size_t inbytesleft = len;
        char *poutbuf = NULL;
        size_t outbytesleft = buf_size;
        char *dst = NULL;
        size_t retbytes = 0;
        int done = 0;
        int errno_save = 0;

        if ((iconv_t)-1 == (cd = iconv_open(to, from))) {
            return -1;
        }
        dst = new char[buf_size];
        while (inbytesleft > 0 && !done) {
            poutbuf = dst;
            outbytesleft = buf_size;
            TRACE("TARGET - in:%p pin:%p left:%zu\n", src, pinbuf, inbytesleft);
            retbytes = iconv(cd, &pinbuf, &inbytesleft, &poutbuf, &outbytesleft);
            errno_save = errno;
            if (dst != poutbuf) {   // we have something to write
                TRACE("OK - in:%p pin:%p left:%zu done:%td buf:%td\n",
                      src, pinbuf, inbytesleft, pinbuf - src, poutbuf - dst);
                result.append(dst, poutbuf - dst);
            }
            if (retbytes != (size_t)-1) {
                // All input consumed: reset the state and flush any shift sequence
                poutbuf = dst;
                outbytesleft = buf_size;
                (void)iconv(cd, NULL, NULL, &poutbuf, &outbytesleft);
                if (dst != poutbuf) {   // we have something to write
                    TRACE("OK - in:%p pin:%p left:%zu done:%td buf:%td\n",
                          src, pinbuf, inbytesleft, pinbuf - src, poutbuf - dst);
                    result.append(dst, poutbuf - dst);
                }
                errno_save = 0;
                break;
            }
            TRACE("FAIL - in:%p pin:%p left:%zu done:%td buf:%td\n",
                  src, pinbuf, inbytesleft, pinbuf - src, poutbuf - dst);
            switch (errno_save) {
            case E2BIG:             // output buffer full: already flushed, loop again
                TRACE("E E2BIG\n");
                break;
            case EILSEQ:
                TRACE("E EILSEQ\n");
                if (c) {
                    errno_save = 0;
                    inbytesleft = len - (pinbuf - src);
                    // step over one illegal byte
                    inbytesleft--;
                    pinbuf++;
                    break;
                }
                done = 1;
                break;
            case EINVAL:
                TRACE("E EINVAL\n");
                done = 1;
                break;
            default:
                TRACE("E Unknown:[%d]%s\n", errno_save, strerror(errno_save));
                done = 1;
                break;
            }
        }
        delete[] dst;
        iconv_close(cd);
        errno = errno_save;
        return (errno_save) ? -1 : 0;
    }

    int conv_file_fd(const char *from, const char *to, int fd,
                     ::std::string &result, int c)
    {
        struct stat st;
        void *start;

        if (0 != fstat(fd, &st)) {
            return -1;
        }
        if (st.st_size == 0) {      // nothing to convert; mmap rejects length 0
            return 0;
        }
        start = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (MAP_FAILED == start) {
            return -1;
        }
        if (iconv_string(from, to, (char *)start, st.st_size,
                         result, c, CONVBUF_SIZE) < 0) {
            int errno_save = errno;
            munmap(start, st.st_size);
            TRACE("\n");
            errno = errno_save;
            return -1;
        }
        munmap(start, st.st_size);
        return 0;
    }

    int conv_file(const char *from, const char *to, const char *filename, int c)
    {
        ::std::string result;
        FILE *fp;

        if (NULL == (fp = fopen(filename, "rb"))) {
            print_err("open file %s:[%d]%s\n", filename, errno, strerror(errno));
            return -1;
        }
        if (conv_file_fd(from, to, fileno(fp), result, c) < 0) {
            print_err("conv file fd:[%d]%s\n", errno, strerror(errno));
            fclose(fp);
            return -1;
        }
        print_out(result.data(), result.size());
        fclose(fp);
        return 0;
    }

    int main(int argc, char *argv[])
    {
    #ifdef TESTCASE
        // Sample text; the cut lengths are illustrative. With GBK-encoded
        // multibyte text, cutting mid-character yields an invalid sequence
        // for the c = 1 path to skip.
        ::std::string strA = "Welcome (welcome ^_^) to the capital, Beijing.";
        ::std::string strB = "Shout: we are Chinese <=> we are all Chinese.";
        ::std::string strC = strA.substr(0, 20) + strB.substr(0, 41);
        ::std::string result;

        if (iconv_string("GBK", "UTF-8", strC.data(), strC.size(), result, 1) < 0) {
            TRACE("ERROR [%d]%s\n", errno, strerror(errno));
        }
        TRACE("CONVERSION RESULT:");
        result.append("\n");
        print_out(result.data(), result.size());
        return 0;
    #else
        ::std::string from, to;
        ::std::string input_file;
        int o;
        int c = 0;

        while (-1 != (o = getopt(argc, argv, "f:t:c"))) {
            switch (o) {
            case 'f':
                from = optarg;
                break;
            case 't':
                to = optarg;
                break;
            case 'c':
                c = 1;
                break;
            default:
                return -1;
            }
        }
        if (from.empty() || to.empty() || optind != (argc - 1)) {
            print_usage();
            return -1;
        }
        input_file = argv[optind++];
        return conv_file(from.c_str(), to.c_str(), input_file.c_str(), c);
    #endif
    }

Mapping the input file into memory handles the case where the file is too large for a memory buffer. As with the iconv command, adding the -c option ignores problems that arise during conversion.
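The stop conditions described above (EILSEQ, EINVAL, E2BIG) can be observed directly. The following is a minimal sketch and not part of siconv.cpp: the helper name probe_iconv is hypothetical, and the UTF-8 to UTF-16LE pair is assumed only because both encodings are widely available. It performs a single iconv call with a given output capacity and returns the errno value it saw, or 0 if the whole input was converted.

```cpp
#include <iconv.h>
#include <errno.h>
#include <stddef.h>
#include <string>
#include <vector>

// Hypothetical helper (not from the article): run one iconv() call that
// converts `input` from UTF-8 to UTF-16LE with room for `out_capacity`
// output bytes. Returns 0 if the whole input was converted, otherwise the
// errno value set by iconv() (or by iconv_open() if no descriptor is available).
int probe_iconv(const std::string &input, size_t out_capacity) {
    iconv_t cd = iconv_open("UTF-16LE", "UTF-8");   // target first, source second
    if (cd == (iconv_t)-1) {
        return errno;
    }

    // Allocate at least one byte so out.data() is always a valid pointer.
    std::vector<char> out(out_capacity > 0 ? out_capacity : 1);
    char *pin = const_cast<char *>(input.data());
    size_t inleft = input.size();
    char *pout = out.data();
    size_t outleft = out_capacity;

    errno = 0;
    size_t rc = iconv(cd, &pin, &inleft, &pout, &outleft);
    int err = (rc == (size_t)-1) ? errno : 0;       // save errno before closing

    iconv_close(cd);
    return err;
}
```

With glibc, a call such as probe_iconv("abc", 2) should report E2BIG, because three ASCII characters need six bytes in UTF-16LE; an input that ends in the middle of a multibyte character should report EINVAL, and an input containing a byte that can never appear in UTF-8 (for example 0xFF) should report EILSEQ.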
    $ g++ -o siconv siconv.cpp

If the -DDEBUG option is added to the command line, the debug statements are compiled in; if the -DTESTCASE option is added, only the test case for the iconv_string function is compiled.
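The -DDEBUG switch works through conditional compilation: the trace macro expands to a real fprintf call only when DEBUG is defined. The sketch below shows the same pattern in isolation, assuming the standard __VA_ARGS__ form instead of the GNU named-variadic form. The names LOG_TRACE and clamp_buffer_size are hypothetical and exist only for this illustration.

```cpp
#include <cstdio>

// With -DDEBUG the macro prints "file:line: message" to stderr.
// Without it the macro expands to an empty statement, so the trace
// calls disappear from the compiled program.
#ifdef DEBUG
#define LOG_TRACE(...)                                        \
    do {                                                      \
        std::fprintf(stderr, "%s:%d: ", __FILE__, __LINE__);  \
        std::fprintf(stderr, __VA_ARGS__);                    \
    } while (0)
#else
#define LOG_TRACE(...) do { } while (0)
#endif

// Hypothetical function that exists only to show a call site.
int clamp_buffer_size(int requested, int maximum) {
    LOG_TRACE("requested=%d maximum=%d\n", requested, maximum);
    return requested > maximum ? maximum : requested;
}
```

Compiling this with g++ -DDEBUG enables the trace line, while a plain g++ build leaves only the clamping logic. Passing the format string as part of __VA_ARGS__ keeps the macro valid even when a call has no extra arguments.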
UTF8 to gb2312 function