Using the iconv command to convert character encodings on Linux

Brother Lang called me last night: the result file showed garbled characters when he opened it, and he asked whether I had forgotten to handle the character encoding. First thing this morning I went to the lab and asked the classmate sitting next to me, and only then learned that the default character encoding set in the Linux shell configuration file is UTF-8. UTF-8 is one encoding form of Unicode, whereas GB2312 and Unicode are both character sets, so strictly speaking GB2312 and UTF-8 are not concepts at the same level. On Linux, the iconv command converts a given file from one encoding to another.
The iconv command usage is as follows:
iconv [option...] [file...]
The following options are available:
Input/output format specifications:
-f, --from-code=NAME    encoding of the original text
-t, --to-code=NAME      encoding of the output
Information:
-l, --list              list all known character sets
Output control:
-c                      omit invalid characters from the output
-o, --output=FILE       write the output to FILE
-s, --silent            suppress warnings
--verbose               print progress information
So I added one line at the end of the program:
iconv -f UTF-8 -t gb2312 /server_test/reports/software_.txt > /server_test/reports/software_asserts.txt
That solved the problem.
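For anyone who would rather do this conversion inside a program, here is a minimal C sketch of roughly what the command line above does, using the iconv library functions described later in this post. The helper name convert_stream and the 4096-byte buffers are my own choices for illustration and are not part of iconv. It is a sketch under those assumptions, not a replacement for the command.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: convert everything read from `in` and write it to
 * `out`. Returns 0 on success, -1 on any failure. */
int convert_stream(const char *from, const char *to, FILE *in, FILE *out)
{
    char inbuf[4096], outbuf[4096];
    size_t pending = 0;          /* bytes of a split character carried over */
    int rc = -1;

    iconv_t cd = iconv_open(to, from);   /* target encoding comes first */
    if (cd == (iconv_t)-1)
        return -1;

    for (;;) {
        size_t n = fread(inbuf + pending, 1, sizeof inbuf - pending, in);
        int at_eof = (n == 0);
        char *src = inbuf;
        size_t inleft = pending + n;

        if (at_eof && ferror(in))
            goto done;           /* read error rather than end of file */

        while (inleft > 0) {
            char *dst = outbuf;
            size_t outleft = sizeof outbuf;
            size_t r = iconv(cd, &src, &inleft, &dst, &outleft);
            int err = errno;     /* save before fwrite can change it */
            size_t produced = sizeof outbuf - outleft;

            if (fwrite(outbuf, 1, produced, out) != produced)
                goto done;
            if (r != (size_t)-1 || err == E2BIG)
                continue;        /* finished, or output buffer was full */
            if (err == EINVAL && !at_eof)
                break;           /* character split across reads: read more */
            goto done;           /* EILSEQ, or truncated input at EOF */
        }
        memmove(inbuf, src, inleft);   /* keep the unconverted tail */
        pending = inleft;
        if (at_eof)
            break;
    }

    /* Flush any final shift sequence (matters for stateful encodings). */
    {
        char *dst = outbuf;
        size_t outleft = sizeof outbuf;
        if (iconv(cd, NULL, NULL, &dst, &outleft) != (size_t)-1) {
            size_t produced = sizeof outbuf - outleft;
            if (fwrite(outbuf, 1, produced, out) == produced)
                rc = 0;
        }
    }
done:
    iconv_close(cd);
    return rc;
}
```

Called as convert_stream("UTF-8", "GB2312", in, out) with two open FILE pointers, this should behave much like the command above, assuming the system's iconv knows both encoding names. Unlike iconv -c, it stops at the first invalid byte sequence instead of skipping it.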
I also looked around online and learned that on Linux the iconv family of functions can be used to convert encodings from within a program.
The iconv function family is declared in the header file iconv.h, which must be included before use:

    #include <iconv.h>

The family has three functions, with the following prototypes:

(1) iconv_t iconv_open(const char *tocode, const char *fromcode);
This function specifies which two encodings to convert between: tocode is the target encoding and fromcode is the source encoding. It returns a conversion handle used by the other two functions, or (iconv_t)-1 on failure.

(2) size_t iconv(iconv_t cd, char **inbuf, size_t *inbytesleft, char **outbuf, size_t *outbytesleft);
This function reads characters from inbuf, converts them, and writes the result to outbuf. inbytesleft holds the number of input bytes not yet converted, and outbytesleft holds the space remaining in the output buffer, also in bytes. On failure it returns (size_t)-1.

(3) int iconv_close(iconv_t cd);
This function closes the conversion handle and releases its resources.

Example 1: a conversion example implemented in C

    /* f.c: code conversion example in C */
    #include <iconv.h>
    #include <stdio.h>
    #include <string.h>

    #define OUTLEN 255

    int u2g(char *inbuf, size_t inlen, char *outbuf, size_t outlen);
    int g2u(char *inbuf, size_t inlen, char *outbuf, size_t outlen);

    int main(void)
    {
        /* Sample text. The original literals did not survive extraction;
           the bytes below assume the text was "installing" (正在安装),
           first encoded in UTF-8 and then in GB2312. */
        char *in_utf8 = "\xE6\xAD\xA3\xE5\x9C\xA8\xE5\xAE\x89\xE8\xA3\x85";
        char *in_gb2312 = "\xD5\xFD\xD4\xDA\xB0\xB2\xD7\xB0";
        char out[OUTLEN];
        int rc;

        /* Convert UTF-8 to GB2312 */
        rc = u2g(in_utf8, strlen(in_utf8), out, OUTLEN);
        if (rc == 0)
            printf("UTF-8 --> GB2312 out=%s\n", out);

        /* Convert GB2312 to UTF-8 */
        rc = g2u(in_gb2312, strlen(in_gb2312), out, OUTLEN);
        if (rc == 0)
            printf("GB2312 --> UTF-8 out=%s\n", out);
        return 0;
    }

    /* Code conversion: convert from one encoding to another.
       Returns 0 on success, -1 on failure. */
    int code_convert(const char *from_charset, const char *to_charset,
                     char *inbuf, size_t inlen, char *outbuf, size_t outlen)
    {
        iconv_t cd;
        size_t rc;
        char **pin = &inbuf;
        char **pout = &outbuf;

        if (outlen == 0)
            return -1;
        cd = iconv_open(to_charset, from_charset);
        if (cd == (iconv_t)-1)          /* iconv_open signals failure with -1, not 0 */
            return -1;
        memset(outbuf, 0, outlen);
        outlen--;                       /* keep the last byte as the terminating '\0' */
        rc = iconv(cd, pin, &inlen, pout, &outlen);
        iconv_close(cd);                /* close the handle on every path */
        return rc == (size_t)-1 ? -1 : 0;
    }

    /* Convert UTF-8 to GB2312 */
    int u2g(char *inbuf, size_t inlen, char *outbuf, size_t outlen)
    {
        return code_convert("UTF-8", "GB2312", inbuf, inlen, outbuf, outlen);
    }

    /* Convert GB2312 to UTF-8 */
    int g2u(char *inbuf, size_t inlen, char *outbuf, size_t outlen)
    {
        return code_convert("GB2312", "UTF-8", inbuf, inlen, outbuf, outlen);
    }

Example 2: a conversion example implemented in C++

    /* f.cpp: code conversion example in C++ */
    #include <iconv.h>
    #include <cstring>
    #include <iostream>

    #define OUTLEN 255

    using namespace std;

    // Code conversion class
    class CodeConverter {
    private:
        iconv_t cd;
        // The handle must be closed only once, so copying is disabled
        CodeConverter(const CodeConverter &);
        CodeConverter &operator=(const CodeConverter &);
    public:
        // Constructor
        CodeConverter(const char *from_charset, const char *to_charset) {
            cd = iconv_open(to_charset, from_charset);
        }
        // Destructor
        ~CodeConverter() {
            if (cd != (iconv_t)-1)
                iconv_close(cd);
        }
        // Convert; returns 0 on success, -1 on failure
        int convert(char *inbuf, size_t inlen, char *outbuf, size_t outlen) {
            if (cd == (iconv_t)-1 || outlen == 0)
                return -1;
            memset(outbuf, 0, outlen);
            outlen--;                   // keep the last byte as the terminating '\0'
            size_t rc = iconv(cd, &inbuf, &inlen, &outbuf, &outlen);
            return rc == (size_t)-1 ? -1 : 0;
        }
    };

    int main()
    {
        // Same assumed sample text as in Example 1
        char in_utf8[] = "\xE6\xAD\xA3\xE5\x9C\xA8\xE5\xAE\x89\xE8\xA3\x85";
        char in_gb2312[] = "\xD5\xFD\xD4\xDA\xB0\xB2\xD7\xB0";
        char out[OUTLEN];

        // UTF-8 --> GB2312
        CodeConverter cc("UTF-8", "GB2312");
        cc.convert(in_utf8, strlen(in_utf8), out, OUTLEN);
        cout << "UTF-8 --> GB2312 in=" << in_utf8 << ", out=" << out << endl;

        // GB2312 --> UTF-8
        CodeConverter cc2("GB2312", "UTF-8");
        cc2.convert(in_gb2312, strlen(in_gb2312), out, OUTLEN);
        cout << "GB2312 --> UTF-8 in=" << in_gb2312 << ", out=" << out << endl;
        return 0;
    }