Qt Chinese garbled characters
As a Linux beginner, I had just moved on to Qt programming when my very first demo program displayed garbled Chinese characters. It had me sweating!
Environment:
1. Red Hat AS 5
2. Qt 4.4.0
3. LANG="zh_CN.GB18030"
Program:
...
QTextCodec::setCodecForTr(QTextCodec::codecForName("gb18030"));
...
label.setText(QObject::tr("the same world, the same dream!"));
...
Result:
The title is displayed normally. Chinese characters of labels and buttons are garbled.
After searching online for three or four days, I collected the following suggestions:
1. Use setDefaultCodec:
qApp->setDefaultCodec(QTextCodec::codecForName("GBK"));
QLabel *label = new QLabel(tr("Chinese label"));
Unfortunately, setDefaultCodec is a Qt 3 function that Qt 4 no longer supports.
2. Set the encoding used by QObject's member function tr():
QTextCodec::setCodecForTr(QTextCodec::codecForName("GBK"));
3. Use QString's fromLocal8Bit() function:
QString str;
str = str.fromLocal8Bit("HAHAHA");
hello.setWindowTitle(str);
4. Use QTextCodec's toUnicode() method to display Chinese characters:
QLabel hello(QObject::tr("hello").toLocal8Bit());
QTextCodec *codec = QTextCodec::codecForLocale();
QString a = codec->toUnicode("Anshi manual");
None of these solved the problem.
I kept searching and finally found the real answer in a blog post: the font.
"A strange thing I noticed in Qt 4 is that the dialog's title, "I Am a dialog box", is displayed correctly, while the button text shows up as small squares. That suggests a system setting is wrong rather than the character encoding: an encoding problem would produce garbled text, not small squares. Then I remembered that fonts had been mentioned online. That would explain it: the title bar and the button text do not use the same font, and the button's font cannot display the characters, so small squares appear instead of garbled text.
I searched the Internet for the default font used by Qt 4 but did not find it. I also looked through the Qt 4 source code, but there was too much code and I did not find it there either.
However, the setFont function caught my attention; I had seen this method mentioned online."
Following the above, I modified the code:
...
QTextCodec::setCodecForTr(QTextCodec::codecForName("gb18030"));
QFont font("Times", 12, QFont::Normal, false);
app.setFont(font);
...
label.setText(QObject::tr("the same world, the same dream!"));
...
Success! Chinese text is displayed correctly!
Solving this problem mattered a great deal to me. To be honest, it was exhausting, but fortunately I didn't give up. Yeah!
Note: in Qt Designer the interface was still garbled. Using qtconfig to set the font to Bitstream Charter fixed that. At that point I wondered whether the original program would now work too. To my surprise, it ran without any garbled text. Oh my God!
Source: http://www.linuxdiyf.com/viewarticle.php?id=97025
Other solutions:
If QTextCodec::codecForName("GBK") returns a null pointer, use the codec of the current locale instead:
QTextCodec *bianma = QTextCodec::codecForLocale(); // instead of QTextCodec::codecForName("GBK");
Qt Chinese display
In Qt, QTextCodec can be used to convert the encoding of a string, which makes it easier to develop Chinese software with Qt. However, this approach does not follow internationalization/localization conventions:
Code:
const char *string = "Hello, world!";
QTextCodec *codec = QTextCodec::codecForName("GBK");
// QTextCodec *codec = QTextCodec::codecForName("Big5");
QString strText = codec->toUnicode(string);
QLabel *label = new QLabel(strText);
The most direct method is to set the encoding of the whole application to GBK and then wrap each string in tr():
Code:
qApp->setDefaultCodec(QTextCodec::codecForName("GBK"));
...
QLabel *label = new QLabel(tr("Hello, world!"));
Example:
Code:
#include <qapplication.h>
#include <qlabel.h>
#include <qtextcodec.h> // codec header
int main(int argc, char *argv[])
{
    QApplication app(argc, argv);
    app.setDefaultCodec(QTextCodec::codecForName("GBK"));
    // main() is not a QObject member, so tr() must be qualified
    QLabel label(QObject::tr("Hello, world!"), NULL);
    app.setMainWidget(&label);
    label.show();
    return app.exec();
}
If you want Qt to pick the character set from the locale environment variables, use the following call:
QString::fromLocal8Bit("Hello, world!");
Example:
Code:
#include <qapplication.h>
#include <qlabel.h>
int main(int argc, char *argv[])
{
    QApplication app(argc, argv);
    QLabel label(QString::fromLocal8Bit("Hello, world!"), NULL);
    app.setMainWidget(&label);
    label.show();
    return app.exec();
}
-------------------------------------------------------------------------------------------
Method 2:
QTextCodec::setCodecForTr(QTextCodec::codecForLocale());
-----------------------------------------------------------------------------------------
Method 3: write a helper function
QString init_gbk(QString s)
{
    QGbkCodec *gbk = (QGbkCodec *)QTextCodec::codecForName("GBK");
    return gbk->toUnicode(s.latin1(), s.length());
}
Then just call it wherever it is needed.
Reading Chinese files in Qt
Although the C++ standard library has classes for reading files, and they are very useful, they are inconvenient in Qt programs. Many Qt components work with Qt's own QString type, so using C++'s own string type requires conversions, which add complexity and overhead. If you develop with Qt, it is therefore better to use the types Qt provides throughout, which makes it easier to pass and share data within the program.
Qt is good, but you must pay attention to the encoding when processing Chinese or other non-ASCII text. Otherwise, what you read from a file may be garbled, or the program may even crash, which is not what we want. Let's look at how to read Chinese files with the Qt classes.
Introduction
We need several classes, declared in these header files:
#include <qstring.h>
#include <qfile.h>
#include <qtextstream.h>
#include <qtextcodec.h>
QString
The QString class provides an abstraction of Unicode text and of the classic C zero-terminated character array.
QString uses implicit sharing, which makes it very efficient and easy to use.
All QString methods that take a const char * parameter interpret it as a classic C-style zero-terminated ASCII string. It is legal for the const char * parameter to be 0. If the const char * is not zero-terminated, the results are undefined. Functions that copy classic C strings into a QString do not copy the terminating '\0' character. The QChar array of a QString (as returned by unicode()) is generally not zero-terminated. If you need to pass a QString to a function that requires a zero-terminated C string, use latin1().
A QString that has not been assigned anything is null, that is, both its length and its data pointer are 0. A QString that references the empty string ("", a single '\0' character) is empty. Both null and empty QStrings are legal parameters to the methods. Assigning (const char *)0 to a QString gives a null QString. For convenience, QString::null is a null QString. When sorting, empty strings come first, followed by non-empty strings, followed by null strings. We recommend using if (!str.isNull()) rather than if (!str) to check for a non-null string; see operator!() for an explanation.
Note that if you find yourself mixing QCString, QString, and QByteArray, this causes a lot of unnecessary copying and may indicate that the true nature of the data you are dealing with is unclear. If the data is zero-terminated 8-bit data, use QCString. If it is unterminated 8-bit data (that is, it may contain '\0' bytes), use QByteArray. If it is text, use QString.
You can use the QStringList class to handle lists of strings. QStringList::split() splits a string into a list of strings, and QStringList::join() joins a list of strings into a single string with an optional separator. QStringList::grep() selects from a list the strings that contain a particular substring or match a particular regular expression.
//////////////////////////////////////// //////////////////////////////////////// //////////////////////////////////////// ////////////////////////////////////////
QFile
The QFile class is an I/O device that operates on files.
QFile is an I/O device for reading and writing binary and text files. A QFile can be used on its own, but it is more convenient to use it with a QDataStream or QTextStream.
The file name is usually passed in the constructor, but it can also be set with setName(). You can check whether a file exists with exists() and remove it with remove().
A file is opened with open(), closed with close(), and flushed with flush(). Data is usually read and written through a QDataStream or QTextStream, but you can also read with readBlock() and readLine() and write with writeBlock(). QFile also supports getch(), ungetch(), and putch().
size() returns the size of the file. at() gets the current file position or moves to a new one. atEnd() returns true once the end of the file has been reached. handle() returns the file handle.
Here is a code fragment that uses QTextStream to read a text file one line at a time. It prints each line with a line number.
QStringList lines;
QFile file("file.txt");
if (file.open(IO_ReadOnly)) {
    QTextStream stream(&file);
    QString line;
    int n = 1;
    while (!stream.eof()) {
        line = stream.readLine(); // one line of text, without the "\n"
        printf("%3d: %s\n", n++, line.latin1());
        lines += line;
    }
    file.close();
}
Writing text is just as easy (assuming we have a string list of lines to write):
QFile file("file.txt");
if (file.open(IO_WriteOnly)) {
    QTextStream stream(&file);
    for (QStringList::Iterator it = lines.begin(); it != lines.end(); ++it)
        stream << *it << "\n";
    file.close();
}
//////////////////////////////////////// //////////////////////////////////////// //////////////////////////////////////// /////////
QTextStream
The QTextStream class provides basic functions for reading and writing text using a QIODevice.
The text stream class has a functional interface very similar to that of the standard C++ iostream class. The difference is that QTextStream operates on a QIODevice, which is easily subclassed, while iostream operates on FILE * pointers, which cannot be subclassed.
Qt provides several global functions similar to those in iostream:
bin: sets the QTextStream to read/write binary numbers
oct: sets the QTextStream to read/write octal numbers
dec: sets the QTextStream to read/write decimal numbers
hex: sets the QTextStream to read/write hexadecimal numbers
endl: forces a line break
flush: forces the QIODevice to flush any buffered data
ws: eats any available whitespace (on input)
reset: resets the QTextStream to its default mode (see reset())
qSetW(int): sets the field width to the given argument
qSetFill(int): sets the fill character to the given argument
qSetPrecision(int): sets the precision to the given argument
Warning: by default, QTextStream automatically detects whether numbers in the stream are in decimal, octal, hexadecimal, or binary format when reading from the stream. In particular, a leading "0" signifies octal, so the sequence "0100" is interpreted as 64.
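The octal trap in the warning above is easy to reproduce outside Qt. The following is only a sketch using the C++ standard library, not QTextStream itself: std::stol with base 0 sniffs the prefix in a similar way (it knows "0x" and a leading "0", but has no binary prefix). The helper names parse_with_prefix and parse_decimal are made up for this illustration.

```cpp
#include <string>

// Parse an integer and let the prefix choose the base:
// "0x..." is hexadecimal, a leading "0" is octal, anything else is decimal.
// Passing base 0 asks std::stol to detect the base from the prefix.
long parse_with_prefix(const std::string &text) {
    return std::stol(text, nullptr, 0);
}

// Always parse as decimal, no matter how many leading zeros there are.
long parse_decimal(const std::string &text) {
    return std::stol(text, nullptr, 10);
}
```

With these helpers, parse_with_prefix("0100") gives 64 while parse_decimal("0100") gives 100, so zero-padded numbers in a data file should be read with an explicit decimal base.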
QTextStream reads and writes text; it is not appropriate for binary data (QDataStream is).
By default, output of Unicode text (such as QString) is converted to the local 8-bit encoding. This can be changed with setEncoding(). For input, QTextStream automatically detects text files that start with the standard Unicode byte order mark (BOM); otherwise the local 8-bit encoding is used.
The QIODevice is set in the constructor, or later with setDevice(). atEnd() returns true when the end of the input has been reached. Data can be read into variables of the appropriate type with the overloaded operator>>(), read in its entirety into a single string with read(), or read one line at a time with readLine(). Whitespace can be skipped with skipWhiteSpace(). You can set flags for the stream with flags() or setf(). The stream also supports width(), precision(), and fill(); use reset() to restore the defaults.
See also QDataStream, Input/Output and Networking, and Text Related Classes.
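To make the byte order mark check mentioned above concrete, here is a minimal sketch in standard C++. It is not Qt's implementation; the Bom enum and the detect_bom function are names invented for this example, and UTF-32 marks are deliberately not handled.

```cpp
#include <string>

// Which byte order mark, if any, the data starts with.
enum class Bom { None, Utf8, Utf16LE, Utf16BE };

// Look at the first bytes of a buffer and report which BOM is present.
// Only UTF-8 (EF BB BF) and UTF-16 (FF FE / FE FF) are recognized.
Bom detect_bom(const std::string &bytes) {
    if (bytes.size() >= 3 && bytes.compare(0, 3, "\xEF\xBB\xBF") == 0)
        return Bom::Utf8;
    if (bytes.size() >= 2 && bytes.compare(0, 2, "\xFF\xFE") == 0)
        return Bom::Utf16LE;
    if (bytes.size() >= 2 && bytes.compare(0, 2, "\xFE\xFF") == 0)
        return Bom::Utf16BE;
    return Bom::None;
}
```

A reader would call this on the first few bytes of a file and fall back to the local 8-bit encoding when the result is Bom::None.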
//////////////////////////////////////// //////////////////////////////////////// //////////////////////////////////////// /////////
QTextCodec
The QTextCodec class provides conversion between text encodings.
Qt uses Unicode to store, draw, and manipulate strings. In many situations you may need to handle data that uses a different encoding. For example, most Japanese documents are stored in Shift-JIS or ISO 2022-JP, while Russian users often use KOI8-R or CP1251. Qt provides a set of QTextCodec classes to convert between Unicode and these encodings.
//////////////////////////////////////// //////////////////////////////////////// //////////////////////////////////////// /////////
Code Section
#include <qstring.h>
#include <qfile.h>
#include <qtextstream.h>
#include <qtextcodec.h>
int main()
{
    QFile file("test.txt");
    if (file.open(IO_ReadOnly | IO_Raw))
    {
        QTextStream floStream(&file);
        QString line;
        // Decode the file contents as GBK
        QTextCodec *codec = QTextCodec::codecForName("GBK");
        floStream.setCodec(codec);
        while (floStream.atEnd() == 0)
        {
            // Re-encode each decoded line as GBK bytes for output
            line = codec->fromUnicode(floStream.readLine());
            qWarning(line);
        }
        file.close();
    }
    return 0;
}
The key changes in the code (highlighted in yellow in the original post) are the lines that create the GBK codec and set it on the stream.
They create a codec for the Chinese GBK encoding and decode the file stream with it, so that the Chinese text is output correctly. If you don't believe it, try it.
After adding the following two lines, Chinese characters can be displayed:
#include <qtextcodec.h>
QTextCodec::setCodecForTr(QTextCodec::codecForName("gb2312"));
Let's talk about Chinese encoding in Qt
Source: ChinaUnix blog. Date: 2009.07.04
Original article:
http://www.cuteqt.com/blog?p=531
Why can't Chinese characters be displayed? This problem comes up often when using Qt.
A Google search shows that others have already solved it: the fix is nothing more than resetting the default codec.
First, call one of the following two functions:
QTextCodec *textc = QTextCodec::codecForName("GBK");
QTextCodec *textc = QTextCodec::codecForName("utf8");
Then, call one of the following three functions:
QTextCodec::setCodecForCStrings(textc);
QTextCodec::setCodecForTr(textc);
QTextCodec::setCodecForLocale(textc);
If you don't know which one to call, you just try them; there are not many combinations, and eventually the problem goes away. But next time you have to try all over again.
Next let's look at what each of these functions actually affects.
1. setCodecForCStrings(textc)
This function mainly affects how a QString is constructed from a character constant or a QByteArray. Consider the following example:
23 int main(int argc, char *argv[]) {
24     QApplication app(argc, argv);
25
26     QTextCodec *tc = QTextCodec::codecForName("utf8");
27     QTextCodec::setCodecForCStrings(tc);
28     QString str("I'd rather see two demons in tug-of-war than see an angel dancing.");
29     QPushButton ww(str);
30     ww.show();
31
32     app.exec();
33 }
If you comment out lines 26 and 27, the QPushButton shows a string of garbled characters. By default, QString treats the string argument of its constructor as Latin-1 text, so Chinese text obviously comes out garbled. On Linux, text typed in an editor is usually saved as "utf8" by default; if you develop on Windows, line 26 may need to be changed to "GBK".
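Why does reading UTF-8 bytes as Latin-1 produce garbage? The sketch below imitates that mistake in plain C++ (no Qt involved; latin1_to_utf8 is a name made up for this example): every input byte is taken as one Latin-1 character and written back out as UTF-8, which is roughly what ends up on screen when the codec is wrong.

```cpp
#include <string>

// Treat every input byte as one Latin-1 character (U+0000..U+00FF)
// and re-encode it as UTF-8. Bytes below 0x80 stay as they are;
// bytes from 0x80 up become two-byte UTF-8 sequences.
std::string latin1_to_utf8(const std::string &bytes) {
    std::string out;
    for (unsigned char c : bytes) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

A single Chinese character stored as the three UTF-8 bytes E4 BD A0 comes out as three separate Latin-1 characters (six bytes after re-encoding), while plain ASCII passes through unchanged. That is why English labels look fine and only the Chinese text is garbled.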
2. setCodecForTr(textc)
This function sets the encoding assumed for strings passed to the tr() function.
23 int main(int argc, char *argv[]) {
24     QApplication app(argc, argv);
25
26     QTextCodec *tc = QTextCodec::codecForName("utf8");
27     QTextCodec::setCodecForCStrings(tc);
28     QString str(QObject::tr("I'd rather see two demons in tug-of-war than see an angel dancing."));
29     QPushButton ww(str);
30     ww.show();
31
32     app.exec();
33 }
This is the same example with tr() added on line 28. Run as is, it no longer displays Chinese characters correctly; the call on line 27 has to be changed to setCodecForTr.
3. QTextCodec::setCodecForLocale(textc)
This function sets the default encoding used when reading and writing local files, for example when a file is read through a stream. It also determines the encoding used when printing messages with qDebug().
int main(int argc, char *argv[]) {
    QTextCodec *tc = QTextCodec::codecForName("utf8");
    QTextCodec::setCodecForCStrings(tc);
    QString str("I'd rather see two demons in tug-of-war than see an angel dancing.");

    QTextCodec *tl = QTextCodec::codecForName("utf8");
    QTextCodec::setCodecForLocale(tl);
    qDebug()
Qt: converting between ANSI, Unicode, and UTF-8 strings and writing them to text files
From: http://www.blogjava.net/Yipak/articles/227015.html
ANSI strings are the ones we know best: an English letter takes one byte, a Chinese character takes two bytes, and the string ends with a single \0. They are commonly used in TXT files.
Unicode strings: each character (Chinese character or English letter) takes two bytes, and the string ends with two consecutive \0 bytes. This is the string type used by the NT kernel. It is usually declared as typedef unsigned short wchar_t; so errors such as "cannot convert char * to unsigned short *" actually refer to Unicode strings.
UTF-8 is a compact encoding of Unicode. The English letter A is 0x0041 in Unicode; Westerners found this wasteful for English text, since half of the space is unused, so English letters were compressed to one byte each, giving UTF-8. Chinese characters, however, take three bytes in UTF-8, which is clearly less economical for Chinese than ANSI. This is why Chinese web pages commonly use ANSI encoding while foreign web pages commonly use UTF-8.
UTF-8 is widely used in games, for example in WoW's Lua scripts.
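The byte counts described above can be checked with a small encoder. This is a sketch in standard C++ rather than anything from the original post; utf8_encode is a name chosen for the example, and it does not validate its input (surrogates and out-of-range values are not rejected).

```cpp
#include <string>

// Encode one Unicode code point as UTF-8.
// ASCII takes 1 byte, most Chinese characters (U+0800..U+FFFF) take 3.
std::string utf8_encode(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, U+0041 (the letter A) encodes to the single byte 41, while U+4F60 encodes to the three bytes E4 BD A0.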
Next, let's look at the conversions, mostly through code.
I use the MFC CFile class to write the files; FILE * works the same way. Writing a file has nothing to do with the kind of string: the hardware only cares about the data and its length.
ANSI to Unicode
Two methods are shown.
void CConvertDlg::OnBnClickedButtonAnsiToUnicode()
{
    // ANSI to Unicode
    char *szAnsi = "abcd1234 you and me";
    // First call: get the required buffer size (in wide characters)
    int wcsLen = ::MultiByteToWideChar(CP_ACP, NULL, szAnsi, strlen(szAnsi), NULL, 0);
    // Allocate one extra element for '\0'; MultiByteToWideChar does not add it here
    wchar_t *wszString = new wchar_t[wcsLen + 1];
    // Convert
    ::MultiByteToWideChar(CP_ACP, NULL, szAnsi, strlen(szAnsi), wszString, wcsLen);
    // Append the terminating '\0'
    wszString[wcsLen] = '\0';
    // Unicode version of the MessageBox API
    ::MessageBoxW(GetSafeHwnd(), wszString, wszString, MB_OK);
    // Next, write a text file
    // The file starts with the BOM 0xFEFF, written low byte (0xFF) first
    CFile cFile;
    cFile.Open(_T("1.txt"), CFile::modeWrite | CFile::modeCreate);
    // Go to the start of the file
    cFile.SeekToBegin();
    cFile.Write("\xff\xfe", 2);
    // Write the content
    cFile.Write(wszString, wcsLen * sizeof(wchar_t));
    cFile.Flush();
    cFile.Close();
    delete[] wszString;
    wszString = NULL;
    // Method 2
    // Set the current locale; without this, Chinese characters are not converted correctly
    // Requires #include <locale.h>
    setlocale(LC_CTYPE, "chs");
    wchar_t wcsStr[100];
    // Note the uppercase S below: in the Unicode functions, %S means the argument is an ANSI string
    // swprintf is the Unicode version of sprintf
    // The format string needs the L prefix, which marks it as Unicode
    swprintf(wcsStr, L"%S", szAnsi);
    ::MessageBoxW(GetSafeHwnd(), wcsStr, wcsStr, MB_OK);
}
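The function above writes the wchar_t buffer straight to disk, which gives little-endian UTF-16 on Windows. As a portable illustration of the same file layout (BOM first, then each 16-bit unit low byte first), here is a sketch in standard C++. It is not part of the original MFC code, and utf16le_with_bom is a name invented for this example.

```cpp
#include <string>

// Serialize UTF-16 code units as little-endian bytes, preceded by the
// BOM FF FE, which is the byte layout of a "Unicode" text file on Windows.
std::string utf16le_with_bom(const std::u16string &text) {
    std::string out("\xFF\xFE", 2);                // BOM 0xFEFF, low byte first
    for (char16_t unit : text) {
        out += static_cast<char>(unit & 0xFF);     // low byte
        out += static_cast<char>(unit >> 8);       // high byte
    }
    return out;
}
```

The resulting byte string can be written to a file opened in binary mode. The letter A becomes FF FE 41 00, which shows the extra zero byte that every English letter costs in this format.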
Unicode to ANSI
There are also two methods.
void CConvertDlg::OnBnClickedButtonUnicodeToAnsi()
{
    // Unicode to ANSI
    wchar_t *wszString = L"abcd1234 you and me";
    // First call: get the required buffer size. The function name mirrors the one used above
    int ansiLen = ::WideCharToMultiByte(CP_ACP, NULL, wszString, wcslen(wszString), NULL, 0, NULL, NULL);
    // As above, allocate one extra byte for '\0'
    char *szAnsi = new char[ansiLen + 1];
    // Convert
    // wcslen is the Unicode counterpart of strlen
    ::WideCharToMultiByte(CP_ACP, NULL, wszString, wcslen(wszString), szAnsi, ansiLen, NULL, NULL);
    // Append the terminating '\0'
    szAnsi[ansiLen] = '\0';
    // ANSI version of the MessageBox API
    ::MessageBoxA(GetSafeHwnd(), szAnsi, szAnsi, MB_OK);
    // Next, write a text file
    // An ANSI text file has no BOM
    CFile cFile;
    cFile.Open(_T("1.txt"), CFile::modeWrite | CFile::modeCreate);
    // Go to the start of the file
    cFile.SeekToBegin();
    // Write the content
    cFile.Write(szAnsi, ansiLen * sizeof(char));
    cFile.Flush();
    cFile.Close();
    delete[] szAnsi;
    szAnsi = NULL;
    // Method 2
    // Same approach as in the previous function
    setlocale(LC_CTYPE, "chs");
    char szStr[100];
    // Note the uppercase S below: in the ANSI functions, %S means the argument is a Unicode string
    sprintf(szStr, "%S", wszString);
    ::MessageBoxA(GetSafeHwnd(), szStr, szStr, MB_OK);
}
Unicode to UTF-8
void CConvertDlg::OnBnClickedButtonUnicodeToU8()
{
    // Unicode to UTF-8
    wchar_t *wszString = L"abcd1234 you and me";
    // First call: get the required buffer size. The function name mirrors the one used above
    int u8Len = ::WideCharToMultiByte(CP_UTF8, NULL, wszString, wcslen(wszString), NULL, 0, NULL, NULL);
    // As above, allocate one extra byte for '\0'
    // Although UTF-8 is a compact Unicode encoding, it is still a multi-byte string, so it is stored as char
    char *szU8 = new char[u8Len + 1];
    // Convert
    // wcslen is the Unicode counterpart of strlen
    ::WideCharToMultiByte(CP_UTF8, NULL, wszString, wcslen(wszString), szU8, u8Len, NULL, NULL);
    // Append the terminating '\0'
    szU8[u8Len] = '\0';
    // MessageBox does not support UTF-8, so the result can only be written to a file
    // Next, write a text file
    // The UTF-8 BOM is 0xBFBBEF, that is, the bytes EF BB BF in file order
    CFile cFile;
    cFile.Open(_T("1.txt"), CFile::modeWrite | CFile::modeCreate);
    // Go to the start of the file
    cFile.SeekToBegin();
    // Write the BOM, low byte first as before
    cFile.Write("\xef\xbb\xbf", 3);
    // Write the content
    cFile.Write(szU8, u8Len * sizeof(char));
    cFile.Flush();
    cFile.Close();
    delete[] szU8;
    szU8 = NULL;
}
UTF-8 to Unicode
void CConvertDlg::OnBnClickedButtonU8ToUnicode()
{
    // UTF-8 to Unicode
    // Pasting the Chinese characters in directly yields garbled bytes and sometimes compiler errors,
    // so hexadecimal escapes are used instead
    char *szU8 = "abcd1234\xe4\xbd\xa0\xe6\x88\x91\xe4\xbb\x96\x00";
    // First call: get the required buffer size (in wide characters)
    int wcsLen = ::MultiByteToWideChar(CP_UTF8, NULL, szU8, strlen(szU8), NULL, 0);
    // Allocate one extra element for '\0'; MultiByteToWideChar does not add it here
    wchar_t *wszString = new wchar_t[wcsLen + 1];
    // Convert
    ::MultiByteToWideChar(CP_UTF8, NULL, szU8, strlen(szU8), wszString, wcsLen);
    // Append the terminating '\0'
    wszString[wcsLen] = '\0';
    // Unicode version of the MessageBox API
    ::MessageBoxW(GetSafeHwnd(), wszString, wszString, MB_OK);
    // Writing the text file is the same as in ANSI to Unicode
}
ANSI to UTF-8 and UTF-8 to ANSI are combinations of the two conversions above: use Unicode as the intermediate form and convert twice.
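For readers without the Windows API at hand, the first half of such a two-step conversion (UTF-8 bytes to Unicode code points) can be sketched in standard C++. This is only an illustration, not what MultiByteToWideChar does internally: utf8_decode is a made-up name, and the sketch assumes well-formed input and does no error checking.

```cpp
#include <cstddef>
#include <string>

// Decode a well-formed UTF-8 byte string into Unicode code points.
// The lead byte tells how many continuation bytes follow; each
// continuation byte contributes its low 6 bits.
std::u32string utf8_decode(const std::string &bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp;
        std::size_t extra;
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        ++i;
        for (std::size_t k = 0; k < extra && i < bytes.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        }
        out += cp;
    }
    return out;
}
```

Applied to the nine escaped bytes in the function above (E4 BD A0, E6 88 91, E4 BB 96), it yields the three code points U+4F60, U+6211, and U+4ED6. From there the code points can be re-encoded into whatever target encoding is needed.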
This article is from the CSDN blog. When reposting, please credit the source: http://blog.csdn.net/wangqis/archive/2009/09/22/4577712.aspx