The default encoding is UTF-8. After importing a GBK project, I changed the encoding directly to ISO-8859-1, but the Chinese text was still garbled.
Solutions found online:
Global encoding settings: from the menu bar, go to Window-->Preferences-->General-->Workspace-->Text file encoding and set the appropriate encoding.
Local encoding settings: right-click in the source, then General-->Editors-->Text Editors-->Spelling-->Encoding; this is where the encoding of an individual file is set.
After the global encoding was modified, the individual page still had not changed.
Then I right-clicked and found that there was no GBK option ...
In that case, type GBK into the text box by hand and click Apply; the page encoding then becomes GBK.
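To illustrate why switching to ISO-8859-1 did not help, here is a minimal Python sketch (not part of MyEclipse; the helper name `try_decode` is made up for this example). It takes the GBK bytes of the two characters 中文 and decodes them with each of the three encodings mentioned above.

```python
# GBK bytes for the two characters 中文, as they would sit in a GBK source file.
raw = "中文".encode("gbk")


def try_decode(data: bytes, encoding: str) -> str:
    """Decode data with the given encoding, or describe the failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"<decode error: {exc.reason}>"


# UTF-8 rejects the bytes, ISO-8859-1 accepts them but shows Latin letters,
# and only GBK gives back the original text.
for name in ("utf-8", "iso-8859-1", "gbk"):
    print(f"{name:>10}: {try_decode(raw, name)}")
```

Only the editor setting that matches the encoding the file was actually saved in displays the text correctly, which is why GBK has to be entered by hand when it is missing from the list.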
Appendix: What is the difference between Unicode, UTF-8, and ISO-8859-1?
(This part is reposted from http://blog.csdn.net/xiongchao2011/article/details/7276834)
Take the two characters 中文 ("Chinese") as an example. Looking them up in the code tables shows that their GB2312 code is "D6D0 CEC4", their Unicode code is "4E2D 6587", and their UTF (UTF-8) code is "E4B8AD E69687". Note that these two characters have no ISO-8859-1 encoding, but they can be "represented" using ISO-8859-1.
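The table lookup can be reproduced with a short Python sketch. The helper `hex_bytes` is a name invented for this example, and `utf-16-be` is used here as the byte form of the two-byte Unicode code the text refers to.

```python
text = "中文"


def hex_bytes(s: str, encoding: str) -> str:
    """Return the encoded bytes of s as upper-case, space-separated hex."""
    return s.encode(encoding).hex(" ").upper()


# Print the byte sequence each encoding produces for the same two characters.
for encoding in ("gb2312", "utf-16-be", "utf-8"):
    print(f"{encoding:>10}: {hex_bytes(text, encoding)}")
```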
2. Basic knowledge of encoding
One of the earliest encodings is ISO-8859-1, which is similar to ASCII. To make it easier to represent a variety of languages, however, a number of other standard encodings appeared; the important ones are described below.
2.1. ISO-8859-1, usually called Latin-1.
It is a single-byte encoding, so the largest range it can represent is 0-255, and it is used for English and other Western European languages. For example, the letter a is encoded as 0x61 = 97.
Clearly, ISO-8859-1 covers a narrow range of characters and cannot represent Chinese characters. However, because it is a single-byte encoding, which matches the computer's most basic unit of representation, it is still used in many situations, and many protocols use it by default. For example, although the two characters 中文 ("Chinese") have no ISO-8859-1 encoding, take their GB2312 encoding: it should be the two characters "D6D0 CEC4", but when ISO-8859-1 is used they are split into 4 bytes, "D6 D0 CE C4" (in fact, storage is also handled byte by byte). With UTF encoding they become 6 bytes, "E4 B8 AD E6 96 87". Obviously, this kind of representation has to be built on top of another encoding.
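A small Python sketch of this "representation on top of another encoding" follows; the variable names are illustrative only. Every byte value 0-255 is a valid ISO-8859-1 character, so GB2312 bytes can pass through an ISO-8859-1 string and come back unchanged.

```python
# Encode 中文 with GB2312: two characters, four bytes.
gb_bytes = "中文".encode("gb2312")

# View those bytes as ISO-8859-1: four single-byte characters, not Chinese text.
as_latin1 = gb_bytes.decode("iso-8859-1")
print(len(as_latin1), repr(as_latin1))

# Reverse the step: back to the same bytes, then decode with the real encoding.
restored = as_latin1.encode("iso-8859-1").decode("gb2312")
print(restored)
```

The ISO-8859-1 string itself is meaningless to a reader; it only carries the bytes, and the receiver still has to know that the underlying encoding is GB2312.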
2.2. GB2312/GBK
This is the Chinese national standard (GB) code, designed specifically to represent Chinese characters. Chinese characters take two bytes each, while English letters are encoded the same way as in ISO-8859-1 (single bytes in the ASCII range), so it is compatible with that part of ISO-8859-1. GBK can represent both traditional and simplified characters, whereas GB2312 can only represent simplified characters; GBK is compatible with GB2312.
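The relationship between the two can be checked with a minimal Python sketch. The helper `can_encode` and the sample strings are assumptions made for this example, and the result depends on Python's built-in `gb2312` and `gbk` codecs.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


simplified = "汉字"
traditional = "漢字"

for sample in (simplified, traditional):
    for encoding in ("gb2312", "gbk"):
        print(sample, encoding, can_encode(sample, encoding))

# Simplified text produces identical bytes in both encodings,
# and English letters stay single bytes.
print("汉字".encode("gb2312") == "汉字".encode("gbk"))
print("abc".encode("gbk"))
```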
2.3. Unicode
This is the most unified encoding: it can represent the characters of all languages. In the form discussed here it is a fixed-length double-byte encoding (four-byte forms also exist), and that includes English letters. It is therefore not compatible with ISO-8859-1, nor with any other encoding. Compared with ISO-8859-1, however, the Unicode encoding simply adds a 0 byte in front; for example, the letter a is "00 61".
Note that fixed-length encodings are convenient for computers to process (GB2312/GBK are not fixed-length), and Unicode can represent all characters, so many programs, Java for example, use Unicode internally.
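The two-byte form can be inspected with a short Python sketch. It assumes `utf-16-be` as the concrete byte layout, and `utf16_hex` is a name invented for this example. The last line shows one of the four-byte cases mentioned above, a character outside the two-byte range.

```python
def utf16_hex(ch: str) -> str:
    """Return the big-endian UTF-16 bytes of ch as upper-case hex."""
    return ch.encode("utf-16-be").hex(" ").upper()


print(utf16_hex("a"))    # the ISO-8859-1 byte 61 with a 00 byte in front
print(utf16_hex("中"))   # one Chinese character, also two bytes

# U+1F600 lies outside the two-byte range and needs four bytes.
print(len("\U0001F600".encode("utf-16-be")))
```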
2.4. UTF
Because the Unicode encoding is incompatible with ISO-8859-1 and tends to take up more space (it needs two bytes even for English letters), it is inconvenient for transmission and storage. This led to the UTF encoding. UTF is compatible with ISO-8859-1 for English (ASCII) characters and can also represent the characters of all languages. However, UTF is a variable-length encoding: in its original design each character takes from 1 to 6 bytes (the current UTF-8 standard limits this to 4). In addition, UTF has a simple built-in check function. In general, English letters take one byte, while Chinese characters take three bytes.
Note that UTF saves space only relative to the Unicode encoding; if you already know the text is Chinese, GB2312/GBK is undoubtedly the most economical. On the other hand, although UTF uses 3 bytes per Chinese character, even for Chinese web pages UTF takes less space than Unicode, because such pages contain many English characters.
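The size comparison can be made concrete with a minimal Python sketch. The sample `page` string and the helper `size` are invented for this example, and `utf-16-be` again stands in for the two-byte Unicode encoding.

```python
def size(text: str, encoding: str) -> int:
    """Return the number of bytes text occupies in the given encoding."""
    return len(text.encode(encoding))


# A tiny stand-in for a Chinese web page: mostly ASCII markup, little Chinese.
page = "<html><body><p>中文</p></body></html>"
chinese_only = "中文"

for label, sample in (("page", page), ("chinese only", chinese_only)):
    for encoding in ("gbk", "utf-8", "utf-16-be"):
        print(f"{label:>12} {encoding:>10}: {size(sample, encoding)} bytes")
```

For the markup-heavy page, GBK is the smallest, UTF-8 is close behind, and the two-byte Unicode form is nearly twice as large; for the Chinese-only string, UTF-8 is the largest of the three.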
MyEclipse Chinese encoding error, no GBK option