As the previous section put it, for a web page to display correctly in the browser, the same encoding must be used in three places: the web page file itself, the encoding declaration inside the page, and the browser's encoding setting.
The first is the encoding of the page file itself, that is, the encoding used to save the file when it is created. This depends on what the author's editor saves with, which in turn depends on the operating system the author is using. For example, on the Chinese version of Windows XP, when you create a text file, type something, and press Ctrl+S, the operating system saves the file in GBK (not UTF-8, and not UTF-16). On an English system the file is saved in ISO-8859-1 (strictly speaking, English Windows uses Windows-1252, a close superset of it), which means that Chinese characters entered into a file on an English system cannot be saved (and of course, you usually cannot even type them).
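To make the point concrete, here is a minimal Java sketch of what "saving with an encoding" means at the byte level. The class and method names (SaveEncodingDemo, savedLength, canSave) are made up for illustration, and it assumes the running JDK ships the GBK charset, which full JDK distributions normally do. The sample text is 中文, written with \u escapes so the source file's own encoding does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class SaveEncodingDemo {
    // Number of bytes the text occupies when "saved" with the given charset.
    static int savedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // True only if every character of the text can be represented in the charset.
    static boolean canSave(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters ("zhong wen")
        Charset gbk = Charset.forName("GBK"); // assumed to be available in this JDK

        // What this machine would pick by default depends on the OS and JDK settings.
        System.out.println("platform default: " + Charset.defaultCharset());

        // The same text becomes different bytes under different encodings.
        System.out.println("GBK bytes:   " + savedLength(text, gbk));
        System.out.println("UTF-8 bytes: " + savedLength(text, StandardCharsets.UTF_8));

        // ISO-8859-1 has no code points for Chinese, so the text cannot be saved faithfully.
        System.out.println("ISO-8859-1 can save it: "
                + canSave(text, StandardCharsets.ISO_8859_1));
    }
}
```

The first line of output varies from machine to machine, which is exactly the dependence on the operating system described above.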
A common misconception arises when creating an XML file (it is rarer with HTML): people assume that as long as UTF-8 is declared in the encoding attribute, the file will be saved as UTF-8. This is really... how to put it, nobody can be blamed for it. In fact, the encoding attribute of an XML file plays the same role as the charset in an HTML file: it merely tells "others" which encoding the file uses. The "other" may be a person browsing your page, a browser, or a program that processes your page, and they need to be told, because the encoding of a file cannot be determined with certainty from its contents alone. The operating system, however, pays no attention to the declaration and still saves the file in its default encoding (again, GBK on our Chinese Windows XP). So was the document really saved in the encoding named by encoding or charset? Not necessarily!
For example, Sina's pages "claim" to be saved in GB2312, but in fact they are GBK. And countless half-baked programmers save their XML files in the system default GBK while the encoding declaration swears the file is UTF-8.
This is the second place we mentioned: the encoding named in the page's encoding declaration should match the encoding actually used when the file was saved.
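The gap between what a file declares and how it was actually saved can be reproduced in a few lines of Java. The sketch below is only an illustration: the class DeclarationVsBytes and its methods are hypothetical names, the XML snippet is invented, and GBK is assumed to be available in the JDK. It "saves" an XML document that declares UTF-8 using GBK, then checks with a strict decoder which encoding the bytes really are.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class DeclarationVsBytes {
    // An XML document that *declares* UTF-8; the body holds two Chinese characters.
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?><title>\u4e2d\u6587</title>";

    // Strict check: true only if the bytes decode under the charset without any error.
    static boolean isValid(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Simulates an editor that ignores the declaration and saves in GBK.
    static byte[] saveAsGbk() {
        return XML.getBytes(Charset.forName("GBK"));
    }

    public static void main(String[] args) {
        byte[] saved = saveAsGbk();
        System.out.println("declared: UTF-8");
        System.out.println("valid as UTF-8: " + isValid(saved, StandardCharsets.UTF_8));
        System.out.println("valid as GBK:   " + isValid(saved, Charset.forName("GBK")));
    }
}
```

Writing encoding="UTF-8" into the text changed nothing about the bytes. Only the charset passed to getBytes decides how the file is stored.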
The browser's encoding setting is actually not that strict. As we said in the third section, if you choose GB2312 in the browser, it will in fact still use GBK. Browsers also have a good habit: they try to guess which encoding is most appropriate for the page.
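Treating GB2312 as GBK is harmless because GBK is a superset of GB2312. The following Java sketch illustrates only that relationship, not how any particular browser is implemented. The class Gb2312VersusGbk and its methods are hypothetical names, and it assumes the JDK provides both the GB2312 and GBK charsets. The character \u9555 (镕) is used as an example of a character that GBK covers and GB2312 does not.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

final class Gb2312VersusGbk {
    // Both charsets are assumed to be present in the running JDK.
    static final Charset GB2312 = Charset.forName("GB2312");
    static final Charset GBK = Charset.forName("GBK");

    // True if the text is stored as exactly the same bytes under both encodings.
    static boolean sameBytesInBoth(String text) {
        return Arrays.equals(text.getBytes(GB2312), text.getBytes(GBK));
    }

    // True if the charset has a code for the character.
    static boolean encodable(char c, Charset charset) {
        return charset.newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        // Common characters: identical bytes, so reading GB2312 text as GBK loses nothing.
        System.out.println("same bytes: " + sameBytesInBoth("\u4e2d\u6587"));

        // A rarer character that only the larger GBK repertoire contains.
        char rare = '\u9555';
        System.out.println("in GB2312: " + encodable(rare, GB2312));
        System.out.println("in GBK:    " + encodable(rare, GBK));
    }
}
```

Going the other way is not safe: a page saved in GBK and read strictly as GB2312 could lose characters such as this one.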
I would like to reiterate: keeping the file's actual encoding consistent with the encoding declared in the page is excellent advice (well worth following; it makes life easier for others and for yourself). But even when the two disagree, the page can still be displayed correctly as long as the file's actual encoding matches the browser's encoding setting.
For example, take a page that is saved in GBK but declares itself to be UTF-8. When a browser opens it, you first see garbled text, because the page "tells" the browser to display it as UTF-8, and the browser respects that hint, so the result is a mess. Once the browser is manually switched to GBK, the page displays normally.
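The same experiment can be simulated in Java, with a decoder standing in for the browser. This is a rough sketch under stated assumptions: MojibakeDemo, pageBytes and view are invented names, the "page" is just the two characters 中文, and GBK is assumed to be available in the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    static final String ORIGINAL = "\u4e2d\u6587"; // the text the author typed

    // The page as it sits on disk: saved in GBK, whatever the declaration says.
    static byte[] pageBytes() {
        return ORIGINAL.getBytes(Charset.forName("GBK"));
    }

    // What a "browser" shows when it interprets the bytes with the given charset.
    static String view(byte[] bytes, Charset browserCharset) {
        return new String(bytes, browserCharset);
    }

    public static void main(String[] args) {
        byte[] page = pageBytes();

        // Browser trusts the declaration and decodes as UTF-8: garbled output.
        String trusted = view(page, StandardCharsets.UTF_8);
        System.out.println("as UTF-8 matches original: " + trusted.equals(ORIGINAL));

        // User switches the browser to GBK by hand: the text comes back.
        String manual = view(page, Charset.forName("GBK"));
        System.out.println("as GBK matches original:   " + manual.equals(ORIGINAL));
    }
}
```

In the UTF-8 case Java substitutes the replacement character U+FFFD for each byte sequence it cannot interpret, which is the programmatic equivalent of the garbled page.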
Having said so much in the four sections above, we now move on to character encoding in Java. You will find plenty that is interesting and plenty that makes you scratch your head, but once you have a good grasp of it you will be invincible (though still no match for Dongfang Bubai, the "Invincible East").