Exploring the principle behind garbled XML files (repost)

Source: Internet
Author: User

Consider a scenario in which an application reads an XML file and gets garbled text:

The XML file begins with the declaration <?xml version="1.0" encoding="utf-8"?>. If the file contains Chinese characters and is later modified and saved in a different encoding (such as Unicode or ANSI), the program that reads the newly saved configuration file sees garbled text and cannot read it normally.

You can verify this in the following ways:

(1) Drag and drop the XML file onto the IE browser; the file will not be rendered properly.

(2) Open the XML file in Visual Studio; it reports a format error when loading the file.
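The same check can also be made programmatically. The following is a minimal Python sketch that is not part of the original article; it assumes the standard library's xml.etree.ElementTree parser, and the <config> element and its text are made up for illustration.

```python
import xml.etree.ElementTree as ET

DECLARATION = '<?xml version="1.0" encoding="utf-8"?>'
BODY = "<config>中国</config>"  # hypothetical content containing Chinese characters


def is_well_formed(data: bytes) -> bool:
    """Return True if the bytes parse as XML, False if the parser rejects them."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Bytes actually saved as UTF-8 agree with the declaration.
saved_as_utf8 = (DECLARATION + BODY).encode("utf-8")
# Bytes saved as GB2312 contradict the UTF-8 declaration.
saved_as_gb2312 = (DECLARATION + BODY).encode("gb2312")

print(is_well_formed(saved_as_utf8))
print(is_well_formed(saved_as_gb2312))
```

Only the file whose bytes match its declared encoding is accepted; the other one is rejected by the parser, just as IE and Visual Studio reject it.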

See: http://blog.csdn.net/dinglang_2009/article/details/6895355

In daily development work we often use XML, which has long been a standard. It is used very widely, but that is not the focus of this article.

I believe everyone ran into the "garbled text" problem when starting out; it is a real headache for Chinese programmers. I have always wanted to dig into the principles of character encoding, but my own level is limited, and the dry theory (binary, ASCII, Unicode, UTF-8, GB2312, ISO ...) makes my eyes glaze over; it is hard to truly understand. I welcome your pointers.

Here I take a simple "garbled XML file" problem that I ran into at work, solve it, and analyze the principle behind it.

First, create a new text file locally, change its extension to ".xml", open it with Notepad, and add some content that conforms to the XML specification, declaring the encoding as UTF-8 and including some Chinese text.

After writing, press Ctrl+S to save, then open the file in the IE browser to check that the XML document is well-formed. Unexpectedly, parsing fails with an error about an invalid character.

What is this? The format of my XML document looks fine. An invalid character? This is clearly a typical encoding problem. My first thought was to change the encoding that IE uses.

But under "View" > "Encoding", all the encoding options are grayed out and cannot be selected. This is because the XML document declares its encoding as "UTF-8", which tells the browser (the XML parsing engine) that it must parse the document as UTF-8, so no other encoding can be chosen for viewing it.

The error itself arises because we did not choose an encoding when saving the document in Notepad. By default, files in Windows are saved to disk in the operating system's encoding, which on a Chinese-language system is GB2312. When IE then parses the XML document using the UTF-8 encoding we declared, the bytes do not match, and the error above is the result. For example, suppose the word "中国" (China) in our XML document is stored as "10001" under GB2312. Under UTF-8, "10001" does not correspond to "中国": it either maps to nothing or to garbled text, so IE refuses to display the document.

So what should we do? There are two ways to solve this problem.
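The mismatch can be seen at the byte level. The Python sketch below is an illustration added here, not part of the original article; the helper name decodes_as is hypothetical. It encodes the same two characters in both encodings and shows that the GB2312 bytes are not valid UTF-8.

```python
text = "中国"

gb_bytes = text.encode("gb2312")
utf8_bytes = text.encode("utf-8")

print(gb_bytes.hex(" "))    # d6 d0 b9 fa
print(utf8_bytes.hex(" "))  # e4 b8 ad e5 9b bd


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if `data` is a valid byte sequence in `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as(gb_bytes, "gb2312"))  # True: bytes read back with the encoding that wrote them
print(decodes_as(gb_bytes, "utf-8"))   # False: the same bytes are malformed as UTF-8
```

The same text produces different bytes under each encoding, so a reader that assumes the wrong encoding either fails outright or produces garbled characters.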

The first method: when defining the XML document, declare its encoding as gb2312, that is, change the declaration to encoding="gb2312".

After saving, open the file in the IE browser again; the document now displays correctly.

Congratulations, the problem is solved. However, this method is not recommended. XML documents generally use UTF-8 so that they remain portable.

The second method:

Open the document in Notepad and click "Save As". At the bottom of the dialog there is an "Encoding" option; select "UTF-8", save, and try again.
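The same re-save can be scripted. The following Python sketch is an addition for illustration; the function name resave_as_utf8, the file name demo.xml, and the assumption that the file is currently stored as GB2312 are all hypothetical.

```python
import tempfile
from pathlib import Path
import xml.etree.ElementTree as ET


def resave_as_utf8(path: Path, source_encoding: str = "gb2312") -> None:
    """Rewrite the file in UTF-8, assuming it is currently stored in `source_encoding`."""
    text = path.read_text(encoding=source_encoding)
    path.write_text(text, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    xml_path = Path(tmp) / "demo.xml"
    # Simulate Notepad's default save: UTF-8 declared, GB2312 bytes on disk.
    document = '<?xml version="1.0" encoding="utf-8"?><config>中国</config>'
    xml_path.write_bytes(document.encode("gb2312"))

    resave_as_utf8(xml_path)

    # The bytes now agree with the declaration, so parsing succeeds.
    parsed_text = ET.parse(str(xml_path)).getroot().text

print(parsed_text)
```

The source encoding has to be known or guessed correctly; if the file was not actually GB2312, the conversion would fail or corrupt the text.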

In fact, when we write XML documents in development tools such as Eclipse or Microsoft Visual Studio, we do not run into this problem. These IDEs are "smart": when saving the XML document to disk, the IDE automatically uses the encoding declared in the document. As a result, many programmers who only ever work in one IDE do not really understand the underlying principles, yet they get by just fine. As I understand it, in earlier years many expert programmers in China wrote code in text editors such as EditPlus, and many experts on Linux/Unix wrote code in vi/Vim. Maybe that is the difference.
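The idea of saving with the declared encoding can be sketched in a few lines of Python. This is an illustration only, not how any particular IDE is implemented; the function names declared_encoding and encode_for_saving are hypothetical, and the regular expression is a simplification of the XML declaration grammar.

```python
import re

# Simplified match for the encoding pseudo-attribute of an XML declaration
# at the very start of the document.
_ENCODING_RE = re.compile(
    r'<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(xml_text: str, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, or `default` if none is given."""
    match = _ENCODING_RE.match(xml_text)
    return match.group(1) if match else default


def encode_for_saving(xml_text: str) -> bytes:
    """Encode the document with whatever encoding its own declaration names."""
    return xml_text.encode(declared_encoding(xml_text))


doc = '<?xml version="1.0" encoding="gb2312"?><config>中国</config>'
print(declared_encoding(doc))
print(encode_for_saving(doc).hex(" "))
```

Because the bytes written always agree with the declaration, a parser that honors the declaration never sees a mismatch, whichever encoding the author chose.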
