An encoding problem in dom4j: one of the "Chinese problems not discussed"

Source: Internet
Author: User
This article describes a Chinese-character encoding problem that occurs when dom4j saves a document to a file. Like the previous article, it has nothing to do with the Spring project, so Spring fans need not feel provoked. If you find shortcomings in this article, criticism and advice are welcome.
dom4j is an excellent open-source Java project for XML parsing. It supports DOM, SAX and JAXP, and provides strong support for the XPath query language. For that reason, many open-source projects of the EasyJF team, such as EasyJWeb and EasyDBO, use dom4j for operations on XML files.
 
1. Load a DOM into memory from an XML file:
FileInputStream in = new FileInputStream(new File(filename));
SAXReader reader = new SAXReader();
Document doc = reader.read(in);
2. Write the data in the DOM to an XML file:
With dom4j, writing the data in a DOM to a file is very easy. The API is as follows:
public void write(Writer writer) throws IOException;
Therefore, you can simply use the following code to write a document to a file such as C:/test.xml:
java.io.Writer wr = new java.io.FileWriter(filename);
doc.write(wr);
wr.close(); // close() must be called so that the buffered data is actually written to the file
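The remark about close() can be checked with the JDK alone. The following is a minimal sketch, not dom4j code: the class name CloseFlushDemo and the temporary file are hypothetical, and a short literal string stands in for the output of doc.write(wr). It compares the file length before and after close().

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

// Hypothetical demo class; the name and the temp file are illustrative only.
class CloseFlushDemo {

    /** Returns {file length before close(), file length after close()}. */
    static long[] lengthsAroundClose() throws IOException {
        File file = File.createTempFile("close-demo", ".xml");
        file.deleteOnExit();

        Writer wr = new FileWriter(file);
        wr.write("<root/>");          // small write: stays in the writer's internal buffer
        long before = file.length();  // measured while the data is still buffered
        wr.close();                   // flushes the buffer, then releases the file
        long after = file.length();
        return new long[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        long[] lengths = lengthsAroundClose();
        System.out.println("before close(): " + lengths[0] + " bytes");
        System.out.println("after close():  " + lengths[1] + " bytes");
    }
}
```

For a write this small, the bytes should only appear on disk once close() (or flush()) has run, which is why forgetting close() can leave an empty or truncated XML file.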
  
Writing through a FileWriter like this is also the simple method recommended by dom4j. However, when the DOM contains Chinese characters, an XML document written this way cannot be read back correctly, and the following error is reported:
org.dom4j.DocumentException: Invalid byte 1 of 1-byte UTF-8 sequence (0xb2) Nested exception: Invalid byte 1 of 1-byte UTF-8 sequence (0xb2)
    at org.dom4j.io.SAXReader.read(SAXReader.java:484)
    at org.dom4j.io.SAXReader.read(SAXReader.java:343)
Inspecting the generated XML file shows the mismatch: its content declares the UTF-8 encoding, but the file itself is stored in ANSI format.
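The "Invalid byte 1 of 1-byte UTF-8 sequence (0xb2)" message can be reproduced without dom4j. The sketch below is only an illustration under stated assumptions: the class name Utf8MismatchDemo is hypothetical, and the four sample bytes are assumed to be the characters "测试" as a GBK-encoded (ANSI on Chinese Windows) file would store them. A strict UTF-8 decoder rejects them, because 0xB2 cannot start a UTF-8 sequence.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of dom4j.
class Utf8MismatchDemo {

    // Assumed sample data: "测试" as stored by a GBK-encoded file.
    static final byte[] GBK_BYTES = { (byte) 0xB2, (byte) 0xE2, (byte) 0xCA, (byte) 0xD4 };

    /** Returns true if the bytes are valid UTF-8, false if a strict decoder rejects them. */
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false; // the same kind of failure an XML parser reports
        }
    }

    public static void main(String[] args) {
        // "\u6d4b\u8bd5" is "测试", written with escapes so the source encoding does not matter
        byte[] utf8Bytes = "\u6d4b\u8bd5".getBytes(StandardCharsets.UTF_8);
        System.out.println("GBK bytes valid as UTF-8?   " + isValidUtf8(GBK_BYTES));
        System.out.println("UTF-8 bytes valid as UTF-8? " + isValidUtf8(utf8Bytes));
    }
}
```

The first check fails and the second succeeds, which mirrors what happens when a parser trusts a UTF-8 declaration on a file whose bytes were written in another encoding.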
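Cause analysis:
A FileWriter created this way always encodes its output with the platform default encoding, which is ANSI in this case. The XML produced by the write method in dom4j, however, declares its content as UTF-8. The declared encoding and the bytes actually stored therefore disagree, and XML files containing Chinese characters cannot be read back normally.
Solution: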
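A useful first step is to confirm which encoding a plain FileWriter uses on the machine in question. The following JDK-only sketch does that; the class name DefaultCharsetDemo and the temporary file are hypothetical. Note that the result depends on the environment: on JDK 18 and later the default charset is UTF-8 unless configured otherwise, so the problem described here mainly shows up on older JDKs or non-default settings.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.charset.Charset;

// Hypothetical demo class; not part of dom4j.
class DefaultCharsetDemo {

    /** Returns the encoding name that a plain FileWriter reports for itself. */
    static String fileWriterEncoding() throws IOException {
        File file = File.createTempFile("charset-demo", ".tmp");
        file.deleteOnExit();
        try (FileWriter wr = new FileWriter(file)) {
            // getEncoding() may return a historical name such as "UTF8" or "Cp1252"
            return wr.getEncoding();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Default charset:     " + Charset.defaultCharset());
        System.out.println("FileWriter encoding: " + fileWriterEncoding());
    }
}
```

If the reported encoding is not UTF-8, any document that declares UTF-8 and is written through such a FileWriter will show the mismatch described above.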
A plain FileWriter will not do. Instead, use a writer whose output encoding can be specified explicitly. In the JDK's java.io package, OutputStreamWriter lets you specify the output encoding.
The correct code is as follows:
java.io.OutputStream out = new java.io.FileOutputStream(filename);
java.io.Writer wr = new java.io.OutputStreamWriter(out, "UTF-8");
doc.write(wr);
wr.close();
out.close();
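The same fix can also be sketched with try-with-resources and StandardCharsets.UTF_8, so that the writer is closed even if writing fails. This is a JDK-only illustration under stated assumptions: the class Utf8WriterDemo and the helper writeUtf8 are hypothetical names, and a literal XML string stands in for the dom4j document, where the call would be doc.write(wr).

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;

// Hypothetical demo class; not part of dom4j.
class Utf8WriterDemo {

    /** Writes the text to the file as UTF-8, whatever the platform default charset is. */
    static void writeUtf8(File file, String text) throws IOException {
        // try-with-resources closes the writer, and the stream under it, even on failure
        try (Writer wr = new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8)) {
            wr.write(text); // with dom4j this call would be doc.write(wr)
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("utf8-demo", ".xml");
        file.deleteOnExit();
        // "\u6d4b\u8bd5" is "测试", written with escapes so the source encoding does not matter
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root>\u6d4b\u8bd5</root>";
        writeUtf8(file, xml);

        // The bytes on disk should match the UTF-8 encoding of the text exactly
        byte[] onDisk = Files.readAllBytes(file.toPath());
        System.out.println(Arrays.equals(onDisk, xml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Passing a Charset constant instead of the string "UTF-8" also avoids the checked UnsupportedEncodingException. On Java 11 and later, new FileWriter(file, StandardCharsets.UTF_8) is another option.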
The separate java.io.OutputStream variable can also be dropped by nesting the constructors:
java.io.Writer wr = new java.io.OutputStreamWriter(new java.io.FileOutputStream(filename), "UTF-8");
doc.write(wr);
wr.close();
Summary:
Because most of the excellent foundational open-source projects are developed outside China, they are rarely tested on Chinese-language platforms, and their test data rarely contains Chinese. As a result, even if we follow the official instructions and user guides of these projects, unexpected errors can still appear. This is also why I took part in founding the open-source team EasyJF, which advocates domestic open source and develops foundational open-source frameworks such as EasyJWeb and EasyDBO.
Of course, the Chinese problem raised here is one of those that are still "not discussed" and that only run correctly after some less obvious measures are taken. It is therefore included in the "Chinese problems not discussed" series. (Note: the author of this article is Daxia of the EasyJF open-source team. Please keep this author statement when reposting!)

 
