Dom4j Encoding Solution

Source: Internet
Author: User

I recently started learning dom4j. I found an introductory article online and got going quickly, but then ran into a problem: I could not save XML files as UTF-8. Reading a saved file back failed with the error "Invalid byte 2 of 2-byte UTF-8 sequence". The file generated by dom4j looked garbled in any editor that handles XML encoding correctly, yet in Notepad the Chinese characters displayed correctly with no garbling. This gave me a headache. XML files generated with the GBK or GB2312 encoding parsed normally, so I suspected that dom4j did not handle UTF-8 properly. I then read the dom4j source code, and it turned out that the problem was in my own program.
The dom4j samples, like the popular dom4j introductions circulating online, create a new XML document with code similar to this:

    // Needs java.io.FileWriter plus org.dom4j.Document and org.dom4j.Element
    public void createXml(String fileName) {
        Document doc = org.dom4j.DocumentHelper.createDocument();
        Element root = doc.addElement("book");
        root.addAttribute("name", "my books");
        Element childTmp;
        childTmp = root.addElement("price");
        childTmp.setText("21.22");
        Element writer = root.addElement("author");
        writer.setText("Li Si");
        writer.addAttribute("ID", "001");
        try {
            // FileWriter writes in the platform default encoding
            org.dom4j.io.XMLWriter xmlWriter = new org.dom4j.io.XMLWriter(new FileWriter(fileName));
            xmlWriter.write(doc);
            xmlWriter.close();
        } catch (Exception e) {
            System.out.println(e);
        }
    }

The code above uses a FileWriter object for file output, and this is why the file is not encoded correctly. A Writer subclass in Java does not tell dom4j which encoding it writes in, so dom4j cannot make the output match the encoding declared in the document. The file is therefore saved in the system default encoding. On Chinese-language Windows, Java's default encoding is GBK, so although the XML is declared as UTF-8, the file is actually written as GBK. That is why XML files generated with GBK or GB2312 parse correctly, while files declared as UTF-8 cannot be read by the XML parser.
Now that we have found the cause, let's find a solution. First, let's look at how dom4j handles encoding:
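The mismatch described above can be reproduced without dom4j. The following is a minimal sketch using only the JDK; the class name `EncodingMismatchDemo`, the helper `isValidUtf8`, and the sample XML string are illustrative assumptions, not part of the original program. It encodes the same text once as GBK and once as UTF-8, then checks whether each byte sequence is well-formed UTF-8, which is roughly what a parser does after reading a declaration that promises UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {

    /** Returns true only if the bytes form a well-formed UTF-8 sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample: the declaration promises UTF-8, the text is Chinese.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<author>\u674e\u56db</author>";

        // What a GBK default encoding would put on disk (assumes the JDK ships GBK).
        byte[] asGbk = xml.getBytes(Charset.forName("GBK"));
        // What the declaration actually promises.
        byte[] asUtf8 = xml.getBytes(StandardCharsets.UTF_8);

        System.out.println("GBK bytes valid as UTF-8:   " + isValidUtf8(asGbk));
        System.out.println("UTF-8 bytes valid as UTF-8: " + isValidUtf8(asUtf8));
    }
}
```

The GBK bytes are rejected by the strict UTF-8 decoder, while pure ASCII content would pass in both encodings, which is why the problem only shows up once the document contains Chinese characters.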
    public XMLWriter(OutputStream out) throws UnsupportedEncodingException {
        // System.out.println("In OutputStream");
        this.format = DEFAULT_FORMAT;
        this.writer = createWriter(out, format.getEncoding());
        this.autoFlush = true;
        namespaceStack.push(Namespace.NO_NAMESPACE);
    }

    public XMLWriter(OutputStream out, OutputFormat format) throws UnsupportedEncodingException {
        // System.out.println("In OutputStream,OutputFormat");
        this.format = format;
        this.writer = createWriter(out, format.getEncoding());
        this.autoFlush = true;
        namespaceStack.push(Namespace.NO_NAMESPACE);
    }

    /**
     * Get an OutputStreamWriter, use preferred encoding.
     */
    protected Writer createWriter(OutputStream outStream, String encoding) throws UnsupportedEncodingException {
        return new BufferedWriter(new OutputStreamWriter(outStream, encoding));
    }

From the code above we can see that dom4j does nothing complicated with the encoding; it relies entirely on the Java library. So when we use dom4j to generate an XML file, we should not hand a Writer object directly to the XMLWriter constructor. Instead, we should construct the XMLWriter from an OutputStream subclass. In the earlier code, that means using a FileOutputStream object rather than a FileWriter object. The code should be changed to the following:

    // Needs java.io.FileOutputStream plus org.dom4j.Document and org.dom4j.Element
    public void createXml(String fileName) {
        Document doc = org.dom4j.DocumentHelper.createDocument();
        Element root = doc.addElement("book");
        root.addAttribute("name", "my books");
        Element childTmp;
        childTmp = root.addElement("price");
        childTmp.setText("21.22");
        Element writer = root.addElement("author");
        writer.setText("Li Si");
        writer.addAttribute("ID", "001");
        try {
            // Note the change here: FileOutputStream instead of FileWriter
            org.dom4j.io.XMLWriter xmlWriter = new org.dom4j.io.XMLWriter(new FileOutputStream(fileName));
            xmlWriter.write(doc);
            xmlWriter.close();
        } catch (Exception e) {
            System.out.println(e);
        }
    }
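To check the idea behind the fix independently of dom4j, here is a small sketch that uses only the JDK. It wraps a FileOutputStream in an OutputStreamWriter with an explicit encoding, the same pattern as the `createWriter` method quoted earlier, then reads the file back with the JDK's built-in XML parser. The class name `Utf8RoundTripDemo`, the method `writeAndReadBack`, and the hand-written XML are assumptions for illustration; the sample text contains no characters that would need XML escaping.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class Utf8RoundTripDemo {

    /** Wraps the stream with an explicit encoding, as dom4j's createWriter does. */
    static Writer createWriter(OutputStream out, String encoding) throws Exception {
        return new BufferedWriter(new OutputStreamWriter(out, encoding));
    }

    /** Writes a small XML file in the given encoding and returns the author text read back. */
    static String writeAndReadBack(File file, String encoding, String author) throws Exception {
        try (Writer w = createWriter(new FileOutputStream(file), encoding)) {
            // The declared encoding and the bytes on disk now come from the same value.
            w.write("<?xml version=\"1.0\" encoding=\"" + encoding + "\"?>\n");
            w.write("<book name=\"my books\"><author>" + author + "</author></book>");
        }
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(file);
        return doc.getDocumentElement()
                .getElementsByTagName("author").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("book", ".xml");
        tmp.deleteOnExit();
        // Prints the author name read back from the file.
        System.out.println(writeAndReadBack(tmp, "UTF-8", "\u674e\u56db"));
    }
}
```

Because the writer's encoding and the declaration are driven by the same string, the parser decodes exactly the bytes that were written, and the Chinese text survives the round trip.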

That settles the dom4j encoding problem. I hope this article is useful to others.
Trackback: http://tb.blog.csdn.net/TrackBack.aspx?Postid=160625

