Post address: http://java.chinaitlab.com/advance/755393.html
Problem: when JDOM writes an XML file, the output is fine with GBK encoding but garbled with UTF-8.
The full solution, after first clearing up two misconceptions:
1) JDOM generates UTF-8 whether or not a Format is set; the encoding only needs to be set when you want a different character encoding (see the comments in the test code).
2) The root cause of garbled UTF-8 output is not in the JDOM API but in the JDK.
Explanation:
JDOM's output class XMLOutputter has two output methods. Besides the Document parameter, one accepts a Writer and the other accepts an OutputStream.
This gives the impression that the two can be used interchangeably.
First, we test with output(doc, System.out). This produces garbled characters.
Then we change it to output(doc, new PrintWriter(System.out)). The output is no longer garbled.
That is, for console output you must wrap the stream in a Writer.
Next, we test with output(doc, new FileWriter(path)), and the result is garbled.
Then we change it to output(doc, new FileOutputStream(path)). The output is not garbled.
In other words, for file output you must use an OutputStream.
Maddening, and a little funny. After stepping through the JDOM source code we found nothing wrong there; the problem turned out to be in the JDK.
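The garbling in the tests above can be reproduced without JDOM. The following is a minimal sketch, not the original test: it encodes a small XML string with one charset and decodes the bytes with another, which is what happens when a file declares UTF-8 but was written through a Writer that used a different charset. The class name `EncodingMismatchDemo` and the helper `roundTrip` are hypothetical, and `main` assumes the runtime ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    /** Encodes text with one charset and decodes the resulting bytes with another. */
    static String roundTrip(String text, Charset written, Charset readAs) {
        byte[] bytes = text.getBytes(written);   // the bytes a Writer using `written` would emit
        return new String(bytes, readAs);        // what a reader trusting `readAs` would see
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><ds>\u6570\u636e\u6e90</ds>";
        // Assumption: the runtime provides the GBK charset (standard JDK builds do).
        Charset gbk = Charset.forName("GBK");

        // Bytes and declaration agree: the text survives. Prints true.
        System.out.println(roundTrip(xml, StandardCharsets.UTF_8, StandardCharsets.UTF_8).equals(xml));
        // Bytes are GBK but are read as UTF-8: the Chinese text is garbled. Prints false.
        System.out.println(roundTrip(xml, gbk, StandardCharsets.UTF_8).equals(xml));
    }
}
```

The ASCII part of the document survives either way, which is why the problem only shows up once the content contains Chinese text.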
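How the corresponding JDK classes behave:
1) The PrintWriter class has a constructor that takes an OutputStream, so System.out can be wrapped in a PrintWriter.
2) The FileWriter class has no constructor that takes an OutputStream, so a FileOutputStream cannot be wrapped in a FileWriter.
3) If PrintWriter is created with the constructor that takes a Writer (and that Writer is a FileWriter), the output is garbled.
4) If a FileOutputStream is used to wrap console output, the output is garbled.
The common thread is who chooses the byte encoding. FileWriter and PrintWriter(OutputStream) encode with the platform default charset (GBK on a Chinese Windows system), so JDOM has no control over the bytes they produce. When XMLOutputter is given an OutputStream, it creates its own writer with the encoding set in Format, which is UTF-8 by default. A file therefore needs the OutputStream so that its bytes match the UTF-8 declaration, while the console needs the default-charset Writer so that its bytes match what the console expects.
Therefore, we must fully understand the InputStream, OutputStream, Reader, and Writer families in the JDK. Otherwise, unexpected problems may occur.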
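If a Writer is required anyway, one way to avoid relying on the platform default is to build it from an OutputStream with an explicit charset. The sketch below uses only the JDK; the class name `ExplicitCharsetWriter` and its helper methods are hypothetical and are not part of the original test code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ExplicitCharsetWriter {

    /** Wraps a byte stream in a Writer that encodes with the given charset, not the platform default. */
    static Writer writerFor(OutputStream out, Charset charset) {
        return new OutputStreamWriter(out, charset);
    }

    /** Writes text through writerFor(...) and returns the bytes that reached the stream. */
    static byte[] encodeThroughWriter(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer w = writerFor(sink, charset)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "<root>\u6570\u636e\u6e90</root>";
        byte[] viaWriter = encodeThroughWriter(text, StandardCharsets.UTF_8);
        // The Writer produced exactly the UTF-8 encoding of the text. Prints true.
        System.out.println(Arrays.equals(viaWriter, text.getBytes(StandardCharsets.UTF_8)));
    }
}
```

For a disk file, the same idea would be new OutputStreamWriter(new FileOutputStream(path), "UTF-8"), with the charset kept identical to the encoding set in the JDOM Format so that the XML declaration and the bytes agree.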
Tested JDOM versions: 1.0 and 1.1
Test code:
import java.io.File;
import java.io.FileOutputStream;
import java.io.FileWriter;
import java.io.PrintWriter;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

public class BuildXML {
    public static void main(String[] args) throws Exception {
        File xmlFile = new File("C://edittemp//XML//ABC.XML");
        // Chinese text: GBK is no problem, but UTF-8 is problematic.
        // Cause:
        // 1) For disk files, the output stream FileOutputStream must be used.
        //    FileWriter out = new FileWriter(xmlFile); may produce garbled characters.
        // 2) For console output, PrintWriter must be used.
        //    Passing System.out directly may also produce garbled characters.
        //    PrintWriter out = new PrintWriter(System.out);
        FileOutputStream out = new FileOutputStream(xmlFile);

        // Build the document: each child element carries its own text.
        Element eRoot = new Element("root");
        eRoot.addContent(new Element("Code").addContent("Code"));
        eRoot.addContent(new Element("ds").addContent("Data Source"));
        eRoot.addContent(new Element("SQL").addContent("Search SQL"));
        eRoot.addContent(new Element("order").addContent("sort"));
        Document doc = new Document(eRoot);

        XMLOutputter outputter = new XMLOutputter();
        // Without a Format the output is merely not indented; the XML is still UTF-8,
        // so setting a Format is not required.
        Format f = Format.getPrettyFormat();
        // f.setEncoding("UTF-8"); // default = UTF-8
        outputter.setFormat(f);
        outputter.output(doc, out);
        out.close();
    }
}
An additional method that outputs a JDOM Document object as a string in a specified encoding:
// Requires java.io.ByteArrayOutputStream, java.io.IOException,
// org.jdom.Document, org.jdom.output.Format, and org.jdom.output.XMLOutputter.
/**
 * Converts a JDOM Document object to a string using the specified encoding.
 * @param xmlDoc the JDOM Document object to convert
 * @param encoding the encoding used for the output string
 * @return the XML document as a string
 * @throws IOException if writing the document fails
 */
public static String toXML(Document xmlDoc, String encoding) throws IOException
{
    ByteArrayOutputStream byteRep = new ByteArrayOutputStream();
    Format format = Format.getPrettyFormat();
    format.setEncoding(encoding);
    XMLOutputter docWriter = new XMLOutputter(format);
    // Pass the stream itself so that JDOM encodes with the encoding set in Format,
    // then decode the bytes with that same encoding.
    docWriter.output(xmlDoc, byteRep);
    return byteRep.toString(encoding);
}