Angle brackets are special characters in XML. If the text value of an element contains stray angle brackets, parsing fails with an error. For example:
<?xml version="1.0" encoding="UTF-8"?><books><book><ID>1</ID><name>< three countries <>< play >< play >Yi</name><price>4<>5</price><author>Luo Guan Zhong</author></book></books>
XML like this has to be preprocessed before it can be parsed.
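To make the failure concrete, here is a minimal sketch that checks well-formedness with the JDK's built-in DOM parser. The class name ParseCheck and the helper isWellFormed are hypothetical names chosen for this illustration; they are not part of the original code.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class ParseCheck {
    /** Hypothetical helper: returns true if the string parses as well-formed XML. */
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input, e.g. because of a stray angle bracket.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<price>4<>5</price>"));       // prints false
        System.out.println(isWellFormed("<price>4&lt;&gt;5</price>")); // prints true
    }
}
```

The second call succeeds because the brackets inside the value are written as the entities &lt; and &gt;, which is the form the preprocessing has to produce.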
My rough idea is this:
First, use a regular expression to find all the tags and store their names in an ArrayList. Then, for each name in the list, replace the angle brackets of its opening and closing tags with temporary marker strings. Next, escape all the remaining angle brackets. Finally, convert the marker strings back into angle brackets.
The code is as follows:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FilterXmlUtil {

    /**
     * Takes an XML string and returns it with the redundant angle brackets escaped.
     *
     * @param xmlStr the raw XML string
     * @return the XML string with stray < and > replaced by entities
     */
    public static String filterIllegalityChar(String xmlStr) {
        // A list for storing the tag names
        List<String> tags = new ArrayList<String>();

        // Remove the two angle brackets of the XML declaration
        xmlStr = xmlStr.replace("<?xml", "?xml").replace("\"UTF-8\"?>", "\"UTF-8\"?");

        // Find all the tags with a regular expression
        // (only plain tags such as <name> match; tags with attributes do not)
        Pattern tag = Pattern.compile("<([a-zA-Z0-9]+)>");
        Matcher mc = tag.matcher(xmlStr);
        while (mc.find()) {
            // Save each matched tag name in the list
            tags.add(mc.group(1));
        }

        /*
         * Temporary replacement symbols:
         *   <   ->  ^^
         *   >   ->  ~~
         *   </  ->  ##/
         */
        for (int i = 0; i < tags.size(); i++) {
            xmlStr = xmlStr.replace("<" + tags.get(i) + ">", "^^" + tags.get(i) + "~~")
                           .replace("</" + tags.get(i) + ">", "##/" + tags.get(i) + "~~");
        }

        // Escape the remaining angle brackets
        xmlStr = xmlStr.replace("<", "&lt;").replace(">", "&gt;");

        // Convert the markers back
        xmlStr = xmlStr.replace("^^", "<").replace("~~", ">").replace("##/", "</")
                       .replace("?xml", "<?xml").replace("\"UTF-8\"?", "\"UTF-8\"?>");
        return xmlStr;
    }
}
Output:
<?xml version="1.0" encoding="UTF-8"?><books><book><ID>1</ID><name>&lt; three countries &lt;&gt;&lt; play &gt;&lt; play &gt;Yi</name><price>4&lt;&gt;5</price><author>Luo Guan Zhong</author></book></books>
The resulting XML string can now be parsed.
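As a quick check that the escaped form round-trips, the following sketch parses a small escaped fragment with the JDK's DOM parser and reads the value back. The class name ReadEscaped and the helper textOf are hypothetical names used only for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ReadEscaped {
    /** Hypothetical helper: returns the text of the first element with the given tag name. */
    public static String textOf(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // The parser turns &lt; and &gt; back into literal angle brackets.
            return doc.getElementsByTagName(tagName).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String escaped = "<book><price>4&lt;&gt;5</price></book>";
        System.out.println(textOf(escaped, "price")); // prints 4<>5
    }
}
```

The parser hands back the original value, so the escaping only changes how the text is written in the file and not what the application reads.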
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.