I. Definitions of the three parsing methods.
1. DOM parsing
DOM stands for Document Object Model. When an XML file is parsed with DOM, the entire file is loaded into memory as a tree of nodes.
2. SAX parsing
SAX, the Simple API for XML, is an event-driven model. The file is parsed step by step without loading all of it into memory.
3. Pull parsing
Pull parsing is the method officially recommended by Android; it is similar to SAX parsing.
II. Advantages and disadvantages of the three parsing methods.
DOM parsing
The advantage is that once parsing is complete you can go back to any part of the document. Because the whole file is in memory, any node can be read at any time, in any order.
The disadvantage is that the entire file must be loaded into memory during parsing, which takes space and consumes a lot of resources.
SAX parsing
The advantage is that parsing is fast and the whole file does not have to be loaded into memory, so it uses fewer resources and is more efficient.
The disadvantage is that the file can only be parsed in order; there is no way to go back to something already parsed.
Pull parsing
The difference from SAX parsing seems to be .... SAX parsing cannot be stopped partway through (short of throwing an exception from the handler); it has to run to the end. Pull parsing runs in a while loop, so you can break out of it whenever you like.
The SAX parser automatically pushes events to the registered event handler, so you cannot tell the event processing to end on its own. The pull parser instead lets your application code actively retrieve events from the parser. Because the application fetches the events itself, it can stop fetching them as soon as the required conditions are met and finish parsing there. This is their main difference. [1]
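The early exit described above can be sketched on a desktop JVM. Android's XmlPullParser is not part of the standard JDK, so this sketch swaps in the JDK's StAX API (javax.xml.stream.XMLStreamReader). It is a different pull-style API, but it follows the same idea: the application calls next() itself and can simply stop calling it. The class name PullEarlyExit and the method firstBookName are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; uses StAX as a stand-in for Android's XmlPullParser.
final class PullEarlyExit {
    static final String XML =
        "<books>"
        + "<book id=\"1\"><name>Developer</name><price>20.0</price></book>"
        + "<book id=\"2\"><name>Engineer</name><price>30.0</price></book>"
        + "</books>";

    /** Returns the text of the first name element, then stops pulling events. */
    static String firstBookName(String xml) {
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next(); // the application asks for the next event
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "name".equals(reader.getLocalName())) {
                        // Condition met: return here, the second book is never read.
                        return reader.getElementText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstBookName(XML)); // prints "Developer"
    }
}
```

With a push parser the loop belongs to the parser, so the handler has no equivalent of this plain return.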
III. Code for the three parsing methods.
XML file
<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book id="1">
        <name>Developer</name>
        <price>20.0</price>
    </book>
    <book id="2">
        <name>Engineer</name>
        <price>30.0</price>
    </book>
</books>
Java code
package cn.my.utils;

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import org.xmlpull.v1.XmlPullParser;
import org.xmlpull.v1.XmlPullParserException;
import org.xmlpull.v1.XmlPullParserFactory;

import cn.my.model.Books;

public class Parser {

    /**
     * DOM parsing
     *
     * @param is the XML input stream
     * @return List<Books>
     */
    public List<Books> dom_parser(InputStream is) {
        List<Books> data = new ArrayList<Books>();
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document doc = db.parse(is); // the stream to be parsed by db
            Element root = doc.getDocumentElement(); // get the root node
            NodeList nodeList = root.getElementsByTagName("book"); // list of book nodes under the root
            for (int i = 0; i < nodeList.getLength(); i++) {
                Books book = new Books();
                Element e = (Element) nodeList.item(i); // the node of each book
                book.bookid = Integer.valueOf(e.getAttribute("id")); // the id attribute of book
                // Find the node called name; getFirstChild() returns its text node and
                // getNodeValue() returns the value of that text node, i.e. the name.
                book.bookname = e.getElementsByTagName("name").item(0).getFirstChild().getNodeValue();
                book.bookprice = Float.valueOf(
                        e.getElementsByTagName("price").item(0).getFirstChild().getNodeValue());
                data.add(book);
            }
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return data;
    }

    /**
     * SAX parsing
     *
     * @param is the XML input stream
     * @return List<Books>
     */
    public List<Books> sax_parser(InputStream is) {
        SAXParserFactory sf = SAXParserFactory.newInstance();
        MyHandler myHandler = new MyHandler();
        try {
            SAXParser sp = sf.newSAXParser();
            sp.parse(is, myHandler);
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return myHandler.getData();
    }

    /**
     * Pull parsing
     *
     * @param is the XML input stream
     * @return List<Books>
     */
    public List<Books> pull_parser(InputStream is) {
        List<Books> data = null;
        Books book = null;
        try {
            XmlPullParserFactory xppf = XmlPullParserFactory.newInstance();
            XmlPullParser xpp = xppf.newPullParser();
            xpp.setInput(is, "UTF-8");
            int event = xpp.getEventType();
            while (event != XmlPullParser.END_DOCUMENT) {
                switch (event) {
                case XmlPullParser.START_DOCUMENT:
                    data = new ArrayList<Books>();
                    break;
                case XmlPullParser.START_TAG:
                    String str = xpp.getName();
                    if ("book".equals(str)) {
                        book = new Books();
                        book.bookid = Integer.valueOf(xpp.getAttributeValue(0));
                    }
                    if ("name".equals(str)) {
                        book.bookname = xpp.nextText();
                    }
                    if ("price".equals(str)) {
                        book.bookprice = Float.valueOf(xpp.nextText());
                    }
                    break;
                case XmlPullParser.END_TAG:
                    if ("book".equals(xpp.getName())) {
                        data.add(book);
                        book = null;
                    }
                    break;
                }
                event = xpp.next(); // advance the parser to the next event
            }
        } catch (XmlPullParserException e) {
            e.printStackTrace();
        } catch (NumberFormatException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return data;
    }

    public class MyHandler extends DefaultHandler {
        private List<Books> data;
        private Books book;
        private String tagName; // name of the element whose text is expected next

        @Override
        public void startDocument() throws SAXException {
            super.startDocument();
            data = new ArrayList<Books>();
        }

        // Note: on a desktop JVM the factory is not namespace-aware by default and
        // localName arrives empty; use qName there, or call setNamespaceAware(true).
        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attributes) throws SAXException {
            super.startElement(uri, localName, qName, attributes);
            if ("book".equals(localName)) {
                book = new Books();
                book.bookid = Integer.valueOf(attributes.getValue(0));
            }
            tagName = localName;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            super.characters(ch, start, length);
            if ("name".equals(tagName)) {
                book.bookname = new String(ch, start, length);
            }
            if ("price".equals(tagName)) {
                book.bookprice = Float.valueOf(new String(ch, start, length));
            }
            tagName = null; // so that the following whitespace text is ignored
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            if ("book".equals(localName)) {
                data.add(book);
                book = null;
            }
        }

        @Override
        public void endDocument() throws SAXException {
            super.endDocument();
        }

        public List<Books> getData() {
            return data;
        }
    }
}
Analysis:
DOM parsing: the InputStream is parsed into a Document, and doc.getDocumentElement() returns the root node, that is, <books>.
Then root.getElementsByTagName("book") returns all nodes named book as a NodeList; the length of the NodeList is the number of books.
This is therefore the right moment to create a Books object for each entry. The next step is to get the child element nodes, name and price, of each book node.
The values obtained are stored in the Books object, the object is added to data inside the loop, and data is returned at the end.
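The DOM steps above can be tried outside Android with a small standalone sketch. It parses the XML from a string rather than a file stream and, to stay self-contained, collects each book as an "id:name:price" string instead of using the post's Books class. The names DomSketch, childText and parse are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical demo class following the DOM walk-through above.
final class DomSketch {
    static final String XML = "<books>\n"
        + "  <book id=\"1\">\n    <name>Developer</name>\n    <price>20.0</price>\n  </book>\n"
        + "  <book id=\"2\">\n    <name>Engineer</name>\n    <price>30.0</price>\n  </book>\n"
        + "</books>";

    /** Text of the first child element called tag: element -> text node -> value. */
    static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getFirstChild().getNodeValue();
    }

    /** Parses the document and returns one "id:name:price" string per book. */
    static List<String> parse(String xml) {
        List<String> rows = new ArrayList<>();
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();         // <books>
            NodeList books = root.getElementsByTagName("book");
            for (int i = 0; i < books.getLength(); i++) {
                Element e = (Element) books.item(i);
                rows.add(e.getAttribute("id") + ":"
                    + childText(e, "name") + ":" + childText(e, "price"));
            }
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(parse(XML)); // prints [1:Developer:20.0, 2:Engineer:30.0]
    }
}
```

Because the whole tree stays in memory, the same NodeList could be walked again in any order after the loop, which is the "go back" advantage listed for DOM.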
SAX parsing: first we need to extend the DefaultHandler class, because the entire parsing is done in that class.
The methods to override are
startDocument()
startElement(String uri, String localName, String qName, Attributes attributes)
characters(char[] ch, int start, int length)
endElement(String uri, String localName, String qName)
Only these four are needed.
startDocument() is called when parsing of the document starts. This is the place to initialize the List<Books>, because books are added to it during the rest of the parsing.
startElement is called for each element node. The localName parameter is the node name. We need a String field tagName, declared at the top of the class, to hold the value of localName.
When localName equals "book", a book element has started: create a new Books object and read the id through attributes. In every case, localName is saved in tagName.
characters receives the content of a text node. The saved tagName is compared with "name" and "price"; on a match, the text is stored in bookname or bookprice. tagName must then be set to null. Otherwise the whitespace text node that follows the element would arrive while tagName is still set, the comparison would succeed again, and the whitespace would be assigned to bookname or bookprice, giving a wrong result.
endElement: if localName is book, one book has been fully parsed, so add it to data and then set the book object to null.
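The need to reset tagName can be checked with a small experiment. The sketch below is an assumption-laden stand-in for the post's handler: it only tracks the name element, it takes a hypothetical flag resetTagName to switch the reset on and off, and it reads qName rather than localName because the desktop JDK's SAXParserFactory is not namespace-aware by default. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: what happens to the parsed name with and without the reset.
final class SaxTagResetSketch {
    static final String XML = "<books>\n"
        + "  <book id=\"1\">\n    <name>Developer</name>\n    <price>20.0</price>\n  </book>\n"
        + "</books>";

    static final class NameHandler extends DefaultHandler {
        private final boolean resetTagName;
        private final List<String> names = new ArrayList<>();
        private String tagName;     // element whose text is expected next
        private String currentName; // name of the book being parsed

        NameHandler(boolean resetTagName) {
            this.resetTagName = resetTagName;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attributes) {
            tagName = qName;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if ("name".equals(tagName)) {
                currentName = new String(ch, start, length);
            }
            if (resetTagName) {
                tagName = null; // ignore the whitespace that follows </name>
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("book".equals(qName)) {
                names.add(currentName);
                currentName = null;
            }
        }
    }

    static List<String> parseNames(String xml, boolean resetTagName) {
        NameHandler handler = new NameHandler(resetTagName);
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
        return handler.names;
    }

    public static void main(String[] args) {
        System.out.println(parseNames(XML, true));
        // Without the reset, the line break after </name> overwrites the name.
        System.out.println(parseNames(XML, false).get(0).trim().isEmpty());
    }
}
```

One caveat: SAX is allowed to deliver a single run of text in several characters() calls, so for long values a sturdier variant would append to a StringBuilder and read it in endElement; that is a different design from the one in the post.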
Pull parsing
The advantage over SAX parsing is that you do not need to write a handler; otherwise the two are quite similar.
First obtain the xpp object and give it the stream source:
xpp.setInput(is, "UTF-8");
An int variable event is then needed to hold the current parser state. It is initialized from xpp.getEventType(), and parsing proceeds in a while loop.
As long as event != END_DOCUMENT, parsing continues.
Inside the loop, a switch decides what to do:
If event == START_DOCUMENT, create the List<Books> object that the books will be added to.
If event == START_TAG, check the tag name with nested ifs.
If getName() equals book, create a new Books object and read its id attribute. If it equals name, assign the value of nextText() to book.bookname. price is handled the same way.
If event == END_TAG, add an if: when getName() is book, one book has been fully parsed, so add the book to data and set it to null.
Finally, remember to advance xpp, otherwise parsing cannot continue:
event = xpp.next();
Appendix:
Node problems
Five nodes are encountered when parsing one book: the book element is the first, the blank line (whitespace) after it is the second, name is the third, the second blank line is the fourth, and price is the fifth. The blank lines are text nodes, not element nodes.
The content enclosed between <name> and </name> is also a text node. This is why getFirstChild() is called after obtaining the name node: the first child is the text node, and getNodeValue() on it returns the correct value.
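The whitespace text nodes can be made visible with a short DOM sketch. It lists the direct children of a single book element by node name, where text nodes report the name "#text", and then reads the name value through its first child. The class SketchNodes and its methods are hypothetical names; the XML is a one-book cut of the sample file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo: shows that line breaks between elements are text nodes.
final class SketchNodes {
    static final String XML =
        "<book id=\"1\">\n  <name>Developer</name>\n  <price>20.0</price>\n</book>";

    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    /** Node names of the direct children of the root element, in document order. */
    static List<String> childNodeNames(String xml) {
        NodeList children = parseRoot(xml).getChildNodes();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            names.add(children.item(i).getNodeName());
        }
        return names;
    }

    /** Value of the name element, read through its first child (a text node). */
    static String nameValue(String xml) {
        Node name = parseRoot(xml).getElementsByTagName("name").item(0);
        return name.getFirstChild().getNodeValue();
    }

    public static void main(String[] args) {
        System.out.println(childNodeNames(XML)); // prints [#text, name, #text, price, #text]
        System.out.println(nameValue(XML));      // prints Developer
    }
}
```

Note that this counts the children of book rather than the sequence starting at book itself, so the five entries here are whitespace, name, whitespace, price, whitespace.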
That is about all. These are my study notes, and writing up a summary is a good way to learn more. The wording may be a little confusing in places. If you find any problem in the text, or have suggestions or comments, please leave me a message. Thank you.
PS: The passage marked [1] (shown in red in the original post) was copied from the web.
[1] source http://blog.csdn.net/leorowe/article/details/6841375