XML data processing efficiency mainly covers parsing, storing, exporting, traversing, modifying, XPath lookups, and so on.
XML has three main access models: DOM, SAX, and Pull.
DOM, the Document Object Model, is the most commonly used way to parse XML. It suits frequent, untargeted random access, as well as XSLT and similar processing. For example, if you need XPath queries or want to traverse the tree, DOM is a good fit. Whether access is read-only or read-write makes little difference to scale or performance. By the way, XSLT is quite powerful.
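As a minimal sketch of the random access and in-place modification that DOM allows, here is an example using Python's standard xml.dom.minidom. The sample document and its element names are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
XML = (
    "<library>"
    "<book id='b1'><title>DOM</title></book>"
    "<book id='b2'><title>SAX</title></book>"
    "</library>"
)

# The whole document is parsed into an in-memory tree up front.
doc = minidom.parseString(XML)

# Random access: jump to any node, in any order.
books = doc.getElementsByTagName("book")
second_title = books[1].getElementsByTagName("title")[0].firstChild.data

# Modification: the tree can be edited after it has been loaded.
books[0].setAttribute("read", "yes")
new_book = doc.createElement("book")
new_book.setAttribute("id", "b3")
doc.documentElement.appendChild(new_book)

book_count = len(doc.getElementsByTagName("book"))
print(second_title, book_count)
```

The price of this convenience is that every node stays in memory for as long as the document object does.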
If the format is basically fixed and you only read forward, with no traversal or just a single pass, SAX is enough.
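A single forward pass over a fixed format might look like the sketch below, which uses Python's standard xml.sax. The handler class name and the feed layout are assumptions made for illustration.

```python
import xml.sax

# Hypothetical fixed-format feed, used only for illustration.
XML = b"<feed><item><title>First</title></item><item><title>Second</title></item></feed>"

class TitleCollector(xml.sax.ContentHandler):
    """Collects the text of every <title> element in one pass."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buf = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buf.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buf))
            self._in_title = False

handler = TitleCollector()
xml.sax.parseString(XML, handler)  # the parser drives; the handler only reacts
print(handler.titles)
```

No tree is built. The handler keeps only the state it needs between callbacks.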
If the format is flexible and you need high efficiency, the Pull model is the right choice. XmlLite is based on the Pull model. Issue 4 of MSDN Magazine has an article about XmlLite.
The reason is that SAX has the reader push all of the content to you, whereas Pull asks the reader for content only when it is needed. When a node does not need processing, the SAX engine still parses it in the background, while Pull only has to do the simplest tag matching to skip it.
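The difference in control flow can be sketched with Python's standard xml.dom.pulldom, where the caller asks for the next event and decides what to ignore. Note that pulldom is itself layered on a SAX parser, so this example only illustrates the pull style of API, not the performance of a native pull parser such as XmlLite. The log format is invented for illustration.

```python
from xml.dom import pulldom

# Hypothetical log document, used only for illustration.
XML = (
    "<log>"
    "<debug>noise</debug>"
    "<error code='7'>disk full</error>"
    "<debug>more noise</debug>"
    "<error code='9'>timeout</error>"
    "</log>"
)

codes = []
# The loop pulls one event at a time; nothing is called back behind our back.
for event, node in pulldom.parseString(XML):
    # A simple tag match decides whether a node is worth any work at all.
    if event == pulldom.START_ELEMENT and node.tagName == "error":
        codes.append(node.getAttribute("code"))

print(codes)
```

Because the caller owns the loop, it can also stop early with a plain break, which a push-style handler cannot do as directly.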
DOM parsing is also implemented on top of SAX, but it has to maintain a complete tree that can be randomly accessed and modified, so memory consumption and parsing cost often exceed expectations. Compared with the SAX/Pull models, the time to load the XML tree and the memory used are basically an order of magnitude or more higher.
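One rough way to check the memory gap on your own data is to compare peak allocations with Python's standard tracemalloc. The sketch below uses a synthetic document and the hypothetical helpers peak_bytes, load_dom, and scan_sax. The exact numbers depend on the Python version and the document, so treat them as an indication only.

```python
import tracemalloc
import xml.sax
from xml.dom import minidom

# Synthetic document (made-up data): 5000 small <row> elements.
XML = (
    "<root>"
    + "".join(f"<row id='{i}'><v>{i}</v></row>" for i in range(5000))
    + "</root>"
).encode("utf-8")

class RowCounter(xml.sax.ContentHandler):
    """Counts <row> elements and keeps nothing else."""

    def __init__(self):
        super().__init__()
        self.rows = 0

    def startElement(self, name, attrs):
        if name == "row":
            self.rows += 1

def peak_bytes(work):
    """Run work() and return (peak traced bytes, result of work)."""
    tracemalloc.start()
    result = work()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak, result

def load_dom():
    # Builds and returns the full tree.
    return minidom.parseString(XML)

def scan_sax():
    # Streams through the document, keeping only a counter.
    counter = RowCounter()
    xml.sax.parseString(XML, counter)
    return counter.rows

dom_peak, dom = peak_bytes(load_dom)
sax_peak, rows = peak_bytes(scan_sax)
print(f"DOM peak: {dom_peak} bytes, SAX peak: {sax_peak} bytes, rows: {rows}")
```

tracemalloc only sees allocations made through Python's allocator, so this is a coarse comparison rather than a precise benchmark.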
In essence, XML is used in two ways: as a document or as a data stream. The former resembles XHTML or OpenDocument, where the structure is complex, nested, and may even contain cross-references. The latter resembles RSS or XMPP, where the format is relatively fixed and only needs to be parsed once. Of course, there are many other usage patterns that have to be analyzed case by case. So which parsing model to choose depends on how you want to define your data model. In fact, even once you have settled on a parsing model, there are many implementations tailored to specific needs. For example, there is a DOM model built on a Pull strategy, which is fast to create and builds nodes on demand, and there is a token-based approach that uses 64-bit marks for fast positioning when reading and modifying, and that can be optimized in hardware.
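The idea of a DOM that is built on demand from a pull stream can be sketched with expandNode from Python's standard xml.dom.pulldom. This is only an illustration of the concept, not the specific implementation the text refers to, and the order document is invented.

```python
from xml.dom import pulldom

# Hypothetical order document, used only for illustration.
XML = (
    "<orders>"
    "<order id='1'><item>pen</item></order>"
    "<order id='2'><item>ink</item><item>paper</item></order>"
    "</orders>"
)

events = pulldom.parseString(XML)
items = []
for event, node in events:
    if (
        event == pulldom.START_ELEMENT
        and node.tagName == "order"
        and node.getAttribute("id") == "2"
    ):
        # Build a DOM subtree for this one order only.
        events.expandNode(node)
        items = [i.firstChild.data for i in node.getElementsByTagName("item")]
        break  # the rest of the stream is never turned into a tree

print(items)
```

Only the order that was asked for becomes a tree with DOM methods. Everything else passes by as plain events.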
From http://blog.csdn.net/goldcattle/archive/2007/04/27/1586514.aspx