XML parsing methods
SAX parsing
SAX (Simple API for XML) is one way of parsing XML and an alternative to DOM.
Compared with DOM, SAX is faster and uses less memory. It reads the document sequentially, parsing as it scans.
Unlike DOM, SAX also lets you stop parsing at any point in the document.
Its advantages and disadvantages are:
Pros: parsing can start immediately, it is fast, and it puts no pressure on memory
Cons: nodes cannot be modified
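To make the streaming behaviour concrete, here is a minimal sketch using the SAX parser that ships with the JDK (javax.xml.parsers), not dom4j. The class name SaxTitleDemo, the StopParsing exception and the firstTitles helper are hypothetical names made up for this illustration. The handler collects title text as the events arrive and aborts the parse by throwing an exception once it has enough, which is one common way to stop early.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxTitleDemo {
    // Sample document for the demo (a bookstore with two books).
    static final String SAMPLE =
            "<bookstore>"
          + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
          + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
          + "</bookstore>";

    /** Hypothetical marker exception, used only to abort parsing early. */
    static final class StopParsing extends SAXException {
        StopParsing() { super("stop requested"); }
    }

    /** Collects <title> text until `limit` titles were seen, then stops the parser. */
    static List<String> firstTitles(String xml, int limit) throws Exception {
        List<String> titles = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <title>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("title".equals(qName)) text = new StringBuilder();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) throws SAXException {
                if ("title".equals(qName)) {
                    titles.add(text.toString().trim());
                    text = null;
                    // Throwing from a callback is how the handler ends the parse early.
                    if (titles.size() >= limit) throw new StopParsing();
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing stop) {
            // Expected: the rest of the document is deliberately left unread.
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstTitles(SAMPLE, 1)); // [Harry Potter]
    }
}
```

Nothing is kept in memory except the titles themselves, which is why memory use stays flat, and also why there is no tree to modify afterwards.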
DOM parsing
DOM (Document Object Model) is the way of handling XML recommended by the W3C.
When parsing an XML document, a DOM parser turns every element in the document into node objects (Node), arranged according to their hierarchical relationship.
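As a rough illustration of that node tree, the sketch below uses the JDK's built-in DOM parser (javax.xml.parsers.DocumentBuilder). The class name DomTreeDemo and the helpers parse and setAllPrices are hypothetical; the sample document is a small bookstore made up for the demo. It loads the whole document, walks the book nodes, and changes each price in place.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomTreeDemo {
    // Sample document for the demo (a bookstore with two books).
    static final String SAMPLE =
            "<bookstore>"
          + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
          + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
          + "</bookstore>";

    /** Parses the whole document into an in-memory node tree. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Sets the <price> text of every <book>; returns how many books were visited. */
    static int setAllPrices(Document doc, String newPrice) {
        NodeList books = doc.getDocumentElement().getElementsByTagName("book");
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            Element price = (Element) book.getElementsByTagName("price").item(0);
            price.setTextContent(newPrice); // modifies the tree in place
        }
        return books.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        setAllPrices(doc, "19.99");
        System.out.println(doc.getElementsByTagName("price").item(0).getTextContent()); // 19.99
    }
}
```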
Its advantages and disadvantages are:
Advantage: the whole XML file is built into a tree structure in memory, so nodes can be traversed and modified
Disadvantage: if the file is large, it puts pressure on memory and parsing takes longer
Reading and writing XML
Reading XML documents with SAXReader
dom4j is a Java XML API, similar to JDOM, used to read and write XML files.
SAXReader is one of dom4j's core APIs for reading XML documents; using it requires importing the dom4j-full.jar package.
XPath path expressions
XPath is a language for finding information in an XML document.
XPath can be used to navigate through the elements and attributes of an XML document.
Path expression syntax:
A slash (/) is the separator inside a path.
The same node can be written in two ways, as an absolute path or as a relative path:
An absolute path must start with "/" followed by the root node, such as /step/step/....
A relative path is any path not written as an absolute path, such as step/step; it does not start with "/".
"." represents the current node.
".." represents the parent node of the current node.
nodename (a node name): selects the child elements of the current node that have this name
"/": selects from the root node
"//": selects matching nodes anywhere in the document, at any depth
"@": selects an attribute
We use the following XML to illustrate:
<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
<book> <title lang="eng">Harry Potter</title> <price>29.99</price> </book>
<book> <title lang="eng">Learning XML</title> <price>39.95</price> </book>
</bookstore>
The commonly used expressions are as shown in the table:
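A few path expressions can be tried against the bookstore document with the sketch below. It uses the JDK's built-in javax.xml.xpath rather than dom4j, purely so that it runs without extra jars; the class name XPathPathDemo and the helpers parse and names are hypothetical. The helper returns the names of the matched nodes, and takes a context node so that relative paths, "." and ".." can be shown as well.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathPathDemo {
    // The bookstore document, written without whitespace between elements.
    static final String SAMPLE =
            "<bookstore>"
          + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
          + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
          + "</bookstore>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
    }

    /** Evaluates expr relative to context and returns the names of the matched nodes. */
    static List<String> names(Node context, String expr) throws Exception {
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, context, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) out.add(nodes.item(i).getNodeName());
        return out;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse();
        Node root = doc.getDocumentElement();                // the <bookstore> element
        System.out.println(names(doc, "/bookstore/book"));   // absolute path: [book, book]
        System.out.println(names(doc, "//price"));           // any depth: [price, price]
        System.out.println(names(doc, "//@lang"));           // attributes: [lang, lang]
        System.out.println(names(root, "book/title"));       // relative path: [title, title]
        System.out.println(names(root, "."));                // current node: [bookstore]
        System.out.println(names(root, "book/title/.."));    // parents of titles: [book, book]
    }
}
```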
Predicates
A predicate is an additional condition attached to a path expression.
Every predicate is written in square brackets "[]" and filters the selected nodes further.
Common predicates are shown in the table:
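To see predicates filter a node set, the following sketch evaluates a handful of them against the bookstore document. It again relies on the JDK's javax.xml.xpath as a stand-in (an assumption made so the example needs no extra jars); XPathPredicateDemo and texts are hypothetical names, and the expressions were chosen for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathPredicateDemo {
    // The bookstore document, written without whitespace between elements.
    static final String SAMPLE =
            "<bookstore>"
          + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
          + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
          + "</bookstore>";

    /** Returns the text content of every node that expr selects in SAMPLE. */
    static List<String> texts(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) out.add(nodes.item(i).getTextContent());
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(texts("/bookstore/book[1]/title"));            // first book: [Harry Potter]
        System.out.println(texts("/bookstore/book[last()]/title"));       // last book: [Learning XML]
        System.out.println(texts("//title[@lang='eng']"));                // by attribute value: both titles
        System.out.println(texts("/bookstore/book[price>35.00]/title"));  // by child value: [Learning XML]
    }
}
```

Note that positions in XPath start at 1, so book[1] is the first book, not the second.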
Wildcards
The wildcards are used as follows:
"*" matches any element node.
"@*" matches any attribute node.
node() matches a node of any type.
Common wildcard usages are as follows:
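The wildcards can be exercised the same way. This is a small sketch on the JDK's javax.xml.xpath (an assumption, chosen to avoid extra dependencies); XPathWildcardDemo and names are hypothetical, and the sample document contains no whitespace between elements so that node() results are easy to read.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathWildcardDemo {
    // The bookstore document, written without whitespace between elements.
    static final String SAMPLE =
            "<bookstore>"
          + "<book><title lang=\"eng\">Harry Potter</title><price>29.99</price></book>"
          + "<book><title lang=\"eng\">Learning XML</title><price>39.95</price></book>"
          + "</bookstore>";

    /** Returns the node name of every node that expr selects in SAMPLE. */
    static List<String> names(String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) out.add(nodes.item(i).getNodeName());
        return out;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(names("/bookstore/*"));                   // child elements: [book, book]
        System.out.println(names("//*").size());                     // every element in the document: 7
        System.out.println(names("//title[@*]"));                    // titles with any attribute: [title, title]
        System.out.println(names("//@*"));                           // every attribute: [lang, lang]
        System.out.println(names("/bookstore/book[1]/title/node()")); // text counts as a node: [#text]
    }
}
```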
dom4j support for XPath
dom4j supports using XPath to retrieve XML content. To use it, add the Jaxen jar to the project; the jar is named like jaxen-xx-xx.jar, and the exact name varies slightly between versions.
If this jar is not referenced, the program throws:
java.lang.NoClassDefFoundError: org/jaxen/JaxenException
Document provides a method for querying by XPath:
List selectNodes(String xpath)
It returns the nodes that match the XPath expression passed in.