Original: http://ejohn.org/blog/xpath-css-selectors
Lately, I've been doing a lot of work on a parser that supports both XPath and CSS 3, and I was surprised to find that the two are very similar in some respects and completely different in others. For example, CSS is tuned for HTML: you can use #id to get an element by its ID and .class to get elements by class, neither of which XPath can express as concisely. XPath, in turn, can use .. to move back up to a parent node in the DOM tree, and foo[bar] to get a foo element that has a bar child element, which CSS selectors cannot do at all. In short, compared with XPath, CSS selectors are usually shorter but, unfortunately, not as powerful.
I think it's valuable to make a comparison between the two types of selectors.
Target | CSS 3 | XPath
All elements | * | //*
All p elements | p | //p
All child elements of p elements | p > * | //p/*
Element by ID | #foo | //*[@id='foo']
Elements by class | .foo | //*[contains(@class,'foo')] (note 1)
Elements with an attribute | *[title] | //*[@title]
First child element of every p element | p > *:first-child | //p/*[1]
All p elements that have a child element a | not possible | //p[a]
Next sibling element | p + * | //p/following-sibling::*[1]
Syntactically, I was surprised by how similar the two selector languages are in some cases, especially '>' and '/'. Although they do not always mean the same thing (in XPath it depends on the axis being used), both usually refer to the child elements of a parent element. Likewise, the space character ' ' and '//' both mean all descendants of the current element. Finally, the asterisk '*' is a wildcard in both, matching every element regardless of tag name.
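The correspondence described above ('>' to '/', a space to '//', and '*' in both) can be sketched as a tiny translator. This is only an illustration under stated assumptions: cssToXPath is a hypothetical helper, not part of any library, and it handles only tag names, *, #id, [attr], and the two combinators.

```javascript
// Minimal sketch: convert a tiny subset of CSS selectors to XPath 1.0.
// Supported: tag names, *, #id, [attr], and the ' ' and '>' combinators.
// cssToXPath is a hypothetical name used for illustration only.
function cssToXPath(selector) {
  // Put spaces around '>' so that splitting on whitespace yields clean tokens.
  var tokens = selector.trim().replace(/\s*>\s*/g, ' > ').split(/\s+/);
  var xpath = '';
  var axis = '//'; // the first step searches the whole document
  tokens.forEach(function (token) {
    if (token === '>') {
      axis = '/'; // child combinator maps to a single slash
      return;
    }
    var match = /^([a-zA-Z][\w-]*|\*)?(?:#([\w-]+))?(?:\[([\w-]+)\])?$/.exec(token);
    if (!match || token === '') {
      throw new Error('Unsupported selector part: ' + token);
    }
    var step = match[1] || '*'; // no tag name means "any element"
    if (match[2]) step += "[@id='" + match[2] + "']";
    if (match[3]) step += '[@' + match[3] + ']';
    xpath += axis + step;
    axis = '//'; // whitespace between parts means "descendant"
  });
  return xpath;
}
```

For example, cssToXPath('p > *') and cssToXPath('#foo') produce the XPath forms listed in the table. Class selectors are deliberately left out, because their correct translation is more involved.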
Note 1: This is actually incorrect, because it matches not only the 'foo bar' we want but also, accidentally, 'foobar'. The correct expression is considerably more complex and may need several sub-expressions. Translator's note: the incorrect XPath in the table above is:
*[contains(@class,'foo')]
My own version is:
*[@class='foo' or contains(@class,' foo ') or starts-with(@class,'foo ') or substring(@class,string-length(@class)-3)=' foo']
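Because the position of each space inside the quotes matters, it can help to generate the expression instead of typing it by hand. The sketch below assumes the class name contains no quotes or whitespace. Both function names are hypothetical, and the second builder uses the widely used concat/normalize-space idiom, which is an alternative and not the expression from this post.

```javascript
// Hypothetical helper: build the four-clause XPath 1.0 predicate for one class name.
// Assumes `name` contains no quotes or whitespace.
function classPredicate(name) {
  return [
    "@class='" + name + "'",                   // the only value
    "contains(@class,' " + name + " ')",       // in the middle, spaces on both sides
    "starts-with(@class,'" + name + " ')",     // leftmost, followed by a space
    // rightmost: the last (name.length + 1) characters must be ' ' + name
    'substring(@class,string-length(@class)-' + name.length + ")=' " + name + "'"
  ].join(' or ');
}

// Alternative idiom (not from the post): pad the normalized attribute with spaces
// so that a single contains() test covers all four cases.
function classPredicateNormalized(name) {
  return "contains(concat(' ',normalize-space(@class),' '),' " + name + " ')";
}
```

With these, '*[' + classPredicate('foo') + ']' reproduces the expression above, and '//*[' + classPredicateNormalized('foo') + ']' is a shorter query that also tolerates tabs and repeated spaces in the attribute value.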
Compared with the CSS .foo, this is really complicated. To explain: when an element's class attribute contains foo, there are four possible cases, listed in the table below:
class attribute | XPath | Case
class="foo" | *[@class='foo'] | foo is the only value of the class attribute
class="foobar foo bar" | *[contains(@class,' foo ')] | foo is in the middle, with other values on both sides
class="foo bar" | *[starts-with(@class,'foo ')] | foo is the leftmost value
class="bar foo" | *[substring(@class,string-length(@class)-3)=' foo'] | foo is the rightmost value; XPath 1.0 has no ends-with function (XPath 2.0 does), and browsers currently implement 1.0
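To see why each of the four cases needs its own test, the same checks can be mirrored with plain JavaScript string operations. This is a sketch only: hasClassFourCases is a hypothetical name, and the code imitates the logic of the XPath clauses without evaluating any XPath.

```javascript
// Hypothetical helper mirroring the four XPath clauses with string operations.
function hasClassFourCases(classAttr, name) {
  // Last (name.length + 1) characters, like substring(@class, string-length(@class) - n).
  var tail = classAttr.slice(-(name.length + 1));
  return classAttr === name ||                       // only value
    classAttr.indexOf(' ' + name + ' ') !== -1 ||    // in the middle
    classAttr.indexOf(name + ' ') === 0 ||           // leftmost
    tail === ' ' + name;                             // rightmost
}

// The naive test from the first table, for comparison.
function hasClassNaive(classAttr, name) {
  return classAttr.indexOf(name) !== -1;
}
```

For the value "foobar", hasClassNaive returns true while hasClassFourCases returns false, which is exactly the accidental match described in note 1.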
So can we use XPath in web development? Early on, jQuery supported XPath selectors, but it later dropped that support for performance reasons. As it happens, last month Google released Wicked Good XPath, a pure-JavaScript implementation of the DOM Level 3 XPath specification that is also the fastest of its kind, and we can combine this script with jQuery.
// Load the library file (.done replaces .success, which was removed in jQuery 3.0)
jQuery.getScript("http://wicked-good-xpath.googlecode.com/files/wgxpath.install.js").done(function () {
    wgxpath.install(); // Install XPath support
    jQuery.xpath = function (xpath) {
        var elements = []; // Stores the elements selected by XPath
        // 6 is the value of XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE
        var xpathResult = document.evaluate(xpath, document, null, 6, null);
        for (var i = 0; i < xpathResult.snapshotLength; i++) {
            elements.push(xpathResult.snapshotItem(i));
        }
        return jQuery(elements); // Pass to the jQuery factory method to get a jQuery object
    };
});
This makes it possible to select elements with the static $.xpath() method, which returns a jQuery object just as $() does. The original page of this post loads the script, so you can open the console there and try out $.xpath.
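The snapshot loop does not depend on jQuery, so it can also be written as a standalone helper that takes the document as a parameter. This is a minimal sketch: xpathToArray is a hypothetical name, and it requests an ordered snapshot (type 7) where the code above uses the unordered type 6, so results come back in document order.

```javascript
// Hypothetical helper (not part of jQuery or Wicked Good XPath).
// `doc` is any object exposing the DOM Level 3 XPath `evaluate` method.
function xpathToArray(doc, expression) {
  var ORDERED_NODE_SNAPSHOT_TYPE = 7; // value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
  var result = doc.evaluate(expression, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  var nodes = [];
  for (var i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i)); // copy each node out of the snapshot
  }
  return nodes;
}
```

In a browser, jQuery(xpathToArray(document, "//p[a]")) would wrap the matched elements in a jQuery object, assuming jQuery is loaded on the page.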
Since we already have CSS selectors, why use XPath? The answer: sometimes XPath is a bit more powerful. For example:
The table John Resig put together includes one thing CSS cannot do: find a parent element that contains a given child element. That is indeed unavailable in CSS today, but the future CSS4 selectors will include a parent selector:
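E! > F
Note: in 2011 the parent selector syntax was $E > F; the draft changed again this year. Some blog posts introducing CSS4 selectors are still out of date. Here is a polyfill that makes the parent selector usable in CSS files: https://github.com/Idered/cssParentSelector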
This selector selects the E elements that contain a child element F. But even once CSS4 is implemented, change the requirement slightly to find the E elements that contain a descendant element F: how would you write that as a CSS selector? There is probably no way. Readers familiar with jQuery may point out that jQuery has the :has pseudo-class, so you can write E:has(F). Indeed, with jQuery's custom filters almost any requirement can be met by traversing the DOM, but that is very inefficient. XPath is different: both Firefox and Chrome implement the XPath interface natively through the document.evaluate method (Wicked Good XPath is presumably aimed mainly at providing the same interface on IE), so it is certainly faster than traversing the DOM by hand. In XPath this is written //E[.//F], which is pretty straightforward.
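To make the meaning of //E[.//F] concrete, here is a sketch of the manual traversal it replaces. It runs on a simplified tree of plain objects with the hypothetical shape {tag, children}, not on real DOM nodes, and the function names are made up for illustration.

```javascript
// Returns true if any descendant of `node` has the given tag.
function hasDescendant(node, tag) {
  return (node.children || []).some(function (child) {
    return child.tag === tag || hasDescendant(child, tag);
  });
}

// Collects, in document order, every node tagged `outer` that has a
// descendant tagged `inner`: the hand-written equivalent of //outer[.//inner].
function selectWithDescendant(root, outer, inner) {
  var found = [];
  (function walk(node) {
    if (node.tag === outer && hasDescendant(node, inner)) {
      found.push(node);
    }
    (node.children || []).forEach(function (child) { walk(child); });
  })(root);
  return found;
}
```

Each candidate node triggers its own walk over its subtree, which hints at why traversing by hand becomes costly on large documents.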
Another important point is that CSS exists to style HTML. Of the 12 node types, only element nodes (nodeType equal to 1) can be styled, so CSS selectors can only select element nodes in a page. XPath has no such limit: it works on XML as well as HTML, and besides element nodes it can select attribute nodes (//@*), text nodes (//text()), and so on. If XPath 2.0 is implemented in the future, it will become more powerful still.