Introduction to the Sizzle selector engine
I. Preface
Sizzle was originally the selector engine inside jQuery and later became an independent module that can be freely dropped into other class libraries. I have used it as a module in YUI3 without any obstacle. Sizzle's development can be divided roughly into two stages, with jQuery 1.8 as the watershed. The later versions introduce the concept of function compilation; the Sizzle source code becomes more difficult to read, is no longer compatible with earlier browsers, and looks more fragmented. This article studies the Sizzle in jQuery 1.7, the final version of the first stage, and I gained a lot from it: on the one hand the framework design ideas, on the other hand the programming techniques.
II. jQuery constructor
Sizzle comes from jQuery, and jQuery is a DOM-based class library. Before studying Sizzle, it is worth looking at the overall structure of jQuery:
    (function (window, undefined) {
        var jQuery = function (selector, context) {
            return new jQuery.fn.init(selector, context, rootjQuery);
        };
        jQuery.fn = jQuery.prototype = {...};
        jQuery.fn.init.prototype = jQuery.fn;
        // utility methods
        // Deferred
        // support
        // Data cache
        // queue
        // Attribute
        // Event
        // Sizzle, about 2k lines
        // DOM
        // css operation
        // Ajax
        // animation
        // position calculation
        window.jQuery = window.$ = jQuery;
    })(window);
jQuery has a strong engineering nature. A single interface can process multiple types of input, which is the main reason jQuery is so easy to use. Correspondingly, the internal implementation of an API that does this much is quite complicated. To understand the relationship between jQuery and Sizzle, start with the jQuery constructor. Once its processing logic is sorted out, the picture is clear: as the table summarizes, the constructor has to handle 6 categories of input, but the Sizzle selector engine is called only when the input is a selector expression.
III. Sizzle design ideas
For a complex selector expression (assuming the browser does not support querySelectorAll), how can we process it?
3.1 Segmentation analysis
For complex selector expressions, the native APIs cannot parse the whole expression directly, but they can handle parts of it. So it is natural to work locally first and then on the whole: split the complex selector expression into block expressions and inter-block relationships. We can see that:
1. The selector expression is split according to the relationships between blocks.
2. Block expressions can contain many pseudo-class expressions, which is a highlight of Sizzle. In addition, pseudo-classes can be customized, which demonstrates a strong engineering nature.
3. A split block expression may itself be a combination of a simple selector, an attribute selector, and a pseudo-class expression, such as div.a or .a[name="beijing"].
3.2 Block expression search
After the expression is split into block expressions, the next step is to find the result set. As noted in 3.1, a block expression may itself still be a compound selector. How do we deal with such a combined block expression?
A. Search based on API performance: for developers, code efficiency is an eternal topic, so the lookup is naturally based on whichever part of the expression performs best. Among the DOM APIs, the order is ID > Class > Name > Tag.
B. Intra-block filtering: the previous step queries with only one part of the block expression, so the resulting set is too large and contains elements that do not meet the remaining conditions. The element set therefore has to be filtered within the block.
Conclusion: this stage consists of two steps, search + [filter]. Simple block expressions do not need the filter step.
3.3 Inter-block relationship processing
After finding the basic element set of a block, how do we deal with inter-block relationships? There are two possible orders for a complex selector expression:
- From left to right: traverse the obtained set, searching inside each element to obtain a new element set. As long as blocks remain, the set has to be searched and filtered again. In short: multiple searches and filters.
- From right to left: the first set obtained contains the final elements plus redundant elements that do not meet the conditions. The following steps keep filtering this set and removing the non-conforming elements.
For the "adjacent sibling relationship (+)" and the "subsequent sibling relationship (~)", the order does not matter and there is no difference in efficiency. The "parent-child relationship" and the "ancestor-descendant relationship" are different. For these, Sizzle chooses right-to-left, which can be explained along the following two dimensions:
A. Design ideas
- Left to right: keeps querying, narrowing the context, and continuously getting new element sets.
- Right to left: one query and multiple filters. The element set obtained by the first query is reduced step by step until the final set remains.
B. DOM tree
- Left to right: moving from the top of the DOM toward the bottom, you need to keep traversing children or descendants, and the number of children or descendants of an element node is unknown and may be large.
- Right to left: moving from the bottom of the DOM toward the top, you need to keep traversing parents or ancestors, and the number of parents or ancestors of an element is fixed and limited.
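The "one query, multiple filters" idea can be sketched without a browser. The following is only an illustration under stated assumptions: plain objects stand in for DOM nodes, and the helper names (node, hasAncestor, filterByAncestor) are hypothetical, not part of Sizzle.

```javascript
// Plain objects stand in for DOM nodes (an assumption so the sketch runs
// without a browser); only tag, className and parentNode are modelled.
function node(tag, className, parent) {
    return { tag: tag, className: className || "", parentNode: parent || null };
}

// True if some ancestor of elem satisfies test. The walk is bounded by
// the depth of the tree, which is why the upward direction is cheap.
function hasAncestor(elem, test) {
    for (var p = elem.parentNode; p; p = p.parentNode) {
        if (test(p)) {
            return true;
        }
    }
    return false;
}

// Right-to-left evaluation of an "ancestor descendant" expression:
// one query for the rightmost block, then filtering by walking upwards.
function filterByAncestor(allNodes, ancestorClass, tag) {
    var candidates = allNodes.filter(function (n) {
        return n.tag === tag;
    });
    return candidates.filter(function (n) {
        return hasAncestor(n, function (p) {
            return p.className === ancestorClass;
        });
    });
}

var root = node("body");
var box = node("div", "content", root);
var other = node("div", "", root);
var list = node("ul", "", box);
var li1 = node("li", "", list);
var li2 = node("li", "", other);
var all = [root, box, other, list, li1, li2];

var matched = filterByAncestor(all, "content", "li");
console.log(matched.length); // 1
```

Each candidate is checked by following at most a handful of parentNode links, whereas a top-down search would have to visit every descendant of every matching ancestor.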
However, right-to-left goes against our habits. Can it go wrong? The answer is yes, it can produce errors. Consider the following simple DOM tree:
    <div><p>aa</p></div>
    <div class="content"><p>bb</p><p>cc</p></div>
What is the element set of $('.content > p:first')?
Split first: '.content > p:first' ---> ['.content', '>', 'p:first']
Right-to-left search: A = $('p:first') = ['<p>aa</p>']
Filter: A.parent().isContainClass('content') ---> null
The example shows that when the selector expression contains a position pseudo-class, right-to-left produces a wrong result. There is no way around this: accuracy comes first, so the only choice is left-to-right.
Conclusion: for accuracy, selectors with a position pseudo-class can only be processed from left to right.
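The two orders can be compared on the same tree with a small sketch. Plain objects stand in for DOM nodes here, and the function names (el, rightToLeft, leftToRight) are hypothetical helpers for this illustration only, not Sizzle code.

```javascript
// Plain objects stand in for DOM nodes (an assumption so the sketch runs
// without a browser). Nodes are listed in document order.
function el(tag, className, text, parent) {
    return { tag: tag, className: className, text: text, parentNode: parent };
}

var div1 = el("div", "", "", null);
var aa = el("p", "", "aa", div1);
var div2 = el("div", "content", "", null);
var bb = el("p", "", "bb", div2);
var cc = el("p", "", "cc", div2);
var doc = [div1, aa, div2, bb, cc];

function isP(n) {
    return n.tag === "p";
}

// Right to left: resolve 'p:first' against the whole document, then
// check the '>' relationship with '.content'.
function rightToLeft(nodes) {
    var firstP = nodes.filter(isP).slice(0, 1);
    return firstP.filter(function (n) {
        return n.parentNode !== null && n.parentNode.className === "content";
    });
}

// Left to right: resolve '.content', take its child <p> elements, and
// only then apply ':first' to that narrowed set.
function leftToRight(nodes) {
    var parents = nodes.filter(function (n) {
        return n.className === "content";
    });
    var children = nodes.filter(function (n) {
        return isP(n) && parents.indexOf(n.parentNode) !== -1;
    });
    return children.slice(0, 1);
}

console.log(rightToLeft(doc).length);  // 0
console.log(leftToRight(doc)[0].text); // "bb"
```

The position pseudo-class depends on the set it is applied to, so applying it before the context has been narrowed picks the wrong element.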
IV. Implementation of Sizzle
4.1 Overall structure of Sizzle
    if (document.querySelectorAll) {
        sizzle = function (query, context) {
            return makeArray(context.querySelectorAll(query));
        };
    } else {
        // Sizzle engine implementation, mainly simulating querySelectorAll
    }
The code above shows that the Sizzle selector engine essentially emulates the querySelectorAll API for compatibility. If every browser supported this API, there would be no need for Sizzle.
Key functions:
Sizzle = function (selector, context, result, seed): entry function of the Sizzle engine
Sizzle.find: main lookup function
Sizzle.filter: main filter function
Sizzle.selectors.relative: set of inter-block relationship handlers {"+": function () {}, "": function () {}, ">": function () {}, "~": function () {}}
4.2 Segmentation analysis
chunker = /((?:\((?:\([^()]+\)|[^()]+)+\)|\[(?:\[[^\[\]]*\]|['"][^'"]*['"]|[^\[\]'"]+)+\]|\\.|[^ >+~,(\[\\]+)+|[>+~])(\s*,\s*)?((?:.|\r|\n)*)/g
With this regular expression, a complex and varied selector expression can be split into several block expressions and the relationships between them. Block expressions are a neat technique: they let a complex problem be abstracted into simpler ones. The disadvantage of regular expressions is that they are hard to read and maintain, so a graphical breakdown helps when analyzing this one:
Let's take a look at how it is implemented:
    do {
        chunker.exec(""); // resets chunker.lastIndex to 0
        m = chunker.exec(soFar);
        if (m) {
            soFar = m[3];
            parts.push(m[1]);
            if (m[2]) {
                extra = m[3];
                break;
            }
        }
    } while (m);
For example, $('#J-con ul>li:gt(2)') is parsed as:
    parts = ["#J-con", "ul", ">", "li:gt(2)"]
    extra = undefined
$('#J-con ul>li:gt(2), div.menu') is parsed as:
    parts = ["#J-con", "ul", ">", "li:gt(2)"]
    extra = 'div.menu'
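The loop above can be wrapped into a self-contained function to try it out. This is a sketch, not the Sizzle source: splitSelector is a hypothetical name, and it resets lastIndex directly instead of calling chunker.exec(""), which should have the same effect.

```javascript
// The chunker regular expression exactly as quoted in section 4.2.
var chunker = /((?:\((?:\([^()]+\)|[^()]+)+\)|\[(?:\[[^\[\]]*\]|['"][^'"]*['"]|[^\[\]'"]+)+\]|\\.|[^ >+~,(\[\\]+)+|[>+~])(\s*,\s*)?((?:.|\r|\n)*)/g;

// splitSelector is a hypothetical wrapper around the do/while loop.
// It returns the block expressions and relationships in parts, and the
// text after the first top-level comma in extra.
function splitSelector(selector) {
    var parts = [];
    var extra;
    var soFar = selector;
    var m;

    do {
        chunker.lastIndex = 0; // the regex is global, so reset its position
        m = chunker.exec(soFar);
        if (m) {
            soFar = m[3];      // remainder still to be split
            parts.push(m[1]);  // one block expression or relationship
            if (m[2]) {        // a comma: the rest is a separate selector
                extra = m[3];
                break;
            }
        }
    } while (m);

    return { parts: parts, extra: extra };
}

console.log(splitSelector("#J-con ul>li:gt(2)").parts);
// [ '#J-con', 'ul', '>', 'li:gt(2)' ]
console.log(splitSelector("#J-con ul>li:gt(2), div.menu").extra);
// div.menu
```

Note that the descendant relationship (a space) does not appear in parts; two adjacent block expressions imply it.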
4.3 Block expression processing
4.3.1 Intra-block search
The search stage uses Sizzle.find. The main logic is as follows:
- The search order is determined by DOM API performance: ID > Class > Name > Tag. Whether the browser supports getElementsByClassName must also be considered.
- Expr.leftMatch: determines the type of the block expression
- Expr.find: the actual search implementation
- Result: {set: result set, expr: remaining part of the block expression, used for intra-block filtering in the next step}
    // Expr.order = ["ID", ["CLASS"], "NAME", "TAG"]
    for (i = 0, len = Expr.order.length; i < len; i++) {
        ……
        if ((match = Expr.leftMatch[type].exec(expr))) {
            set = Expr.find[type](match, context);
            expr = expr.replace(Expr.match[type], "");
        }
    }
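To make the "search by the fastest part, keep the rest for filtering" step concrete, here is a minimal sketch that works on strings only. The patterns are simplified assumptions for illustration and are not Sizzle's real Expr.match expressions; planSearch is a hypothetical name.

```javascript
// Search order from fastest to slowest DOM API, as described above.
var order = ["ID", "CLASS", "NAME", "TAG"];

// Simplified patterns, assumed for this sketch only.
var patterns = {
    ID: /#([\w-]+)/,
    CLASS: /\.([\w-]+)/,
    NAME: /\[name=['"]?([\w-]+)['"]?\]/,
    TAG: /^([\w*-]+)/
};

// planSearch picks the first type in order that occurs in the block
// expression and returns what to search for plus the remaining
// expression, which is left for intra-block filtering.
function planSearch(expr) {
    for (var i = 0; i < order.length; i++) {
        var type = order[i];
        var match = patterns[type].exec(expr);
        if (match) {
            return {
                type: type,
                value: match[1],
                rest: expr.replace(patterns[type], "")
            };
        }
    }
    // Nothing recognisable: fall back to all elements and filter everything.
    return { type: "TAG", value: "*", rest: expr };
}

console.log(planSearch("div.a"));
// { type: 'CLASS', value: 'a', rest: 'div' }
console.log(planSearch("div#x.a").rest);
// div.a
```

For div.a the class lookup wins over the tag lookup, and the leftover "div" becomes the filter condition for the next step.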
4.3.2 Intra-block filtering
This step is carried out by Sizzle.filter. This API performs not only intra-block filtering but also inter-block filtering, which is determined by the inplace parameter. The main logic is as follows:
- Expr.filter: {PSEUDO, CHILD, ID, TAG, CLASS, ATTR, POS}, the types of selector expression
- Expr.preFilter: pre-processing before filtering, to normalize the format
- Expr.filter: the object holding the actual filter implementations
- inplace = false: a new array is returned. From right to left, inplace = true: the original element set is filtered in place.
    Sizzle.filter = function (expr, set, inplace, not) {
        for (type in Expr.filter) {
            // filter: {PSEUDO, CHILD, ID, TAG, CLASS, ATTR, POS}
            // Expr.leftMatch: determine the selector type
            if ((match = Expr.leftMatch[type].exec(expr)) != null && match[2]) {
                // pre-processing before filtering, to normalize the format
                match = Expr.preFilter[type](match, curLoop, inplace, result, not, isXMLFilter);
                // perform the filter operation
                found = Expr.filter[type](item, match, i, curLoop);
                // inplace = true: mark rejected items in the original set;
                // otherwise collect the passing items into a new array
                if (inplace && found != null) {
                    if (pass) {
                        anyFound = true;
                    } else {
                        curLoop[i] = false;
                    }
                } else if (pass) {
                    result.push(item);
                }
            }
        }
    };
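The effect of the inplace flag is easier to see in isolation. The following is a simplified sketch, assuming a plain predicate function instead of a selector expression; filterSet is a hypothetical stand-in and not the real Sizzle.filter.

```javascript
// filterSet is a hypothetical, simplified stand-in for Sizzle.filter.
// predicate replaces the selector expression used by the real function.
function filterSet(set, predicate, inplace) {
    var result = [];
    for (var i = 0; i < set.length; i++) {
        var item = set[i];
        if (!item) {
            continue; // slot already rejected by an earlier pass
        }
        var pass = predicate(item);
        if (inplace) {
            if (!pass) {
                set[i] = false; // keep the slot, mark the element as rejected
            }
        } else if (pass) {
            result.push(item);  // build a fresh array of survivors
        }
    }
    return inplace ? set : result;
}

function isParagraph(item) {
    return item.tag === "p";
}

var items = [{ tag: "p" }, { tag: "div" }, { tag: "p" }];

var copy = filterSet(items, isParagraph, false);
console.log(copy.length, items.length); // 2 3

filterSet(items, isParagraph, true);
console.log(items[1]); // false
```

Keeping rejected slots as false preserves the index of every element, which matters when a parallel array is being narrowed step by step during right-to-left processing.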
4.4 Inter-block relationship processing
4.4.1 Determining the processing order
If the following regular expression matches, the selector contains a position pseudo-class, and the left-to-right order must be used to guarantee a correct result. Otherwise, the right-to-left order can be used for efficiency.
origPOS = /:(nth|eq|gt|lt|first|last|even|odd)(?:\((\d*)\))?(?=[^\-]|$)/
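As a small sketch of how this check could drive the decision, the function below wraps the regular expression quoted above. processingOrder is a hypothetical name, and the real engine may apply further conditions that are not shown in this article.

```javascript
// The position pseudo-class pattern exactly as quoted above.
var origPOS = /:(nth|eq|gt|lt|first|last|even|odd)(?:\((\d*)\))?(?=[^\-]|$)/;

// processingOrder is a hypothetical helper: it only reports which order
// the rule in this section would select for a given selector.
function processingOrder(selector) {
    return origPOS.test(selector) ? "left-to-right" : "right-to-left";
}

console.log(processingOrder(".content > p:first")); // left-to-right
console.log(processingOrder("#J-con ul > li"));     // right-to-left
console.log(processingOrder("p:first-child"));      // right-to-left
```

The trailing lookahead is what keeps :first-child and :nth-child from being mistaken for the position pseudo-classes :first and :nth.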
4.4.2 Left-to-right processing
First, query based on the first element of parts. Then traverse the resulting element set and use the position pseudo-class handler posProcess on each remaining part until the parts array is empty.
    // parts is the array produced by the split
    set = Expr.relative[parts[0]] ? [context] : Sizzle(parts.shift(), context);
    // traverse the element set repeatedly, searching again each time
    while (parts.length) {
        selector = parts.shift();
        ......
        set = posProcess(selector, set, seed);
    }
Next, look at the internal logic of posProcess. If the expression contains a position pseudo-class (such as p:first), no DOM API can handle the pseudo-class (:first) directly. So we first strip the pseudo-class and query with the remaining part (p), which gives an element set that ignores the pseudo-class. Finally, the pseudo-class is used as the condition to filter that element set.
    // Left-to-right handling of position pseudo-classes
    var posProcess = function (selector, context, seed) {
        var match,
            tmpSet = [],
            later = "",
            root = context.nodeType ? [context] : context;
        // strip the position pseudo-classes first and save them in later
        while ((match = Expr.match.PSEUDO.exec(selector))) {
            later += match[0];
            selector = selector.replace(Expr.match.PSEUDO, "");
        }
        selector = Expr.relative[selector] ? selector + "*" : selector;
        // search without the pseudo-classes
        for (var i = 0, l = root.length; i < l; i++) {
            Sizzle(selector, root[i], tmpSet, seed);
        }
        // use the position pseudo-classes as the condition to filter the result set
        return Sizzle.filter(later, tmpSet);
    };
4.4.3 Right-to-left processing
In fact, Sizzle is not purely right-to-left. If the leftmost part of the selector expression is an #id selector, Sizzle evaluates that part first and uses the result as the execution context for the following steps, which narrows the context. The design is quite thorough.
    // Pseudocode: if the leftmost part of the selector expression is #id,
    // evaluate it first to narrow the execution context
    if (parts[0] is #id) {
        context = Sizzle.find(parts.shift(), context).set[0];
    }
    if (context) {
        // get the element set of the rightmost block expression
        ret = Sizzle.find(parts.pop(), context);
        // filter the element set just obtained within the block
        set = Sizzle.filter(ret.expr, ret.set);
        // inter-block filtering
        while (parts.length) {
            pop = parts.pop();
            ......
            Expr.relative[cur](checkSet, pop);
        }
    }
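The last call in the loop hands each remaining relationship to a handler in Expr.relative. As a rough sketch of what such a handler does for the adjacent sibling relationship when the left-hand part is a plain tag name, consider the following. Plain objects stand in for DOM nodes, and makeSiblings and adjacentByTag are hypothetical names used only here.

```javascript
// Plain objects stand in for DOM nodes (an assumption so the sketch runs
// without a browser). Each node only knows its previous sibling.
function makeSiblings(specs) {
    var nodes = specs.map(function (s) {
        return { nodeType: s.nodeType, nodeName: s.nodeName, previousSibling: null };
    });
    for (var i = 1; i < nodes.length; i++) {
        nodes[i].previousSibling = nodes[i - 1];
    }
    return nodes;
}

// adjacentByTag replaces every entry of checkSet with its previous
// element sibling if that sibling has the given tag, or with false.
function adjacentByTag(checkSet, tag) {
    tag = tag.toLowerCase();
    for (var i = 0; i < checkSet.length; i++) {
        var elem = checkSet[i];
        if (elem) {
            // skip text and comment nodes until an element (nodeType 1) is found
            while ((elem = elem.previousSibling) && elem.nodeType !== 1) {}
            checkSet[i] = elem && elem.nodeName.toLowerCase() === tag ? elem : false;
        }
    }
    return checkSet;
}

var siblings = makeSiblings([
    { nodeType: 1, nodeName: "H1" },
    { nodeType: 3, nodeName: "#text" },
    { nodeType: 1, nodeName: "P" },
    { nodeType: 1, nodeName: "P" }
]);

// Both <p> elements are candidates for "h1 + p"; only the first survives.
var checkSet = adjacentByTag([siblings[2], siblings[3]], "h1");
console.log(checkSet[0] === siblings[0], checkSet[1]); // true false
```

After the handler runs, each slot holds the element that the next relationship to the left should be checked against, so the same array can be passed on to the next handler.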
Inter-block filtering is mainly done by Expr.relative. The logic is: check whether the selector part is a plain tag name. If so, compare nodeName directly, which is more efficient; otherwise, Sizzle.filter has to be called. The adjacent sibling relationship serves as an example:
    "+": function (checkSet, part) {
        var isPartStr = typeof part === "string",
            // check whether the selector is a tag selector
            isTag = isPartStr && !rNonWord.test(part),
            isPartStrNotTag = isPartStr && !isTag;
        if (isTag) {
            part = part.toLowerCase();
        }
        for (var i = 0, l = checkSet.length, elem; i < l; i++) {
            if ((elem = checkSet[i])) {
                while ((elem = elem.previousSibling) && elem.nodeType !== 1) {}
                checkSet[i] = isPartStrNotTag || elem && elem.nodeName.toLowerCase() === part ?
                    elem || false :
                    elem === part;
            }
        }
        if (isPartStrNotTag) {
            Sizzle.filter(part, checkSet, true);
        }
    }
4.5 Scalability
Another major feature of Sizzle is that selectors can be customized, although this is limited to pseudo-classes. This is another sign of Sizzle's strong engineering nature:
    $.extend($.selectors.filters, {
        hasLi: function (elem) {
            return $(elem).find('li').size() > 0;
        }
    });
    var e = $('#J-con :hasLi');
    console.log(e.size()); // 1
$.extend combines what augment, extend, and mix do in YUI3, and it is quite powerful. You only need to extend the $.selectors.filters object (that is, Sizzle.selectors.filters). Each property is a function whose return value is a Boolean that tells whether the element matches the pseudo-class.
How do you use the Sizzle selector?
Sizzle("#test") returns an array of elements, much like the set that jQuery gives you. Sizzle("#test")[0], or each item you reach by traversing the array, is the DOM element itself, and that is where the value can be read. Even if the array has length = 1, you cannot read the value from the array directly.