jQuery 3.1.1 Source Code Interpretation (IV): tokenize Lexical Analysis

Source: Internet
Author: User
Lexical analysis with tokenize

Strictly speaking, lexical analysis is a term from compiler construction, and it feels slightly out of place here, but the tokenize function in Sizzle really does the work of a lexical analyzer.

The previous chapter covered how Sizzle is used: it is actually the jQuery.find function, with jQuery.fn.find involved as well. jQuery.find takes care to handle the simple cases #id, .class, and TagName directly, using the regular expression rquickExpr to pick them apart, and if the browser supports querySelectorAll, so much the better.

The harder cases are CSS-style selectors such as div > div.seq h2 ~ p, #id p. Matching them from left to right is inefficient; matching from right to left improves efficiency.

This chapter introduces the tokenize function and shows how it turns a complex selector into tokens.

Take div > div.seq h2 ~ p, #id p as an example. It is a fairly simple piece of CSS; the comma splits the expression into two parts. A few basic CSS symbols are worth emphasizing, namely the comma, the space, >, + and ~:

div,p: the comma denotes grouping, that is, all div elements and all p elements;
div p: the space denotes descendants, that is, all p elements inside a div element;
div>p: > denotes children; unlike the space it reaches only one generation, that is, all p elements whose parent is a div;
div+p: + denotes the adjacent sibling, that is, all p elements whose immediately preceding sibling is a div;
div~p: ~ denotes general siblings, that is, all p elements that come after a sibling div.

In addition, selectors on elements such as a and input can take more specialized forms: a[target=_blank] selects all a elements whose target is _blank; a[title=search] selects all a elements whose title is search; input[type=text] selects all input elements of type text; and p:nth-child(2) selects all p elements that are the second child of their parent.

Sizzle supports all of these syntaxes. If we call this step lexical analysis, then its result is a structure like the following.

Passing div > div.seq h2 ~ p, #id p through tokenize(selector) returns an array, called groups inside the function. It has two elements, tokens0 and tokens1, one for each part of the selector. Each tokens is itself an array, and each of its elements is a token object.

The structure of a token object is as follows:

token: {
  value: matched, // the matched string
  type: type,     // the token type
  matches: match  // the regex result array with the matched value removed
}
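
To make the three fields concrete, here is a minimal sketch of how one token could be built from a regex match. The helper name makeToken and the simplified class pattern are assumptions made for illustration; Sizzle builds its tokens inline and uses a more permissive pattern.

```javascript
// Hypothetical helper for illustration; Sizzle builds tokens inline.
function makeToken(type, match) {
  // Copy the exec() result so the caller's array is left untouched.
  var parts = Array.prototype.slice.call(match);
  var matched = parts.shift(); // the full matched text
  return { value: matched, type: type, matches: parts };
}

// Simplified class pattern (assumption), not Sizzle's real CLASS regex.
var classToken = makeToken("CLASS", /^\.([\w-]+)/.exec(".seq h2"));
console.log(classToken.value, classToken.matches);
```

Here value holds the whole matched text, while matches keeps only the capture groups that remain after shift().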

Sizzle has the following types: ID, CLASS, TAG, ATTR, PSEUDO, CHILD, bool, and needsContext. I am not sure what some of them mean. CHILD covers selectors such as nth-child, including arguments like even and odd. These are the cases where matches exists. For tokens without matches (the combinators), the type is the combinator itself with the surrounding whitespace stripped (a single space for the descendant combinator), which is discussed later.

The tokenize function processes the whole selector and does not even skip spaces, because a space is also a type, and an important one. The result for div > div.seq h2 ~ p is:

tokens: [
  { value: "div", type: "TAG", matches: Array[1] },
  { value: " > ", type: ">" },
  { value: "div", type: "TAG", matches: Array[1] },
  { value: ".seq", type: "CLASS", matches: Array[1] },
  { value: " ", type: " " },
  { value: "h2", type: "TAG", matches: Array[1] },
  { value: " ~ ", type: "~" },
  { value: "p", type: "TAG", matches: Array[1] }
]
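
One way to check a token list like this is to concatenate the value fields again. The sketch below assumes the token values listed above (with matches left out) and uses a hypothetical helper named joinValues; because combinator tokens keep their surrounding whitespace in value, the original selector text comes back unchanged.

```javascript
// Token list copied from the result above; `matches` omitted for brevity.
var tokens = [
  { value: "div", type: "TAG" },
  { value: " > ", type: ">" },
  { value: "div", type: "TAG" },
  { value: ".seq", type: "CLASS" },
  { value: " ", type: " " },
  { value: "h2", type: "TAG" },
  { value: " ~ ", type: "~" },
  { value: "p", type: "TAG" }
];

// Hypothetical helper: concatenate the values back into a selector string.
function joinValues(list) {
  return list.map(function (token) { return token.value; }).join("");
}

console.log(joinValues(tokens));
```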

This array is handed over to Sizzle's next stage, which is not discussed today.

tokenize source

As usual, let's look at a few regular expressions first.

var rcomma = /^[\x20\t\r\n\f]*,[\x20\t\r\n\f]*/;
rcomma.exec('div > div.seq h2 ~ p'); // null
rcomma.exec(', #id p'); // [", "]

rcomma is mainly used to detect whether the selector has reached the next rule; if it has, a new token array is pushed to groups for that rule. In this regular expression, [\x20\t\r\n\f] matches whitespace characters, and the core of the pattern is the comma.
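
As a small sketch of that step, the helper below (skipComma is a hypothetical name) removes a leading group separator together with its surrounding whitespace and leaves any other input untouched.

```javascript
var rcomma = /^[\x20\t\r\n\f]*,[\x20\t\r\n\f]*/;

// Hypothetical helper: drop a leading group separator, if there is one.
function skipComma(soFar) {
  var match = rcomma.exec(soFar);
  return match ? soFar.slice(match[0].length) : soFar;
}

console.log(skipComma(" , #id p"));
console.log(skipComma("div > p"));
```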

var rcombinators = /^[\x20\t\r\n\f]*([>+~]|[\x20\t\r\n\f])[\x20\t\r\n\f]*/;
rcombinators.exec(' > div.seq h2 ~ p'); // [" > ", ">"]
rcombinators.exec(' ~ p'); // [" ~ ", "~"]
rcombinators.exec(' h2 ~ p'); // [" ", " "]

Once rcombinators is understood, the contents of the tokens array above make complete sense.
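
The following sketch shows how a combinator match could map onto a token: the full match, whitespace included, becomes value, and the captured symbol becomes type. The helper name combinatorToken is hypothetical, and mapping every whitespace character to a single space is this sketch's simplification of the idea.

```javascript
var rcombinators = /^[\x20\t\r\n\f]*([>+~]|[\x20\t\r\n\f])[\x20\t\r\n\f]*/;

// Hypothetical helper: build a combinator token from the head of the string.
function combinatorToken(soFar) {
  var match = rcombinators.exec(soFar);
  if (!match) {
    return null; // the string does not start with a combinator
  }
  return {
    value: match[0],
    // Any whitespace combinator (space, tab, ...) is reported as " ".
    type: /[>+~]/.test(match[1]) ? match[1] : " "
  };
}

console.log(combinatorToken(" > div.seq"));
console.log(combinatorToken(" h2 ~ p"));
console.log(combinatorToken("div"));
```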

In fact, if you look at the jQuery source, rcomma and rcombinators are not written as literals like this; they are defined as follows:

var whitespace = "[\\x20\\t\\r\\n\\f]";
var rcomma = new RegExp("^" + whitespace + "*," + whitespace + "*"),
  rcombinators = new RegExp("^" + whitespace + "*([>+~]|" + whitespace + ")" + whitespace + "*"),
  rtrim = new RegExp("^" + whitespace + "+|((?:^|[^\\\\])(?:\\\\.)*)" + whitespace + "+$", "g");
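
A quick sketch can confirm that the string-built pattern is the same as the literal form shown earlier, and show what rtrim removes. The variable names rcommaBuilt and rcommaLiteral are hypothetical, and the "$1" replacement is this sketch's choice: it puts back the character captured just before the trailing whitespace.

```javascript
var whitespace = "[\\x20\\t\\r\\n\\f]";

// The constructed pattern and the literal should have the same source text.
var rcommaBuilt = new RegExp("^" + whitespace + "*," + whitespace + "*");
var rcommaLiteral = /^[\x20\t\r\n\f]*,[\x20\t\r\n\f]*/;
console.log(rcommaBuilt.source === rcommaLiteral.source);

// rtrim strips leading whitespace and trailing whitespace; the capture group
// keeps the last non-whitespace character so "$1" can restore it.
var rtrim = new RegExp(
  "^" + whitespace + "+|((?:^|[^\\\\])(?:\\\\.)*)" + whitespace + "+$", "g"
);
console.log(JSON.stringify("  div > p \n".replace(rtrim, "$1")));
```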

Sometimes you have to admire the approach of jQuery's contributors: they avoid repetition wherever they can, and every piece of code is polished.

There are also two objects, Expr and matchExpr. Expr is a critical object that covers almost every possible case; its more important members include:

Expr.filter = {
  "TAG": function () {...},
  "CLASS": function () {...},
  "ATTR": function () {...},
  "CHILD": function () {...},
  "ID": function () {...},
  "PSEUDO": function () {...}
};
Expr.preFilter = {
  "ATTR": function () {...},
  "CHILD": function () {...},
  "PSEUDO": function () {...}
};

filter and preFilter are the key steps in handling each token type such as TAG, including selectors like input[type=text]. These functions are fairly complex, and I still find them confusing. There are also the matchExpr regular expressions:

var identifier = "(?:\\\\.|[\\w-]|[^\0-\\xa0])+",
  attributes = "\\[" + whitespace + "*(" + identifier + ")(?:" + whitespace +
    // Operator (capture 2)
    "*([*^$|!~]?=)" + whitespace +
    // "Attribute values must be CSS identifiers [capture 5] or strings [capture 3 or capture 4]"
    "*(?:'((?:\\\\.|[^\\\\'])*)'|\"((?:\\\\.|[^\\\\\"])*)\"|(" + identifier + "))|)" + whitespace +
    "*\\]",
  pseudos = ":(" + identifier + ")(?:\\((" +
    // To reduce the number of selectors needing tokenize in the preFilter, prefer arguments:
    // 1. quoted (capture 3; capture 4 or capture 5)
    "('((?:\\\\.|[^\\\\'])*)'|\"((?:\\\\.|[^\\\\\"])*)\")|" +
    // 2. simple (capture 6)
    "((?:\\\\.|[^\\\\()[\\]]|" + attributes + ")*)|" +
    // 3. anything else (capture 2)
    ".*" +
    ")\\)|)",
  booleans = "checked|selected|async|autofocus|autoplay|controls|defer|disabled|hidden|ismap|loop|multiple|open|readonly|required|scoped";

var matchExpr = {
  "ID": new RegExp("^#(" + identifier + ")"),
  "CLASS": new RegExp("^\\.(" + identifier + ")"),
  "TAG": new RegExp("^(" + identifier + "|[*])"),
  "ATTR": new RegExp("^" + attributes),
  "PSEUDO": new RegExp("^" + pseudos),
  "CHILD": new RegExp("^:(only|first|last|nth|nth-last)-(child|of-type)(?:\\(" + whitespace +
    "*(even|odd|(([+-]|)(\\d*)n|)" + whitespace + "*(?:([+-]|)" + whitespace +
    "*(\\d+)|))" + whitespace + "*\\)|)", "i"),
  "bool": new RegExp("^(?:" + booleans + ")$", "i"),
  // For use in libraries implementing .is()
  // We use this for POS matching in `select`
  "needsContext": new RegExp("^" + whitespace + "*[>+~]|:(even|odd|eq|gt|lt|nth|first|last)(?:\\(" +
    whitespace + "*((?:-\\d)?\\d*)" + whitespace + "*\\)|)(?=[^-]|$)", "i")
};

matchExpr is an object of regular expressions. Each of its keys is a type, and the type that matches determines which function handles the match later on.
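
The key-to-type idea can be sketched with a three-entry subset of matchExpr. The helper name classify and its return shape are assumptions for illustration; the identifier pattern is the one from the listing above.

```javascript
var identifier = "(?:\\\\.|[\\w-]|[^\0-\\xa0])+";

// Subset of matchExpr, using the same patterns as the listing above.
var matchExpr = {
  "ID": new RegExp("^#(" + identifier + ")"),
  "CLASS": new RegExp("^\\.(" + identifier + ")"),
  "TAG": new RegExp("^(" + identifier + "|[*])")
};

// Hypothetical helper: report which type matches the head of the string.
function classify(soFar) {
  var type, match;
  for (type in matchExpr) {
    if ((match = matchExpr[type].exec(soFar))) {
      return { type: type, value: match[0], captured: match[1] };
    }
  }
  return null; // nothing in this subset matches
}

console.log(classify("#id p"));
console.log(classify(".seq h2"));
console.log(classify("div > p"));
```

The key under which a pattern is stored is exactly the string that ends up in the token's type field.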

The tokenize source code is as follows:

var tokenize = function (selector, parseOnly) {
  var matched, match, tokens, type,
    soFar, groups, preFilters,
    cached = tokenCache[selector + " "];

  // tokenCache is the token cache; it keeps tokens that were already processed
  if (cached) {
    return parseOnly ? 0 : cached.slice(0);
  }

  soFar = selector;
  groups = [];
  preFilters = Expr.preFilter;

  while (soFar) {
    // Determine whether a group ends here (or this is the first pass)
    if (!matched || (match = rcomma.exec(soFar))) {
      if (match) {
        // Remove the matched comma from the string
        soFar = soFar.slice(match[0].length) || soFar;
      }
      groups.push((tokens = []));
    }

    matched = false;

    // Combinators: rcombinators
    if ((match = rcombinators.exec(soFar))) {
      matched = match.shift();
      tokens.push({
        value: matched,
        type: match[0].replace(rtrim, " ")
      });
      soFar = soFar.slice(matched.length);
    }

    // Filters; Expr.filter and matchExpr were introduced above
    for (type in Expr.filter) {
      if ((match = matchExpr[type].exec(soFar)) && (!preFilters[type] ||
        (match = preFilters[type](match)))) {
        matched = match.shift();
        // match is now the array that remains after shift()
        tokens.push({
          value: matched,
          type: type,
          matches: match
        });
        soFar = soFar.slice(matched.length);
      }
    }

    if (!matched) {
      break;
    }
  }

  // The parseOnly parameter is used later
  return parseOnly ?
    soFar.length :
    soFar ?
      Sizzle.error(selector) :
      // Cache the tokens
      tokenCache(selector, groups).slice(0);
};
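
To see the loop run outside of jQuery, here is a cut-down sketch of the same control flow. It is not Sizzle's implementation: simpleTokenize and simpleExpr are hypothetical names, only the TAG, CLASS and ID patterns are included, there is no preFilter step, no cache and no parseOnly mode, and any leftover input is reported with a plain Error.

```javascript
var whitespace = "[\\x20\\t\\r\\n\\f]";
var identifier = "(?:\\\\.|[\\w-]|[^\0-\\xa0])+";
var rcomma = new RegExp("^" + whitespace + "*," + whitespace + "*");
var rcombinators = new RegExp(
  "^" + whitespace + "*([>+~]|" + whitespace + ")" + whitespace + "*"
);
// Only three token types are covered here (assumption for brevity).
var simpleExpr = {
  "TAG": new RegExp("^(" + identifier + "|[*])"),
  "CLASS": new RegExp("^\\.(" + identifier + ")"),
  "ID": new RegExp("^#(" + identifier + ")")
};

function simpleTokenize(selector) {
  var soFar = selector, groups = [], tokens, matched, match, type;

  while (soFar) {
    // Start a new group on the first pass or after a comma.
    if (!matched || (match = rcomma.exec(soFar))) {
      if (match) {
        soFar = soFar.slice(match[0].length) || soFar;
      }
      groups.push((tokens = []));
    }
    matched = false;

    // Combinator: keep the raw text as value, the bare symbol as type.
    if ((match = rcombinators.exec(soFar))) {
      matched = match.shift();
      tokens.push({
        value: matched,
        type: /[>+~]/.test(match[0]) ? match[0] : " "
      });
      soFar = soFar.slice(matched.length);
    }

    // Simple selectors: try each pattern against the remaining string.
    for (type in simpleExpr) {
      if ((match = simpleExpr[type].exec(soFar))) {
        matched = match.shift();
        tokens.push({ value: matched, type: type, matches: match });
        soFar = soFar.slice(matched.length);
      }
    }

    // Nothing matched in this pass: stop and report the leftover below.
    if (!matched) {
      break;
    }
  }

  if (soFar) {
    throw new Error("Unsupported selector: " + selector);
  }
  return groups;
}

var groups = simpleTokenize("div > div.seq h2 ~ p, #id p");
console.log(groups.length, groups[0].length, groups[1].length);
```

Each pass of the while loop consumes at most one combinator followed by a run of simple selectors, so soFar shrinks until it is empty or nothing matches.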

Strings, not only arrays, have a slice method, and judging from the source, jQuery uses slice whenever it needs to cut a piece off a string.
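
A short sketch of both uses of slice that appear in the source: on a string it drops the part that has already been consumed, and on an array slice(0) produces a shallow copy. The variable names here are hypothetical.

```javascript
// String: drop the prefix that was just matched.
var rest = "div > p".slice("div".length);
console.log(JSON.stringify(rest));

// Array: slice(0) copies the outer array, so changing the copy's length
// does not affect the original. The inner arrays are still shared.
var cachedGroups = [["a"], ["b"]];
var copy = cachedGroups.slice(0);
copy.pop();
console.log(cachedGroups.length, copy.length);
```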

When parseOnly is not set, the return value comes from the tokenCache function:

var tokenCache = createCache();
function createCache() {
  var keys = [];

  function cache(key, value) {
    // Expr.cacheLength is the maximum number of cached entries
    if (keys.push(key + " ") > Expr.cacheLength) {
      // Over the limit: delete the oldest entry
      delete cache[keys.shift()];
    }
    // Store the value and return it
    return (cache[key + " "] = value);
  }
  return cache;
}
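
The eviction behaviour is easier to see with a tiny limit. This sketch copies the same structure but takes the limit as a parameter; createSmallCache and maxEntries are hypothetical names standing in for createCache and Expr.cacheLength.

```javascript
// maxEntries is a hypothetical parameter standing in for Expr.cacheLength.
function createSmallCache(maxEntries) {
  var keys = [];

  function cache(key, value) {
    // Array.prototype.push returns the new length of the array.
    if (keys.push(key + " ") > maxEntries) {
      // Evict the entry that was added first.
      delete cache[keys.shift()];
    }
    // Entries are stored as properties on the function object itself.
    return (cache[key + " "] = value);
  }
  return cache;
}

var demoCache = createSmallCache(2);
demoCache("a", 1);
demoCache("b", 2);
demoCache("c", 3); // third entry pushes the first one out
console.log(demoCache["a "], demoCache["b "], demoCache["c "]);
```

Entries leave in the order they were added, regardless of how often they were read, so this is a first-in, first-out cache rather than a least-frequently-used one.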

The returned result is groups, and with that tokenize is finished. The next chapter covers what happens after tokenize.

Summary

For a complex selector, the tokenize process is far more involved than what was covered today. Today's example is on the simple side (the real details are more complicated), and the later material is more interesting.

Reference

JQuery 2.0.3 Source Analysis Sizzle engine-lexical parsing

CSS Selector Reference Manual
