The following analysis is based on jQuery-1.10.2.js.
The following takes $("div:not(.class:contain('span')):eq(3)") as an example to explain how the tokenize and preFilter code work together to parse a selector. For a line-by-line explanation of the tokenize method and the preFilter class, see the following two articles:
http://www.jb51.net/article/63155.htm
http://www.jb51.net/article/63163.htm
Below is the source code of the tokenize method. For simplicity, I have removed all code related to caching, comma matching and combinator (relational character) matching, leaving only the core code relevant to the current example. The removed code is very simple; if necessary, refer to the articles above.
Note that each piece of code is placed above the text that describes it.
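function tokenize(selector, parseOnly) {
    // Declarations and cache lookup omitted; initially soFar = selector,
    // groups = [] and preFilters = Expr.preFilter
    while (soFar) {
        // First run: start a new token group
        if (!matched) {
            groups.push(tokens = []);
        }
        matched = false;
        // Try each filter type in turn against the remaining selector
        for (type in Expr.filter) {
            if ((match = matchExpr[type].exec(soFar))
                    && (!preFilters[type] || (match = preFilters[type](match)))) {
                matched = match.shift();
                tokens.push({
                    value: matched,
                    type: type,
                    matches: match
                });
                soFar = soFar.slice(matched.length);
            }
        }
        if (!matched) {
            break;
        }
    }
    // When only parsing, return the length of the unparsed remainder
    return parseOnly ? soFar.length : soFar ? Sizzle.error(selector) :
        tokenCache(selector, groups).slice(0);
}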
soFar = "div:not(.class:contain('span')):eq(3)"
On the first pass through the while loop, matched has not yet been assigned a value, so the body of the if statement is executed. It initializes the tokens variable and pushes tokens into the groups array. Execution then enters the for statement.
First iteration of the for loop: the first key of Expr.filter, "TAG", is assigned to the type variable, and the loop body is executed.
The result of match = matchExpr[type].exec(soFar) is as follows:
match = ["div", "div"]
The first selector in the example is div, which matches the regular expression matchExpr["TAG"], and preFilters["TAG"] does not exist, so the body of the if statement is executed.
The first element, "div", is removed from match and assigned to the matched variable. Now matched = "div" and match = ["div"]
A new object { value: "div", type: "TAG", matches: ["div"] } is created and pushed into the tokens array.
"div" is removed from the front of soFar. Now soFar = ":not(.class:contain('span')):eq(3)"
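The three steps just described (exec, shift, slice) can be tried in isolation. The following is only a minimal sketch: tagPattern is a simplified stand-in assumed for illustration, not the actual matchExpr["TAG"] regular expression used by Sizzle.

```javascript
// Simplified stand-in for matchExpr["TAG"] (assumption: the real pattern
// also accepts escaped and non-ASCII characters).
const tagPattern = /^([\w-]+|\*)/;

let soFar = "div:not(.class:contain('span')):eq(3)";
const tokens = [];

const match = tagPattern.exec(soFar);   // ["div", "div"]
const matched = match.shift();          // removes and returns match[0]
tokens.push({ value: matched, type: "TAG", matches: match });
soFar = soFar.slice(matched.length);    // drop the consumed prefix

console.log(matched); // "div"
console.log(soFar);   // ":not(.class:contain('span')):eq(3)"
```

Because shift() removes the full match from the array, what remains in matches is only the captured groups, which is what the filter functions need later.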
Second iteration: the second key of Expr.filter, "CLASS", is assigned to the type variable, and the loop body is executed.
Since the current soFar = ":not(.class:contain('span')):eq(3)" does not match the CLASS regular expression, this iteration ends.
Third iteration: the third key of Expr.filter, "ATTR", is assigned to the type variable, and the loop body is executed.
Similarly, since the remaining selector does not start with an attribute selector, this iteration ends.
Fourth iteration: the fourth key of Expr.filter, "CHILD", is assigned to the type variable, and the loop body is executed.
Similarly, since the remaining selector does not start with a CHILD selector, this iteration ends.
Fifth iteration: the fifth key of Expr.filter, "PSEUDO", is assigned to the type variable, and the loop body is executed.
The result of match = matchExpr[type].exec(soFar) is as follows:
[":not(.class:contain('span')):eq(3)", "not", ".class:contain('span')):eq(3", undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined]
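To see why the second captured group runs all the way to ":eq(3", the match can be reproduced with a reduced pattern. This is a sketch under an assumption: pseudoPattern below is a simplified stand-in for matchExpr["PSEUDO"] with the attribute sub-pattern left out, so it has 5 capture groups rather than the 10 shown above.

```javascript
// Simplified stand-in for matchExpr["PSEUDO"] (assumption: the attribute
// sub-pattern is dropped, so there are 5 capture groups instead of 10).
const pseudoPattern =
  /^:([\w-]+)(?:\(((['"])((?:\\.|[^\\])*?)\3|((?:\\.|[^\\()[\]])*)|.*)\)|)/;

const pseudoMatch = pseudoPattern.exec(":not(.class:contain('span')):eq(3)");

// Neither the quoted alternative (groups 3 and 4) nor the strict unquoted
// alternative (group 5) can match here, so the greedy ".*" alternative
// runs up to the last ")" in the string.
console.log(pseudoMatch[1]); // "not"
console.log(pseudoMatch[2]); // ".class:contain('span')):eq(3"
console.log(pseudoMatch[3], pseudoMatch[4], pseudoMatch[5]); // all three are undefined
```

The over-long argument in group 2 is exactly what preFilters["PSEUDO"] has to trim afterwards.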
Since preFilters["PSEUDO"] exists, match = preFilters[type](match) is executed.
The preFilters["PSEUDO"] code is as follows:
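"PSEUDO": function (match) {
    var excess,
        unquoted = !match[5] && match[2];
    // Child pseudos such as :nth-child are handled by the CHILD type
    if (matchExpr["CHILD"].test(match[0])) {
        return null;
    }
    // Accept quoted arguments as-is
    if (match[3] && match[4] !== undefined) {
        match[2] = match[4];
    // Strip excess characters from unquoted arguments
    } else if (unquoted
            && rpseudo.test(unquoted)
            // Get the length of the excess by tokenizing recursively
            && (excess = tokenize(unquoted, true))
            // Advance to the next closing parenthesis
            && (excess = unquoted.indexOf(")", unquoted.length - excess)
                - unquoted.length)) {
        // excess is now a negative index
        match[0] = match[0].slice(0, excess);
        match[2] = unquoted.slice(0, excess);
    }
    // Return only the captures needed by the pseudo filter (type and argument)
    return match.slice(0, 3);
}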
The match parameter passed in is the array shown above. Since match[5] is undefined, unquoted takes the value of match[2]:
unquoted = ".class:contain('span')):eq(3"
match[0] = ":not(.class:contain('span')):eq(3)" does not match the matchExpr["CHILD"] regular expression, so the return null statement is not executed.
Since match[3] and match[4] are both undefined, the else if condition is evaluated.
At this point unquoted = ".class:contain('span')):eq(3" is truthy, and because unquoted contains :contain('span'), it matches the regular expression rpseudo, so rpseudo.test(unquoted) is true. tokenize is then called again to parse unquoted, as follows:
In this call to tokenize, the selector parameter is ".class:contain('span')):eq(3" and parseOnly is true. The function body executes as follows:
soFar = ".class:contain('span')):eq(3"
On the first pass through the while loop, matched has not yet been assigned a value, so the body of the if statement is executed. It initializes the tokens variable and pushes tokens into the groups array. Execution then enters the for statement.
First iteration of the for loop: the first key of Expr.filter, "TAG", is assigned to the type variable, and the loop body is executed.
Since the remaining selector does not start with a TAG selector, this iteration ends.
Second iteration: the second key of Expr.filter, "CLASS", is assigned to the type variable, and the loop body is executed.
The execution result of match = matchExpr[type].exec(soFar) is as follows:
match = [".class", "class"]
Since preFilters["CLASS"] does not exist, the body of the if statement is executed.
The first element, ".class", is removed from match and assigned to the matched variable. Now matched = ".class" and match = ["class"]
A new object { value: ".class", type: "CLASS", matches: ["class"] } is created and pushed into the tokens array.
".class" is removed from the front of soFar. Now soFar = ":contain('span')):eq(3"
Third iteration: the third key of Expr.filter, "ATTR", is assigned to the type variable, and the loop body is executed.
Similarly, since the remaining selector does not start with an attribute selector, this iteration ends.
Fourth iteration: the fourth key of Expr.filter, "CHILD", is assigned to the type variable, and the loop body is executed.
Similarly, since the remaining selector does not start with a CHILD selector, this iteration ends.
Fifth iteration: the fifth key of Expr.filter, "PSEUDO", is assigned to the type variable, and the loop body is executed.
The execution result of match = matchExpr[type].exec(soFar) is as follows:
[":contain('span')", "contain", "'span'", "'", "span", undefined, undefined, undefined, undefined, undefined, undefined]
Since preFilters["PSEUDO"] exists, match = preFilters[type](match) is executed. The preFilters["PSEUDO"] code is shown above and is not repeated here.
Because ":contain('span')" does not match the matchExpr["CHILD"] regular expression, the return null statement is not executed.
Since match[3] = "'" and match[4] = "span", the body of the first if statement is executed and "span" is assigned to match[2].
A copy of the first three elements of match is returned.
Execution now returns to the for loop of the tokenize method. At this point the variables have the following values:
match = [":contain('span')", "contain", "span"]
soFar = ":contain('span')):eq(3"
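The quoted-argument branch can be reproduced on its own. As a sketch only, the pattern below is a simplified stand-in assumed for matchExpr["PSEUDO"] (attribute sub-pattern omitted), and the variable names are hypothetical; the two statements after exec mirror the if branch and the final return of preFilters["PSEUDO"].

```javascript
// Simplified stand-in for matchExpr["PSEUDO"] (assumption, see lead-in).
const quotedPattern =
  /^:([\w-]+)(?:\(((['"])((?:\\.|[^\\])*?)\3|((?:\\.|[^\\()[\]])*)|.*)\)|)/;

const raw = quotedPattern.exec(":contain('span')):eq(3");
// raw[2] is "'span'" (with quotes), raw[3] is "'", raw[4] is "span"

// Quoted argument: replace the quoted form with its unquoted content.
if (raw[3] && raw[4] !== undefined) {
  raw[2] = raw[4];
}

// Keep only the full match, the pseudo name and its argument.
const filtered = raw.slice(0, 3);
console.log(filtered); // [ ":contain('span')", "contain", "span" ]
```

Note that the match stops at the first closing parenthesis after the quoted string, which is why the trailing "):eq(3" is left in soFar.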
":contain('span')" is removed from the match array and assigned to the matched variable, leaving match = ["contain", "span"]
A new object { value: ":contain('span')", type: "PSEUDO", matches: ["contain", "span"] } is created and pushed into the tokens array.
":contain('span')" is removed from the front of soFar. Now soFar = "):eq(3". The for loop then ends, and on the next pass through the while loop no selector type matches the remaining string, so the while loop exits.
Since parseOnly = true, the length of soFar, 6, is returned, and execution of the preFilters["PSEUDO"] code continues.
6 is assigned to the excess variable, and then the code excess = unquoted.indexOf(")", unquoted.length - excess) - unquoted.length is executed.
The indexOf call finds the end position of the :not selector (the position of its closing parenthesis), 22. Subtracting unquoted.length, 28, gives excess = -6.
The complete :not selector string (match[0]) and the string inside its parentheses (match[2]) are then computed:
match[0] = ":not(.class:contain('span'))"
match[2] = ".class:contain('span')"
Returns a copy of the first three elements in match.
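The index arithmetic above is easy to check by hand. The following sketch repeats it with plain strings; the names whole, closing, newMatch0 and newMatch2 are hypothetical and only stand in for match[0], the indexOf result, and the updated match[0] and match[2].

```javascript
const unquoted = ".class:contain('span')):eq(3";
const whole = ":not(" + unquoted + ")";   // plays the role of match[0]

let excess = 6;                            // length of the unparsed "):eq(3"

// Search for ")" starting where the unparsed remainder begins.
const closing = unquoted.indexOf(")", unquoted.length - excess); // 22

// Turn the position into a negative index counted from the end.
excess = closing - unquoted.length;        // 22 - 28 = -6

const newMatch0 = whole.slice(0, excess);
const newMatch2 = unquoted.slice(0, excess);

console.log(newMatch0); // ":not(.class:contain('span'))"
console.log(newMatch2); // ".class:contain('span')"
```

Using a negative index is convenient here because the same value trims both strings, even though whole is longer than unquoted.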
Back in the tokenize function, match = [":not(.class:contain('span'))", "not", ".class:contain('span')"]
The first element, ":not(.class:contain('span'))", is removed from match and assigned to the matched variable. Now matched = ":not(.class:contain('span'))" and
match = ["not", ".class:contain('span')"]
A new object { value: ":not(.class:contain('span'))", type: "PSEUDO", matches: ["not", ".class:contain('span')"] } is created and pushed into the tokens array. tokens now has two elements: the div selector and the not selector.
":not(.class:contain('span'))" is removed from the front of soFar. Now soFar = ":eq(3)". After this for loop ends, execution returns to the while loop and, in the same way, obtains the third element of tokens, the eq selector. The process is the same as for not and is not repeated here. The final contents of groups are as follows:
groups[0][0] = { value: "div", type: "TAG", matches: ["div"] }
groups[0][1] = { value: ":not(.class:contain('span'))", type: "PSEUDO", matches: ["not", ".class:contain('span')"] }
groups[0][2] = { value: ":eq(3)", type: "PSEUDO", matches: ["eq", "3"] }
Since parseOnly = undefined, tokenCache(selector, groups).slice(0) is executed. This statement stores groups in the cache and returns a copy of it.
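The whole walkthrough can be reproduced with a small self-contained script. This is only a sketch under several assumptions, not the Sizzle implementation: the three regular expressions are simplified stand-ins (no ATTR or CHILD types, no escapes or non-ASCII names), the loop iterates over matchExpr instead of Expr.filter, caching is left out, and a plain Error replaces Sizzle.error.

```javascript
// Simplified, un-anchored pseudo pattern (assumption: attribute part omitted).
const rpseudo =
  /:([\w-]+)(?:\(((['"])((?:\\.|[^\\])*?)\3|((?:\\.|[^\\()[\]])*)|.*)\)|)/;

// Simplified stand-ins for Sizzle's matchExpr, in the same relative order.
const matchExpr = {
  TAG: /^([\w-]+|\*)/,
  CLASS: /^\.([\w-]+)/,
  PSEUDO: new RegExp("^" + rpseudo.source)
};

const preFilters = {
  PSEUDO(match) {
    let excess;
    const unquoted = !match[5] && match[2];
    if (match[3] && match[4] !== undefined) {
      // Quoted argument: keep the content without the quotes.
      match[2] = match[4];
    } else if (unquoted && rpseudo.test(unquoted) &&
        // Length of the part the recursive parse could not consume.
        (excess = tokenize(unquoted, true)) &&
        // Negative index of the closing parenthesis that ends this pseudo.
        (excess = unquoted.indexOf(")", unquoted.length - excess) - unquoted.length)) {
      match[0] = match[0].slice(0, excess);
      match[2] = unquoted.slice(0, excess);
    }
    return match.slice(0, 3);
  }
};

function tokenize(selector, parseOnly) {
  let soFar = selector, matched, match, tokens;
  const groups = [];
  while (soFar) {
    if (!matched) {
      groups.push(tokens = []);
    }
    matched = false;
    for (const type in matchExpr) {
      if ((match = matchExpr[type].exec(soFar)) &&
          (!preFilters[type] || (match = preFilters[type](match)))) {
        matched = match.shift();
        tokens.push({ value: matched, type: type, matches: match });
        soFar = soFar.slice(matched.length);
      }
    }
    if (!matched) {
      break;
    }
  }
  if (parseOnly) {
    return soFar.length;   // length of the unparsed remainder
  }
  if (soFar) {
    throw new Error("Syntax error, unrecognized expression: " + selector);
  }
  return groups;
}

const parsed = tokenize("div:not(.class:contain('span')):eq(3)");
console.log(parsed[0].map(function (t) { return t.value; }));
// [ "div", ":not(.class:contain('span'))", ":eq(3)" ]
console.log(tokenize(".class:contain('span')):eq(3", true)); // 6
```

Running it yields the same three tokens as listed above, and the parseOnly call returns the excess length 6 that preFilters["PSEUDO"] uses to trim the :not argument.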
This completes the parsing. One might object that the second token, the :not selector, has not itself been broken down. That is true: its argument has to be parsed again when the selector is actually executed. Of course, if the result for the valid part of ".class:contain('span')):eq(3" had been saved in the cache when it was parsed a moment ago, the repeated parse could be avoided and execution would be faster. But this would only speed up the first run, because when ".class:contain('span')" is submitted for parsing again during execution, its result is stored in the cache anyway.