I read horkychen's article "[WebKit] JavaScriptCore Analysis--Basics (3): From Script Code to JIT-Compiled Code" [2]. It is very well written and deeply inspiring. I would like to supplement it with some details, such as how bytecode is generated.
JSC's handling of JavaScript is in fact similar in many places to WebKit's handling of CSS. It consists of the following stages:
(1) Lexical analysis: split the source text into tokens;
(2) Syntax analysis: build the abstract syntax tree (AST: Abstract Syntax Tree);
(3) Traverse the abstract syntax tree to generate bytecode;
(4) Execute the bytecode with the interpreter (LLInt: Low Level Interpreter);
(5) If performance is not good enough, use the Baseline JIT to compile the bytecode into machine code, then execute that machine code;
(6) If performance is still not good enough, use the DFG JIT to recompile the bytecode into better machine code, then execute that machine code;
(7) Finally, if that is still not good enough, use LLVM (Low Level Virtual Machine) to compile the DFG's intermediate code into more highly optimized machine code and execute it.
I will describe this process in a series of articles.
Of these, steps 1 and 2 are similar to the CSS case, and the CSS JIT takes an approach similar to steps 3, 4 and 5; see [1]. I want to write about JSC as a beginner, in the persevering spirit of Yugong Yishan (the old man who moved the mountains), and uncover the tip of the JSC iceberg.
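To make the tiering in steps 4 through 7 more concrete, here is a minimal sketch of how a function could be promoted from one tier to the next. It is only an illustration: the idea of counting executions, the names `Tier`, `tierUpThreshold` and `tierFor`, and every threshold value are assumptions made for this example, not JSC's actual heuristics.

```cpp
#include <cstdint>

// Execution tiers named in the list above, lowest to highest.
enum class Tier { LLInt, BaselineJIT, DFGJIT, LLVM };

// HYPOTHETICAL thresholds, for illustration only; JSC's real heuristics differ.
inline std::uint32_t tierUpThreshold(Tier tier)
{
    switch (tier) {
    case Tier::LLInt:       return 100;        // assumed value
    case Tier::BaselineJIT: return 1000;       // assumed value
    case Tier::DFGJIT:      return 100000;     // assumed value
    case Tier::LLVM:        return UINT32_MAX; // top tier: nothing above it
    }
    return UINT32_MAX;
}

// Returns the tier a function would run in after `executionCount` calls:
// promote one step each time the count reaches the current tier's threshold.
inline Tier tierFor(std::uint32_t executionCount)
{
    Tier tier = Tier::LLInt;
    while (tier != Tier::LLVM && executionCount >= tierUpThreshold(tier))
        tier = static_cast<Tier>(static_cast<int>(tier) + 1);
    return tier;
}
```

With these assumed thresholds, `tierFor(0)` stays in the interpreter, `tierFor(100)` reaches the Baseline JIT, and only very hot code reaches the last tier. The point is that cheap tiers run first and the expensive compilers are reserved for code that keeps running.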
This article mainly describes the details of lexical and syntactic parsing.
I. JavaScriptCore lexical analyzer workflow analysis
This section explains the workflow of lexical and syntactic analysis.
The tokenizer works by repeatedly finding the next token in a string. For example, when it finds the consecutive characters "true", it creates a TokTrue token. The lexer works as follows:
JavaScriptCore/runtime/LiteralParser.cpp:
template <typename CharType>
template <ParserMode mode> TokenType LiteralParser<CharType>::Lexer::lex(LiteralParserToken<CharType>& token)
{
    // Skip leading whitespace.
    while (m_ptr < m_end && isJSONWhiteSpace(*m_ptr))
        ++m_ptr;

    // End of input.
    if (m_ptr >= m_end) {
        token.type = TokEOF;
        token.start = token.end = m_ptr;
        return TokEOF;
    }

    // Assume an error until a token is recognized.
    token.type = TokError;
    token.start = m_ptr;
    switch (*m_ptr) {
    case '[':
        token.type = TokLBracket;
        token.end = ++m_ptr;
        return TokLBracket;
    case ']':
        token.type = TokRBracket;
        token.end = ++m_ptr;
        return TokRBracket;
    case '(':
        token.type = TokLParen;
        token.end = ++m_ptr;
        return TokLParen;
    case ')':
        token.type = TokRParen;
        token.end = ++m_ptr;
        return TokRParen;
    case ',':
        token.type = TokComma;
        token.end = ++m_ptr;
        return TokComma;
    case ':':
        token.type = TokColon;
        token.end = ++m_ptr;
        return TokColon;
    case '"':
        return lexString<mode, '"'>(token);
    case 't':
        // The consecutive characters "true" become a single TokTrue token.
        if (m_end - m_ptr >= 4 && m_ptr[1] == 'r' && m_ptr[2] == 'u' && m_ptr[3] == 'e') {
            m_ptr += 4;
            token.type = TokTrue;
            token.end = m_ptr;
            return TokTrue;
        }
        break;
    case '-':
    case '0':
    ...
    case '9':
        return lexNumber(token);
    }

    if (m_ptr < m_end) {
        if (*m_ptr == '.') {
            token.type = TokDot;
            token.end = ++m_ptr;
            return TokDot;
        }
        if (*m_ptr == '=') {
            token.type = TokAssign;
            token.end = ++m_ptr;
            return TokAssign;
        }
        if (*m_ptr == ';') {
            token.type = TokSemi;
            token.end = ++m_ptr;
            return TokSemi;
        }
        if (isASCIIAlpha(*m_ptr) || *m_ptr == '_' || *m_ptr == '$')
            return lexIdentifier(token);
        if (*m_ptr == '\'') {
            return lexString<mode, '\''>(token);
        }
    }
    m_lexErrorMessage = String::format("Unrecognized token '%c'", *m_ptr).impl();
    return TokError;
}
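The same dispatch-on-first-character idea can be shown in a much smaller, self-contained form. The sketch below is not JSC code: `TokKind`, `Token`, `lexOne` and `tokenize` are hypothetical names, it works on a `std::string` rather than on JSC's character pointers, and it recognizes only a handful of tokens.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds, loosely mirroring the TokXxx names in the listing.
enum class TokKind { LBracket, RBracket, Comma, True, Number, Identifier, Error, End };

struct Token {
    TokKind kind;
    std::string text; // the characters this token covers
};

// Scans one token starting at `pos` and advances `pos` past it.
inline Token lexOne(const std::string& src, std::size_t& pos)
{
    while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos])))
        ++pos;
    if (pos >= src.size())
        return { TokKind::End, "" };

    const std::size_t start = pos;
    const char c = src[pos];
    switch (c) {
    case '[': ++pos; return { TokKind::LBracket, "[" };
    case ']': ++pos; return { TokKind::RBracket, "]" };
    case ',': ++pos; return { TokKind::Comma, "," };
    case 't':
        // As in the listing: the consecutive characters "true" form one token.
        if (src.compare(pos, 4, "true") == 0) {
            pos += 4;
            return { TokKind::True, "true" };
        }
        break; // not "true": fall through to the identifier check below
    default:
        break;
    }
    if (std::isdigit(static_cast<unsigned char>(c))) {
        while (pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos])))
            ++pos;
        return { TokKind::Number, src.substr(start, pos - start) };
    }
    if (std::isalpha(static_cast<unsigned char>(c)) || c == '_' || c == '$') {
        while (pos < src.size()
               && (std::isalnum(static_cast<unsigned char>(src[pos])) || src[pos] == '_' || src[pos] == '$'))
            ++pos;
        return { TokKind::Identifier, src.substr(start, pos - start) };
    }
    ++pos; // skip the unrecognized character so callers always make progress
    return { TokKind::Error, src.substr(start, 1) };
}

// Calls lexOne until the end of input and collects the tokens.
inline std::vector<Token> tokenize(const std::string& src)
{
    std::vector<Token> tokens;
    std::size_t pos = 0;
    for (;;) {
        Token t = lexOne(src, pos);
        if (t.kind == TokKind::End)
            break;
        tokens.push_back(t);
    }
    return tokens;
}
```

For example, `tokenize("[true, 42, foo]")` yields seven tokens: the two brackets, two commas, one `True`, one `Number` and one `Identifier`. Like the listing, the sketch tries the fixed single-character cases first and falls back to the identifier and error paths only when the `switch` did not return.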
Through this process, a complete stream of JSC tokens is generated. Next, the parser analyzes the syntax and generates an abstract syntax tree.
JavaScriptCore/parser/Parser.cpp:
PassRefPtr<ParsedNode> Parser<LexerType>::parse(JSGlobalObject* lexicalGlobalObject, Debugger* debugger, ExecState* debuggerExecState, JSObject** exception)
{
    ASSERT(lexicalGlobalObject);
    ASSERT(exception && !*exception);
    int errLine;
    UString errMsg;

    if (ParsedNode::scopeIsFunction)
        m_lexer->setIsReparsing();

    m_sourceElements = 0;

    errLine = -1;
    errMsg = UString();

    UString parseError = parseInner();
    ...
}

UString Parser<LexerType>::parseInner()
{
    UString parseError = UString();

    unsigned oldFunctionCacheSize = m_functionCache ? m_functionCache->byteSize() : 0;
    // Abstract syntax tree builder:
    ASTBuilder context(const_cast<JSGlobalData*>(m_globalData), const_cast<SourceCode*>(m_source));
    if (m_lexer->isReparsing())
        m_statementDepth--;
    ScopeRef scope = currentScope();
    // Start parsing and build the nodes of the syntax tree:
    SourceElements* sourceElements = parseSourceElements<CheckForStrictMode>(context);
    if (!sourceElements || !consume(EOFTOK))
        ...
}
For example, when JSC determines from the token type that the input is a constant declaration, it generates the syntax node with the following template function and places it in the ASTBuilder:
JavaScriptCore/parser/Parser.cpp:
template <typename LexerType>
template <class TreeBuilder> TreeConstDeclList Parser<LexerType>::parseConstDeclarationList(TreeBuilder& context)
{
    failIfTrue(strictMode());
    TreeConstDeclList constDecls = 0;
    TreeConstDeclList tail = 0;
    do {
        next();
        matchOrFail(IDENT);
        const Identifier* name = m_token.m_data.ident;
        next();
        bool hasInitializer = match(EQUAL);
        declareVariable(name);
        context.addVar(name, DeclarationStacks::IsConstant | (hasInitializer ? DeclarationStacks::HasInitializer : 0));
        TreeExpression initializer = 0;
        if (hasInitializer) {
            next(TreeBuilder::DontBuildStrings); // consume '='
            initializer = parseAssignmentExpression(context);
        }
        tail = context.appendConstDecl(m_lexer->lastLineNumber(), tail, name, initializer);
        if (!constDecls)
            constDecls = tail;
    } while (match(COMMA));
    return constDecls;
}
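The two ideas in this function, a parser templated on a tree builder and a list grown by appending behind a `tail` pointer, can be tried out in isolation. The following is a simplified sketch under stated assumptions: `Tok`, `Token`, `ConstDecl` and `ListBuilder` are hypothetical stand-ins, initializers are restricted to a single number token, and the input is a ready-made token vector that starts after the `const` keyword.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical token model; JSC's real tokens carry far more information.
enum class Tok { Ident, Equal, Comma, Number, End };
struct Token { Tok type; std::string text; };

// One `name [= initializer]` entry, chained into a singly linked list.
struct ConstDecl {
    std::string name;
    std::string initializer; // empty when there is no initializer
    std::unique_ptr<ConstDecl> next;
};

// Stand-in for ASTBuilder: owns the list head and appends behind `tail`.
struct ListBuilder {
    std::unique_ptr<ConstDecl> head;

    ConstDecl* appendConstDecl(ConstDecl* tail, const std::string& name, const std::string& init)
    {
        std::unique_ptr<ConstDecl> node(new ConstDecl{ name, init, nullptr });
        ConstDecl* raw = node.get();
        if (tail)
            tail->next = std::move(node);
        else
            head = std::move(node);
        return raw; // the new node becomes the caller's tail
    }
};

// Parses IDENT ['=' NUMBER] {',' IDENT ['=' NUMBER]}.
// Returns false on a syntax error, in the spirit of the fail macros in the listing.
template <class TreeBuilder>
bool parseConstDeclarationList(const std::vector<Token>& tokens, TreeBuilder& context)
{
    std::size_t i = 0;
    ConstDecl* tail = nullptr;
    for (;;) {
        if (i >= tokens.size() || tokens[i].type != Tok::Ident)
            return false; // corresponds to matchOrFail(IDENT)
        const std::string name = tokens[i++].text;
        std::string initializer;
        if (i < tokens.size() && tokens[i].type == Tok::Equal) {
            ++i; // consume '='
            if (i >= tokens.size() || tokens[i].type != Tok::Number)
                return false;
            initializer = tokens[i++].text;
        }
        tail = context.appendConstDecl(tail, name, initializer);
        if (i < tokens.size() && tokens[i].type == Tok::Comma) {
            ++i; // consume ',' and parse the next declarator
            continue;
        }
        break;
    }
    return i == tokens.size() || tokens[i].type == Tok::End;
}
```

Feeding it the tokens for `a = 1, b` produces a two-node list with `a` initialized to `1` and `b` uninitialized. Because the parser only talks to `context` through `appendConstDecl`, a different builder type could be substituted without touching the parsing logic, which is presumably why the original takes the builder as a template parameter.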
Next, BytecodeGenerator::generate is called to generate the bytecode; that process will be analyzed in a later installment. For now, let's look at how the following syntax tree node generates bytecode:
JavaScriptCore/bytecompiler/NodesCodegen.cpp:
RegisterID* BooleanNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    if (dst == generator.ignoredResult())
        return 0;
    return generator.emitLoad(dst, m_value);
}
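To see what "a node emits bytecode into a generator" means in practice, here is a toy version that can be compiled on its own. Everything in it is an assumption made for illustration: registers are plain integers, instructions are stored as text, and `Generator`, `kIgnoredResult` and `newTemporary` are hypothetical names rather than JSC's API.

```cpp
#include <string>
#include <vector>

// Sentinel meaning "the caller discards the value" (hypothetical stand-in
// for generator.ignoredResult() in the listing).
const int kIgnoredResult = -1;

// Much-simplified generator: registers are indices and instructions are
// kept as text so the emitted stream is easy to inspect.
struct Generator {
    std::vector<std::string> instructions;
    int nextRegister = 0;

    // Allocates a fresh register index.
    int newTemporary() { return nextRegister++; }

    // Emits "load rN, <bool>" and returns the destination register.
    int emitLoad(int dst, bool value)
    {
        instructions.push_back("load r" + std::to_string(dst) + ", " + (value ? "true" : "false"));
        return dst;
    }
};

struct BooleanNode {
    bool value;

    // Mirrors the listing: emit nothing when the result is ignored.
    // The listing returns a null RegisterID*; this sketch returns the sentinel.
    int emitBytecode(Generator& generator, int dst) const
    {
        if (dst == kIgnoredResult)
            return kIgnoredResult;
        return generator.emitLoad(dst, value);
    }
};
```

Calling `emitBytecode` on a `BooleanNode{true}` with a fresh register appends the single instruction `load r0, true`; calling it with `kIgnoredResult` appends nothing. Skipping the load when nobody consumes the value is the small optimization the original `if` expresses.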
Here are the articles I plan to write:
I. JavaScriptCore lexical analyzer workflow analysis;
II. JavaScriptCore parser workflow analysis;
III. JavaScriptCore bytecode generation process analysis;
IV. LLInt interpreter workflow analysis;
V. Baseline JIT compiler workflow analysis;
VI. DFG JIT compiler workflow analysis;
VII. LLVM virtual machine workflow analysis;
VIII. Future prospects of JavaScriptCore.
The writing here is rough and the expression is poor; I hope to do better in the following articles.
References:
[1] https://www.webkit.org/blog/3271/webkit-css-selector-jit-compiler/
[2] http://blog.csdn.net/horkychen/article/details/8928578
Reprinted from: http://my.oschina.net/coderonline/blog/392971
[WebKit Core] JavaScriptCore In-Depth Analysis--Basics (1): Bytecode Generation and Syntax Tree Construction