[WebKit Core] JavaScriptCore In-Depth Analysis, Basics (1): Bytecode Generation and Syntax Tree Construction

Last Update:2015-03-28 Source: Internet

Author: User



I read horkychen's article "[WebKit] JavaScriptCore Analysis, Basics (3): From Script Code to JIT-Compiled Code" [2]. It is very well written and inspired me a great deal. Here I would like to add some details that it does not cover, such as how bytecode is generated.

JSC handles JavaScript in much the same way that WebKit handles CSS in many respects. The pipeline has the following stages:

(1) Lexical analysis: split the source text into tokens;

(2) Syntax analysis: build the abstract syntax tree (AST);

(3) Traverse the abstract syntax tree to generate bytecode;

(4) Execute the bytecode with the interpreter (LLInt: Low Level Interpreter);

(5) If performance is not good enough, compile the bytecode to machine code with the Baseline JIT and execute that machine code;

(6) If performance is still not good enough, recompile the bytecode with the DFG JIT to produce better machine code and execute that machine code;

(7) Finally, if that is still not enough, use LLVM (Low Level Virtual Machine) to compile the DFG's intermediate code into more highly optimized machine code and execute it.

I will describe this process in a series of articles.
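The tiering in steps (4) to (7) can be pictured as a per-function execution counter that promotes hot code to the next tier. The sketch below is only an illustration under that assumption: the `Tier` enum, the `FunctionProfile` struct and all threshold values are hypothetical and are not taken from JSC, whose real heuristics are more involved.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical execution tiers mirroring steps (4) to (7) above.
enum class Tier { LLInt, BaselineJIT, DFGJIT, LLVMBackend };

// Hypothetical thresholds chosen for illustration only.
constexpr std::uint32_t kBaselineThreshold = 100;
constexpr std::uint32_t kDFGThreshold = 1000;
constexpr std::uint32_t kLLVMThreshold = 100000;

// Per-function bookkeeping: how often it ran and which tier it is on.
struct FunctionProfile {
    std::uint32_t executionCount = 0;
    Tier tier = Tier::LLInt;
};

// Map an execution count to the tier it would qualify for.
inline Tier tierFor(std::uint32_t count)
{
    if (count >= kLLVMThreshold)
        return Tier::LLVMBackend;
    if (count >= kDFGThreshold)
        return Tier::DFGJIT;
    if (count >= kBaselineThreshold)
        return Tier::BaselineJIT;
    return Tier::LLInt;
}

// Record one execution and promote the function if it became hot.
// This sketch never demotes a function to a lower tier.
inline Tier recordExecution(FunctionProfile& profile)
{
    ++profile.executionCount;
    const Tier wanted = tierFor(profile.executionCount);
    if (static_cast<int>(wanted) > static_cast<int>(profile.tier))
        profile.tier = wanted;
    return profile.tier;
}
```

Calling `recordExecution` once per invocation keeps a function in the interpreter until it crosses the first threshold, which is the "if performance is not good enough" condition in the list above expressed as a counter.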

Steps 1 and 2 are similar to CSS parsing, and for steps 3, 4 and 5 the CSS JIT takes a similar approach; see [1]. I am writing about JSC as a beginner, with the persistence of the Foolish Old Man who moved the mountains (Yugong Yishan), hoping to uncover the tip of the JSC iceberg.

This article mainly covers the details of lexical analysis and parsing.

I. Workflow of JavaScriptCore's lexical analyzer

This section walks through the lexing and parsing workflow.

The lexer repeatedly looks for the next token in the input string. For example, when it finds the consecutive characters "true", it creates a TokTrue token. The excerpt below comes from LiteralParser, which JSC uses for JSON-style literals. The lexer works as follows:

JavaScriptCore/runtime/LiteralParser.cpp:

    template <typename CharType>
    template <ParserMode mode> TokenType LiteralParser<CharType>::Lexer::lex(LiteralParserToken<CharType>& token)
    {
        // Skip leading whitespace.
        while (m_ptr < m_end && isJSONWhiteSpace(*m_ptr))
            ++m_ptr;
        // End of input.
        if (m_ptr >= m_end) {
            token.type = TokEOF;
            token.start = token.end = m_ptr;
            return TokEOF;
        }
        token.type = TokError;
        token.start = m_ptr;
        // Dispatch on the first character of the token.
        switch (*m_ptr) {
        case '[':
            token.type = TokLBracket;
            token.end = ++m_ptr;
            return TokLBracket;
        case ']':
            token.type = TokRBracket;
            token.end = ++m_ptr;
            return TokRBracket;
        case '(':
            token.type = TokLParen;
            token.end = ++m_ptr;
            return TokLParen;
        case ')':
            token.type = TokRParen;
            token.end = ++m_ptr;
            return TokRParen;
        case ',':
            token.type = TokComma;
            token.end = ++m_ptr;
            return TokComma;
        case ':':
            token.type = TokColon;
            token.end = ++m_ptr;
            return TokColon;
        case '"':
            return lexString<mode, '"'>(token);
        case 't':
            // Match the literal "true".
            if (m_end - m_ptr >= 4 && m_ptr[1] == 'r' && m_ptr[2] == 'u' && m_ptr[3] == 'e') {
                m_ptr += 4;
                token.type = TokTrue;
                token.end = m_ptr;
                return TokTrue;
            }
            break;
        case '-':
        case '0':
        ...
        case '9':
            return lexNumber(token);
        }
        if (m_ptr < m_end) {
            if (*m_ptr == '.') {
                token.type = TokDot;
                token.end = ++m_ptr;
                return TokDot;
            }
            if (*m_ptr == '=') {
                token.type = TokAssign;
                token.end = ++m_ptr;
                return TokAssign;
            }
            if (*m_ptr == ';') {
                token.type = TokSemi;
                token.end = ++m_ptr;
                return TokAssign; // the quoted source returns TokAssign here although token.type is TokSemi
            }
            if (isASCIIAlpha(*m_ptr) || *m_ptr == '_' || *m_ptr == '$')
                return lexIdentifier(token);
            if (*m_ptr == '\'') {
                return lexString<mode, '\''>(token);
            }
        }
        m_lexErrorMessage = String::format("Unrecognized token '%c'", *m_ptr).impl();
        return TokError;
    }
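To make the control flow easier to follow, here is a minimal standalone sketch of the same idea: skip whitespace, dispatch on the first character, and advance the read position past the token. It is not JSC code. `MiniLexer`, `MiniToken`, `MiniTokenType` and `lexAll` are hypothetical names, and only a handful of token kinds are handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds; a small subset of what the excerpt handles.
enum class MiniTokenType { LBracket, RBracket, Comma, True, Number, EndOfInput, Error };

struct MiniToken {
    MiniTokenType type;
    std::string text; // the characters covered by the token
};

class MiniLexer {
public:
    explicit MiniLexer(const std::string& source) : m_source(source), m_pos(0) {}

    // Return the next token and advance past it.
    MiniToken lex()
    {
        // Skip leading whitespace, as the excerpt does with isJSONWhiteSpace.
        while (m_pos < m_source.size() && std::isspace(static_cast<unsigned char>(m_source[m_pos])))
            ++m_pos;
        if (m_pos >= m_source.size())
            return { MiniTokenType::EndOfInput, "" };

        const std::size_t start = m_pos;
        const char c = m_source[m_pos];
        switch (c) {
        case '[': ++m_pos; return { MiniTokenType::LBracket, "[" };
        case ']': ++m_pos; return { MiniTokenType::RBracket, "]" };
        case ',': ++m_pos; return { MiniTokenType::Comma, "," };
        case 't':
            // Match the four characters "true" starting at the current position.
            if (m_source.compare(m_pos, 4, "true") == 0) {
                m_pos += 4;
                return { MiniTokenType::True, "true" };
            }
            break;
        default:
            break;
        }
        if (std::isdigit(static_cast<unsigned char>(c))) {
            // Consume a run of decimal digits.
            while (m_pos < m_source.size() && std::isdigit(static_cast<unsigned char>(m_source[m_pos])))
                ++m_pos;
            return { MiniTokenType::Number, m_source.substr(start, m_pos - start) };
        }
        ++m_pos; // step over the bad character so callers always make progress
        return { MiniTokenType::Error, std::string(1, c) };
    }

private:
    std::string m_source;
    std::size_t m_pos;
};

// Lex the whole input, stopping after the end-of-input or first error token.
inline std::vector<MiniToken> lexAll(const std::string& source)
{
    MiniLexer lexer(source);
    std::vector<MiniToken> tokens;
    for (;;) {
        MiniToken token = lexer.lex();
        tokens.push_back(token);
        if (token.type == MiniTokenType::EndOfInput || token.type == MiniTokenType::Error)
            break;
    }
    return tokens;
}
```

For instance, `lexAll("[true, 42]")` yields a left bracket, a true token, a comma, a number, a right bracket and an end-of-input token, in that order.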

Through this process, the lexer produces complete tokens that JSC can work with. The parser then consumes these tokens to build the abstract syntax tree.

JavaScriptCore/parser/Parser.cpp:

    PassRefPtr<ParsedNode> Parser<LexerType>::parse(JSGlobalObject* lexicalGlobalObject, Debugger* debugger, ExecState* debuggerExecState, JSObject** exception)
    {
        ASSERT(lexicalGlobalObject);
        ASSERT(exception && !*exception);
        int errLine;
        UString errMsg;
        if (ParsedNode::scopeIsFunction)
            m_lexer->setIsReparsing();
        m_sourceElements = 0;
        errLine = -1;
        errMsg = UString();
        UString parseError = parseInner();
        ...
    }

    UString Parser<LexerType>::parseInner()
    {
        UString parseError = UString();
        unsigned oldFunctionCacheSize = m_functionCache ? m_functionCache->byteSize() : 0;
        // The abstract syntax tree builder:
        ASTBuilder context(const_cast<JSGlobalData*>(m_globalData), const_cast<SourceCode*>(m_source));
        if (m_lexer->isReparsing())
            m_statementDepth--;
        ScopeRef scope = currentScope();
        // Start parsing and build the nodes of the syntax tree:
        SourceElements* sourceElements = parseSourceElements<CheckForStrictMode>(context);
        if (!sourceElements || !consume(EOFTOK))
            ... // the rest of the function is omitted in this excerpt
    }
For example, when JSC decides from the token type that the input is a constant declaration, it generates the syntax nodes with the following template function and stores them in the ASTBuilder:

JavaScriptCore/parser/Parser.cpp:

    template <typename LexerType>
    template <class TreeBuilder> TreeConstDeclList Parser<LexerType>::parseConstDeclarationList(TreeBuilder& context)
    {
        failIfTrue(strictMode());
        TreeConstDeclList constDecls = 0;
        TreeConstDeclList tail = 0;
        do {
            next();
            matchOrFail(IDENT);
            const Identifier* name = m_token.m_data.ident;
            next();
            bool hasInitializer = match(EQUAL);
            declareVariable(name);
            context.addVar(name, DeclarationStacks::IsConstant | (hasInitializer ? DeclarationStacks::HasInitializer : 0));
            TreeExpression initializer = 0;
            if (hasInitializer) {
                next(TreeBuilder::DontBuildStrings); // consume '='
                initializer = parseAssignmentExpression(context);
            }
            // Append the new declaration to the end of the list.
            tail = context.appendConstDecl(m_lexer->lastLineNumber(), tail, name, initializer);
            if (!constDecls)
                constDecls = tail;
        } while (match(COMMA));
        return constDecls;
    }
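The loop above follows a common recursive-descent pattern: expect an identifier, optionally consume `=` and an initializer, append a node, and repeat while the next token is a comma. The standalone sketch below shows that pattern on a pre-lexed token vector. It is a simplification, not JSC code: `Tok`, `Token`, `ConstDecl` and `DeclParser` are hypothetical names, and the initializer is restricted to a single number token instead of a full assignment expression.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds for this sketch.
enum class Tok { Ident, Equal, Comma, Number, End };

struct Token {
    Tok type;
    std::string text;
};

// Hypothetical syntax node for one declared constant.
struct ConstDecl {
    std::string name;
    bool hasInitializer = false;
    std::string initializer; // kept as raw text in this sketch
};

class DeclParser {
public:
    explicit DeclParser(std::vector<Token> tokens) : m_tokens(std::move(tokens))
    {
        m_tokens.push_back({ Tok::End, "" }); // sentinel so current() is always valid
    }

    // Grammar: Ident ['=' Number] { ',' Ident ['=' Number] }
    // Appends one ConstDecl per declaration; returns false on a syntax error.
    bool parseConstDeclarationList(std::vector<ConstDecl>& out)
    {
        for (;;) {
            if (!match(Tok::Ident))
                return false;
            ConstDecl decl;
            decl.name = current().text;
            next();
            if (match(Tok::Equal)) {
                next(); // consume '='
                if (!match(Tok::Number))
                    return false;
                decl.hasInitializer = true;
                decl.initializer = current().text;
                next();
            }
            out.push_back(decl);
            if (!match(Tok::Comma))
                break;
            next(); // consume ','
        }
        return match(Tok::End);
    }

private:
    const Token& current() const { return m_tokens[m_pos]; }
    bool match(Tok type) const { return current().type == type; }
    void next()
    {
        if (m_pos + 1 < m_tokens.size())
            ++m_pos;
    }

    std::vector<Token> m_tokens;
    std::size_t m_pos = 0;
};
```

Feeding it the tokens for `a = 1, b` produces two nodes, the first with an initializer and the second without, which corresponds to the `hasInitializer` flag in the excerpt.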

Next, BytecodeGenerator::generate is called to generate the bytecode; that process will be analyzed in a later article. For now, here is how one of the syntax tree nodes built from the JavaScript source emits its bytecode:

JavaScriptCore/bytecompiler/NodesCodegen.cpp:

    RegisterID* BooleanNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
    {
        // Nothing to emit when the caller does not need the result.
        if (dst == generator.ignoredResult())
            return 0;
        return generator.emitLoad(dst, m_value);
    }
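The pattern in this function, where each node emits instructions into a generator and returns the register holding its result, can be sketched in a few dozen lines. Everything below is a hypothetical illustration rather than JSC's design: `MiniGenerator`, `Instruction`, `Op`, `kIgnoredResult`, the node types and the `run` helper are invented for this example, and registers are plain integer indices instead of `RegisterID` objects.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical opcodes and instruction layout.
enum class Op { LoadBool, LoadInt, Add };

struct Instruction {
    Op op;
    int dst; // destination register
    int a;   // immediate value (loads) or first source register (Add)
    int b;   // second source register (Add), unused otherwise
};

// Plays the role of ignoredResult(): the caller does not want the value.
constexpr int kIgnoredResult = -1;

class MiniGenerator {
public:
    int newTemporary() { return m_nextRegister++; }
    int emitLoadBool(int dst, bool value) { m_code.push_back({ Op::LoadBool, dst, value ? 1 : 0, 0 }); return dst; }
    int emitLoadInt(int dst, int value) { m_code.push_back({ Op::LoadInt, dst, value, 0 }); return dst; }
    int emitAdd(int dst, int lhs, int rhs) { m_code.push_back({ Op::Add, dst, lhs, rhs }); return dst; }
    const std::vector<Instruction>& code() const { return m_code; }
    int registerCount() const { return m_nextRegister; }

private:
    std::vector<Instruction> m_code;
    int m_nextRegister = 0;
};

struct Node {
    virtual ~Node() = default;
    // Emit bytecode computing this node into dst; return the result register.
    virtual int emitBytecode(MiniGenerator& generator, int dst) const = 0;
};

struct BooleanNode : Node {
    explicit BooleanNode(bool v) : value(v) {}
    int emitBytecode(MiniGenerator& generator, int dst) const override
    {
        if (dst == kIgnoredResult)
            return kIgnoredResult; // a constant has no side effects, so emit nothing
        return generator.emitLoadBool(dst, value);
    }
    bool value;
};

struct NumberNode : Node {
    explicit NumberNode(int v) : value(v) {}
    int emitBytecode(MiniGenerator& generator, int dst) const override
    {
        if (dst == kIgnoredResult)
            return kIgnoredResult;
        return generator.emitLoadInt(dst, value);
    }
    int value;
};

struct AddNode : Node {
    AddNode(std::unique_ptr<Node> l, std::unique_ptr<Node> r) : lhs(std::move(l)), rhs(std::move(r)) {}
    int emitBytecode(MiniGenerator& generator, int dst) const override
    {
        if (dst == kIgnoredResult)
            return kIgnoredResult; // operands in this sketch are side-effect free
        const int lhsReg = lhs->emitBytecode(generator, generator.newTemporary());
        const int rhsReg = rhs->emitBytecode(generator, generator.newTemporary());
        return generator.emitAdd(dst, lhsReg, rhsReg);
    }
    std::unique_ptr<Node> lhs;
    std::unique_ptr<Node> rhs;
};

// A tiny register interpreter for the emitted instructions.
inline std::vector<int> run(const MiniGenerator& generator)
{
    std::vector<int> regs(static_cast<std::size_t>(generator.registerCount()), 0);
    for (const Instruction& i : generator.code()) {
        const std::size_t dst = static_cast<std::size_t>(i.dst);
        switch (i.op) {
        case Op::LoadBool:
        case Op::LoadInt:
            regs[dst] = i.a;
            break;
        case Op::Add:
            regs[dst] = regs[static_cast<std::size_t>(i.a)] + regs[static_cast<std::size_t>(i.b)];
            break;
        }
    }
    return regs;
}
```

Emitting an `AddNode` over two `NumberNode` operands produces two loads followed by one add, and passing `kIgnoredResult` as the destination emits nothing, which is the same early return seen in `BooleanNode::emitBytecode` above.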

Here are the articles I plan to write:

I. Workflow of JavaScriptCore's lexical analyzer;

II. Workflow of JavaScriptCore's parser;

III. JavaScriptCore's bytecode generation process;

IV. Workflow of the LLInt interpreter;

V. Workflow of the Baseline JIT compiler;

VI. Workflow of the DFG JIT compiler;

VII. Workflow of the LLVM virtual machine;

VIII. Future prospects of JavaScriptCore.

The writing is rough and the explanations are imperfect; I hope to do better in later articles.


References:

[1] https://www.webkit.org/blog/3271/webkit-css-selector-jit-compiler/

[2] http://blog.csdn.net/horkychen/article/details/8928578

Reprinted from: http://my.oschina.net/coderonline/blog/392971
