[WebKit Core] JavaScriptCore In-Depth Analysis, Basics (1): Bytecode Generation and Syntax Tree Construction

Source: Internet
Author: User

I recently read horkychen's article "[WebKit] JavaScriptCore Analysis, Basics (3): From Script Code to JIT-Compiled Code" [2]. It is very well written and taught me a lot. Here I would like to add some details that it does not cover, such as how bytecode is generated.

JSC handles JavaScript in much the same way that WebKit handles CSS in many respects. The pipeline has the following stages:

(1) Lexical analysis: split the source text into tokens;

(2) Syntax analysis: build an abstract syntax tree (AST);

(3) Traverse the abstract syntax tree to generate bytecode;

(4) Execute the bytecode with the interpreter (LLInt: Low Level Interpreter);

(5) If performance is not good enough, compile the bytecode into machine code with the Baseline JIT and execute that machine code;

(6) If performance is still not good enough, recompile the bytecode with the DFG JIT to generate better machine code and execute it;

(7) Finally, if that is still not enough, use LLVM (Low Level Virtual Machine) to compile the DFG intermediate code into even more highly optimized machine code and execute it. I will describe this process in a series of articles.

Steps 1 and 2 are similar to what WebKit does for CSS, and the idea behind steps 3, 4, and 5 is also used by the CSS JIT; see [1]. In writing these JSC articles I hope, with a beginner's persistence in the spirit of Yugong moving the mountains, to uncover the tip of the JSC iceberg.
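The tier-up idea in steps (4) to (7) can be sketched roughly as follows. This is only an illustration: the `Tier` enum, the `selectTier` function, the use of an execution counter as the trigger, and the threshold values are all assumptions made for this sketch, not JavaScriptCore's actual names or numbers.

```cpp
#include <cstdint>

// Hypothetical tier names mirroring steps (4) to (7) above.
enum class Tier { LLInt, BaselineJIT, DFGJIT, LLVMBackend };

// Illustrative thresholds only; these are NOT JavaScriptCore's real values.
const std::uint32_t kBaselineThreshold = 100;
const std::uint32_t kDFGThreshold = 1000;
const std::uint32_t kLLVMThreshold = 100000;

// Pick a tier for a function from how often it has run (an assumed trigger).
// Hotter code is handed to a slower compiler that emits faster machine code.
inline Tier selectTier(std::uint32_t executionCount)
{
    if (executionCount >= kLLVMThreshold)
        return Tier::LLVMBackend;
    if (executionCount >= kDFGThreshold)
        return Tier::DFGJIT;
    if (executionCount >= kBaselineThreshold)
        return Tier::BaselineJIT;
    return Tier::LLInt;
}
```

With these made-up thresholds, `selectTier(5)` stays in the interpreter, while `selectTier(5000)` selects the DFG tier. The point is only that cheap tiers run cold code and expensive tiers are reserved for hot code.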

This article mainly describes the details of lexical and syntactic parsing.

I. Analysis of the JavaScriptCore lexical analyzer workflow

This section walks through the lexing and parsing workflow.

The tokenizer works by repeatedly extracting the next token from the input string. For example, when it finds the consecutive characters "true", it creates a TokTrue token. The lexer works as follows:

JavaScriptCore/runtime/LiteralParser.cpp:

```cpp
template <typename CharType>
template <ParserMode mode> TokenType LiteralParser<CharType>::Lexer::lex(LiteralParserToken<CharType>& token)
{
    // Skip leading whitespace.
    while (m_ptr < m_end && isJSONWhiteSpace(*m_ptr))
        ++m_ptr;

    // Nothing left: report end of input.
    if (m_ptr >= m_end) {
        token.type = TokEOF;
        token.start = token.end = m_ptr;
        return TokEOF;
    }

    // Assume an error until a token is recognized.
    token.type = TokError;
    token.start = m_ptr;
    switch (*m_ptr) {
    case '[':
        token.type = TokLBracket;
        token.end = ++m_ptr;
        return TokLBracket;
    case ']':
        token.type = TokRBracket;
        token.end = ++m_ptr;
        return TokRBracket;
    case '(':
        token.type = TokLParen;
        token.end = ++m_ptr;
        return TokLParen;
    case ')':
        token.type = TokRParen;
        token.end = ++m_ptr;
        return TokRParen;
    case ',':
        token.type = TokComma;
        token.end = ++m_ptr;
        return TokComma;
    case ':':
        token.type = TokColon;
        token.end = ++m_ptr;
        return TokColon;
    case '"':
        return lexString<mode, '"'>(token);
    case 't':
        // Recognize the literal "true" character by character.
        if (m_end - m_ptr >= 4 && m_ptr[1] == 'r' && m_ptr[2] == 'u' && m_ptr[3] == 'e') {
            m_ptr += 4;
            token.type = TokTrue;
            token.end = m_ptr;
            return TokTrue;
        }
        break;
    case '-':
    case '0':
    // ...
    case '9':
        return lexNumber(token);
    }

    if (m_ptr < m_end) {
        if (*m_ptr == '.') {
            token.type = TokDot;
            token.end = ++m_ptr;
            return TokDot;
        }
        if (*m_ptr == '=') {
            token.type = TokAssign;
            token.end = ++m_ptr;
            return TokAssign;
        }
        if (*m_ptr == ';') {
            token.type = TokSemi;
            token.end = ++m_ptr;
            // As quoted: this returns TokAssign although token.type is TokSemi,
            // which looks like a bug (TokSemi would be expected here).
            return TokAssign;
        }
        if (isASCIIAlpha(*m_ptr) || *m_ptr == '_' || *m_ptr == '$')
            return lexIdentifier(token);
        if (*m_ptr == '\'') {
            return lexString<mode, '\''>(token);
        }
    }
    m_lexErrorMessage = String::format("Unrecognized token '%c'", *m_ptr).impl();
    return TokError;
}
```
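To see the same scanning idea in isolation, here is a minimal, self-contained sketch of a tokenizer. It is not JSC code: the `Tok` enum, the `Token` struct, and the `lexOne` and `tokenize` functions are hypothetical names invented for this example, and it handles only a tiny subset of tokens (brackets, commas, `true`, and unsigned integers).

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds; a small subset of what the lexer above handles.
enum class Tok { LBracket, RBracket, Comma, True, Number, End, Error };

struct Token {
    Tok type;
    std::size_t start; // offset of the first character
    std::size_t end;   // offset one past the last character
};

// Scan one token starting at pos, advancing pos past it.
inline Token lexOne(const std::string& src, std::size_t& pos)
{
    // Skip leading whitespace.
    while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos])))
        ++pos;
    if (pos >= src.size())
        return { Tok::End, pos, pos };

    const std::size_t start = pos;
    switch (src[pos]) {
    case '[': ++pos; return { Tok::LBracket, start, pos };
    case ']': ++pos; return { Tok::RBracket, start, pos };
    case ',': ++pos; return { Tok::Comma, start, pos };
    case 't':
        // Same idea as the 't' case above: check that "true" follows.
        if (src.compare(pos, 4, "true") == 0) {
            pos += 4;
            return { Tok::True, start, pos };
        }
        break;
    default:
        if (std::isdigit(static_cast<unsigned char>(src[pos]))) {
            while (pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos])))
                ++pos;
            return { Tok::Number, start, pos };
        }
        break;
    }
    return { Tok::Error, start, start };
}

// Tokenize the whole input; stops after End or the first Error.
inline std::vector<Token> tokenize(const std::string& src)
{
    std::vector<Token> tokens;
    std::size_t pos = 0;
    for (;;) {
        Token t = lexOne(src, pos);
        tokens.push_back(t);
        if (t.type == Tok::End || t.type == Tok::Error)
            return tokens;
    }
}
```

Calling `tokenize("[true, 42]")` yields the sequence LBracket, True, Comma, Number, RBracket, End. As in the JSC lexer, each token records only its kind and the range of characters it covers; the parser decides what the sequence means.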

This process yields one complete token in JSC's terms. Next, the parser performs syntax analysis to build the abstract syntax tree.

JavaScriptCore/parser/Parser.cpp:

```cpp
PassRefPtr<ParsedNode> Parser<LexerType>::parse(JSGlobalObject* lexicalGlobalObject, Debugger* debugger, ExecState* debuggerExecState, JSObject** exception)
{
    ASSERT(lexicalGlobalObject);
    ASSERT(exception && !*exception);
    int errLine;
    UString errMsg;
    if (ParsedNode::scopeIsFunction)
        m_lexer->setIsReparsing();
    m_sourceElements = 0;
    errLine = -1;
    errMsg = UString();
    UString parseError = parseInner();
    // ...
```

```cpp
UString Parser<LexerType>::parseInner()
{
    UString parseError = UString();
    unsigned oldFunctionCacheSize = m_functionCache ? m_functionCache->byteSize() : 0;
    // Abstract syntax tree builder:
    ASTBuilder context(const_cast<JSGlobalData*>(m_globalData), const_cast<SourceCode*>(m_source));
    if (m_lexer->isReparsing())
        m_statementDepth--;
    ScopeRef scope = currentScope();
    // Start parsing and build the nodes of the syntax tree:
    SourceElements* sourceElements = parseSourceElements<CheckForStrictMode>(context);
    if (!sourceElements || !consume(EOFTOK))
    // ...
```


For example, when JSC determines from the token type that the input is a constant declaration, it generates the syntax nodes with the following template function and hands them to the ASTBuilder:

JavaScriptCore/parser/Parser.cpp:

```cpp
template <typename LexerType>
template <class TreeBuilder> TreeConstDeclList Parser<LexerType>::parseConstDeclarationList(TreeBuilder& context)
{
    failIfTrue(strictMode());
    TreeConstDeclList constDecls = 0;
    TreeConstDeclList tail = 0;
    do {
        next();
        matchOrFail(IDENT);
        const Identifier* name = m_token.m_data.ident;
        next();
        bool hasInitializer = match(EQUAL);
        declareVariable(name);
        context.addVar(name, DeclarationStacks::IsConstant | (hasInitializer ? DeclarationStacks::HasInitializer : 0));
        TreeExpression initializer = 0;
        if (hasInitializer) {
            next(TreeBuilder::DontBuildStrings); // consume '='
            initializer = parseAssignmentExpression(context);
        }
        // Append this declaration to the list and remember the head.
        tail = context.appendConstDecl(m_lexer->lastLineNumber(), tail, name, initializer);
        if (!constDecls)
            constDecls = tail;
    } while (match(COMMA));
    return constDecls;
}
```
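The loop above follows a common pattern: read a name, optionally read an initializer, append a node to the tail of a list, and repeat while a comma follows. Below is a minimal sketch of that pattern under simplifying assumptions. The `ConstDecl` struct and this standalone `parseConstDeclarationList` are hypothetical; the input is a pre-split vector of token strings, and an initializer is a single token rather than a full assignment expression.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical AST node for one declaration in `const a = 1, b`.
struct ConstDecl {
    std::string name;
    bool hasInitializer;
    std::string initializer;        // raw token text; a real parser stores an expression node
    std::unique_ptr<ConstDecl> next;
};

// Parse tokens such as {"const", "a", "=", "1", ",", "b"}.
// Returns the head of the declaration list, or nullptr on a syntax error.
inline std::unique_ptr<ConstDecl> parseConstDeclarationList(const std::vector<std::string>& tokens)
{
    if (tokens.empty() || tokens[0] != "const")
        return nullptr;

    std::unique_ptr<ConstDecl> head;
    ConstDecl* tail = nullptr;
    std::size_t i = 0; // index of the current token: "const" first, later each ","
    do {
        ++i; // step past "const" or ",", like next() in the original
        if (i >= tokens.size() || tokens[i] == "," || tokens[i] == "=")
            return nullptr; // an identifier is required here
        std::unique_ptr<ConstDecl> decl(new ConstDecl{ tokens[i], false, "", nullptr });
        ++i;
        if (i < tokens.size() && tokens[i] == "=") {
            ++i; // consume '='
            if (i >= tokens.size() || tokens[i] == ",")
                return nullptr; // missing initializer
            decl->hasInitializer = true;
            decl->initializer = tokens[i];
            ++i;
        }
        // Append to the tail and remember the head, as the original does.
        ConstDecl* raw = decl.get();
        if (!head)
            head = std::move(decl);
        else
            tail->next = std::move(decl);
        tail = raw;
    } while (i < tokens.size() && tokens[i] == ",");

    if (i != tokens.size())
        return nullptr; // unexpected trailing tokens
    return head;
}
```

For the input `{"const", "a", "=", "1", ",", "b"}` this produces a two-node list: `a` with initializer `1`, followed by `b` without an initializer.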

Next, BytecodeGenerator::generate is called to generate the bytecode; that step will be analyzed in a later installment. For now, let's look at how one JavaScript syntax tree node emits its bytecode:

JavaScriptCore/bytecompiler/NodesCodegen.cpp:

```cpp
RegisterID* BooleanNode::emitBytecode(BytecodeGenerator& generator, RegisterID* dst)
{
    // If the caller ignores the result, there is nothing to emit.
    if (dst == generator.ignoredResult())
        return 0;
    return generator.emitLoad(dst, m_value);
}
```
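The shape of this function (skip emission when the result is ignored, otherwise append a load instruction and return the destination register) can be sketched with toy types. Everything here is hypothetical and heavily simplified: `ToyGenerator`, `ToyBooleanNode`, `Instruction`, and `kIgnoredResult` are stand-ins invented for this example, and registers are plain integers rather than `RegisterID` objects.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, heavily simplified stand-ins for JSC's classes.
enum class Opcode : std::uint8_t { LoadBool };

struct Instruction {
    Opcode op;
    int dst;    // destination register index
    bool value; // immediate operand
};

// Sentinel register meaning "the caller does not need the result".
const int kIgnoredResult = -1;

class ToyGenerator {
public:
    // Append a load instruction and return the register that receives the value.
    int emitLoad(int dst, bool value)
    {
        m_code.push_back({ Opcode::LoadBool, dst, value });
        return dst;
    }

    const std::vector<Instruction>& code() const { return m_code; }

private:
    std::vector<Instruction> m_code;
};

struct ToyBooleanNode {
    bool value;

    // Mirrors the control flow of BooleanNode::emitBytecode above.
    int emitBytecode(ToyGenerator& generator, int dst) const
    {
        if (dst == kIgnoredResult)
            return kIgnoredResult; // nothing to emit
        return generator.emitLoad(dst, value);
    }
};
```

Emitting a `ToyBooleanNode{true}` into register 3 appends one `LoadBool` instruction, while emitting with `kIgnoredResult` appends nothing. Each AST node type contributes its own small piece of bytecode in this way as the tree is traversed.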

Here are the articles I plan to write:

I. Analysis of the JavaScriptCore lexical analyzer workflow;

II. Analysis of the JavaScriptCore parser workflow;

III. Analysis of the JavaScriptCore bytecode generation process;

IV. Analysis of the LLInt interpreter workflow;

V. Analysis of the Baseline JIT compiler workflow;

VI. Analysis of the DFG JIT compiler workflow;

VII. Analysis of the LLVM virtual machine workflow;

VIII. Future prospects for JavaScriptCore.

This is a rough write-up and my expression is limited; I hope to do better in the articles that follow.


References:

[1] https://www.webkit.org/blog/3271/webkit-css-selector-jit-compiler/

[2] http://blog.csdn.net/horkychen/article/details/8928578

Reprinted from: http://my.oschina.net/coderonline/blog/392971
