Parsing Converter 3: A Handwritten PHP-to-Python Compiler, Lexical Analysis Part

Source: Internet
Author: User
Tags: php, compiler

On a whim, I wanted to build something big: convert an entire PHP program into Python. This is not like templates, where you can be lazy and get by with regular-expression matching. This time there is no way around writing a PHP compiler.

Searching online, I found that most Python-to-xxx transpilers work directly on the AST. They skip the most important parts, the tokenizer and the parser, and just write a visitor. The rest are built on a generator such as ANTLR, which produces a large amount of code that is tiresome to read.
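To illustrate what "write a visitor directly" means, here is a minimal sketch using Python's standard `ast` module, where the tokenizer and parser already exist inside `ast.parse`. The class name `CallCollector` and the helper `called_functions` are made up for this example; it only collects the names of called functions and is not a transpiler.

```python
import ast


class CallCollector(ast.NodeVisitor):
    """Hypothetical visitor: records the names of plainly named function calls."""

    def __init__(self):
        self.calls = []

    def visit_Call(self, node):
        # Only simple names such as print(...); attribute calls are ignored here.
        if isinstance(node.func, ast.Name):
            self.calls.append(node.func.id)
        # Keep walking so that nested calls are visited too.
        self.generic_visit(node)


def called_functions(source):
    # ast.parse does the tokenizing and parsing; the visitor only walks the tree.
    tree = ast.parse(source)
    collector = CallCollector()
    collector.visit(tree)
    return collector.calls
```

No such ready-made AST is available for PHP source in this project, which is why the tokenizer and parser have to be written by hand.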

Since I would rather not do that kind of grunt work, I decided to try writing a PHP compiler by hand. It is split into three parts: a tokenizer, a parser, and a visitor.

Using the "Dragon Book" and the "Tiger Book" as references, I went back and studied PHP carefully. You don't know until you look: PHP has so many features that writing a compiler for it is really tiring.

The lexical part is simple: it is an automaton. I designed a structure to store the automaton and then drove it with plain, brute-force code, without worrying about performance. It just has to work.

Writing it went quickly. Debugging did not go so smoothly, but I won't go into that, ha.

The automaton is not complicated. I'm posting it here for everyone to look at; corrections are welcome.


# State table of the lexer automaton (set up inside the tokenizer object).
# Each top-level key except 'current' is a state; its list holds the rules for that state.
# 'token' is the regular expression to match and 'next' is the state to switch to.
# 'extra', 'start', 'end' and 'cache' are bookkeeping fields for the driver code,
# which is not shown in this post.
self.statemachine = {
    'current': {'state': 'default', 'content': '', 'line': 0},
    'default': [
        {'name': 'open', 'next': 'php', 'extra': 0, 'start': 0, 'end': 0, 'cache': '', 'token': r'<\?'},
        {'name': 'open', 'next': 'php', 'extra': 0, 'start': 0, 'end': 0, 'cache': '', 'token': r'<\?php'},
    ],
    'php': [
        {'name': 'close', 'next': 'default', 'extra': 0, 'token': r'\?>', 'start': 0, 'end': 0, 'cache': ''},
        {'name': 'lnum', 'next': '', 'extra': 0, 'start': 0, 'end': 0, 'cache': '', 'token': r'[0-9]+'},
        {'name': 'dnum', 'next': '', 'extra': 0, 'start': 0, 'end': 0, 'cache': '',
         'token': r'([0-9]*\.[0-9]+)|([0-9]+\.[0-9]*)'},
        {'name': 'exponent', 'next': '', 'extra': 0, 'start': 0, 'end': 0, 'cache': '',
         'token': r'(([0-9]+|([0-9]*\.[0-9]+)|([0-9]+\.[0-9]*))[eE][+-]?[0-9]+)'},
        {'name': 'hnum', 'next': '', 'extra': 0, 'start': 0, 'end': 0, 'cache': '', 'token': r'0x[0-9a-fA-F]+'},
        {'name': 'bnum', 'next': '', 'extra': 0, 'start': 0, 'end': 0, 'cache': '', 'token': r'0b[01]+'},
        {'name': 'label', 'next': '', 'extra': 0, 'start': 0, 'end': 0, 'cache': '',
         'token': r'[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*'},
        {'name': 'comment', 'next': 'commentline', 'extra': 1, 'token': r'//', 'start': 0, 'end': 0, 'cache': ''},
        {'name': 'comment', 'next': 'commentline', 'extra': 1, 'token': r'#', 'start': 0, 'end': 0, 'cache': ''},
        {'name': 'comment', 'next': 'comment', 'extra': 1, 'token': r'/\*', 'start': 0, 'end': 0, 'cache': ''},
        {'name': 'string', 'next': 'string1', 'extra': 1, 'token': r'\'', 'start': 0, 'end': 0, 'cache': ''},
        {'name': 'string', 'next': 'string2', 'extra': 1, 'token': r'"', 'start': 0, 'end': 0, 'cache': ''},
        {'name': 'symbol', 'next': '', 'extra': 0, 'start': 0, 'end': 0, 'cache': '',
         'token': r'[\\\{\};:,\.\[\]\(\)\|\^&\+-/\*=%!~$<>\?@]'},
    ],
    # Inside a single-quoted string.
    'string1': [
        {'name': 'string', 'next': 'php', 'extra': 0, 'token': r'\'', 'start': 0, 'end': 0, 'cache': ''},
        {'name': 'string', 'next': 'escape1', 'extra': 1, 'token': r'\\', 'start': 0, 'end': 0, 'cache': ''},
        {'name': 'string', 'next': '', 'extra': 1, 'token': r'', 'start': 0, 'end': 0, 'cache': ''},
    ],
    'escape1': [
        {'name': 'string', 'next': 'string1', 'extra': 1, 'token': r'.', 'start': 0, 'end': 0, 'cache': ''},
    ],
    # Inside a double-quoted string.
    'string2': [
        {'name': 'string', 'next': 'php', 'extra': 0, 'token': r'"', 'start': 0, 'end': 0, 'cache': ''},
        {'name': 'string', 'next': 'escape2', 'extra': 1, 'token': r'\\', 'start': 0, 'end': 0, 'cache': ''},
        {'name': 'string', 'next': '', 'extra': 1, 'token': r'', 'start': 0, 'end': 0, 'cache': ''},
    ],
    'escape2': [
        {'name': 'string', 'next': 'string2', 'extra': 1, 'token': r'.', 'start': 0, 'end': 0, 'cache': ''},
    ],
    # Inside a // or # comment, up to the end of the line.
    'commentline': [
        {'name': 'comment', 'next': 'php', 'extra': 0, 'token': r'(\r|\n|\r\n)', 'start': 0, 'end': 0, 'cache': ''},
        {'name': 'comment', 'next': 'php', 'extra': 0, 'token': r'', 'start': 0, 'end': 0, 'cache': ''},
    ],
    # Inside a /* ... */ comment.
    'comment': [
        {'name': 'comment', 'next': 'php', 'extra': 0, 'token': r'\*/', 'start': 0, 'end': 0, 'cache': ''},
        {'name': 'comment', 'next': '', 'extra': 1, 'token': r'', 'start': 0, 'end': 0, 'cache': ''},
    ],
}
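
The code that drives this table is not included in the post. As a rough sketch only, here is one way a table of this shape could be consumed. Everything about its behavior is an assumption, not the author's implementation: `RULES` is a reduced, hypothetical copy of the table, `tokenize` is a made-up name, the longest match wins, `extra` equal to 1 is taken to mean "keep accumulating into the current token", an empty `token` is taken to mean "any other single character", and characters that match no rule (whitespace, inline HTML) are skipped.

```python
import re

# Reduced, hypothetical copy of the rule table; only the fields this sketch reads.
RULES = {
    "default": [
        {"name": "open", "next": "php", "extra": 0, "token": r"<\?"},
        {"name": "open", "next": "php", "extra": 0, "token": r"<\?php"},
    ],
    "php": [
        {"name": "close", "next": "default", "extra": 0, "token": r"\?>"},
        {"name": "lnum", "next": "", "extra": 0, "token": r"[0-9]+"},
        {"name": "dnum", "next": "", "extra": 0,
         "token": r"([0-9]*\.[0-9]+)|([0-9]+\.[0-9]*)"},
        {"name": "label", "next": "", "extra": 0,
         "token": r"[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*"},
        {"name": "comment", "next": "commentline", "extra": 1, "token": r"//"},
        {"name": "string", "next": "string1", "extra": 1, "token": r"'"},
        {"name": "symbol", "next": "", "extra": 0,
         "token": r"[\\\{\};:,\.\[\]\(\)\|\^&\+-/\*=%!~$<>\?@]"},
    ],
    "string1": [
        {"name": "string", "next": "php", "extra": 0, "token": r"'"},
        {"name": "string", "next": "escape1", "extra": 1, "token": r"\\"},
        {"name": "string", "next": "", "extra": 1, "token": r""},
    ],
    "escape1": [
        {"name": "string", "next": "string1", "extra": 1, "token": r"."},
    ],
    "commentline": [
        {"name": "comment", "next": "php", "extra": 0, "token": r"\r\n|\r|\n"},
        {"name": "comment", "next": "", "extra": 1, "token": r""},
    ],
}


def tokenize(source, rules=RULES):
    """Return (name, text) pairs. All semantics here are assumptions of this sketch."""
    state, pos, pending, last_name = "default", 0, "", ""
    tokens = []
    while pos < len(source):
        best, best_len, fallback = None, 0, None
        for rule in rules[state]:
            if rule["token"] == "":
                fallback = rule  # assumed: empty pattern = "any other character"
                continue
            m = re.compile(rule["token"], re.DOTALL).match(source, pos)
            # Longest match wins; on a tie the earlier rule is kept.
            if m and m.end() - pos > best_len:
                best, best_len = rule, m.end() - pos
        if best is None and fallback is not None:
            best, best_len = fallback, 1
        if best is None:
            pos += 1  # nothing matched: skip whitespace or inline HTML
            continue
        pending += source[pos:pos + best_len]
        pos += best_len
        last_name = best["name"]
        if best["extra"] == 0:  # assumed: 0 = token complete, 1 = keep accumulating
            tokens.append((last_name, pending))
            pending = ""
        if best["next"]:
            state = best["next"]
    if pending:  # unterminated string or comment at end of input
        tokens.append((last_name, pending))
    return tokens
```

Under these assumptions, `tokenize("<? 3.14 42 ?>")` returns `[("open", "<?"), ("dnum", "3.14"), ("lnum", "42"), ("close", "?>")]`. Choosing the longest match is what lets `<\?php` win over `<\?` and `dnum` win over `lnum` even though the shorter patterns are listed first.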