Design and Implementation of a Lexical and Syntax Analysis System for the PL/0 Language

Author: Tao Shanwen
Information and Computer Science, Nanjing University of Aeronautics and Astronautics

Abstract: This article introduces the design and implementation of a PL/0 language lexical and syntax analysis system.
Keywords: cyclic branch, recursive descent, pipe, output redirection

In current compilation systems, the integrated development environment (IDE) and the compiler are implemented independently and communicate through a pipe. This system uses the same approach. First, here is the grammar of PL/0 used in this article:

BNF description of the PL/0 language (Extended Backus-Naur Form)

    <prog> → program <id>; <block>
    <block> → [<condecl>] [<vardecl>] [<proc>] <body>
    <condecl> → const <const> {, <const>}
    <const> → <id> := <integer>
    <vardecl> → var <id> {, <id>}
    <proc> → procedure <id> (<id> {, <id>}); <block> {; <proc>}
    <body> → begin <statement> {; <statement>} end
    <statement> → <id> := <exp>
                | if <lexp> then <statement> [else <statement>]
                | while <lexp> do <statement>
                | call <id> [(<exp> {, <exp>})]
                | <body>
                | read (<id> {, <id>})
                | write (<exp> {, <exp>})
    <lexp> → <exp> <lop> <exp> | odd <exp>
    <exp> → [+ | -] <term> {<aop> <term>}
    <term> → <factor> {<mop> <factor>}
    <factor> → <id> | <integer> | (<exp>)
    <lop> → = | <> | < | <= | > | >=
    <aop> → + | -
    <mop> → * | /
    <id> → l {l | d}    (note: l represents a letter, d a digit)
    <integer> → d {d}

Note:

    <prog>: program
    <block>: block, program body
    <condecl>: constant declaration
    <const>: constant
    <vardecl>: variable declaration
    <proc>: subprogram (procedure)
    <body>: compound statement
    <statement>: statement
    <exp>: expression
    <lexp>: condition
    <term>: term
    <factor>: factor
    <aop>: addition operator
    <mop>: multiplication operator
    <lop>: relational operator
    odd: tests the parity of an expression

Let's look at the design and implementation of the lexical and syntax analyzers. Lexical analysis is implemented with the cyclic branch method, and syntax analysis is implemented by recursive descent. Their flowcharts are as follows:

Next we implement the two analyzers. Both are implemented in a single class, CCompiler, which is defined as follows:

    // Compiler class
    class CCompiler
    {
    public:
        CCompiler();
        virtual ~CCompiler();
    public:
        void Compile(char *szFile);                 // compile; the public interface
        vector<SyntaxErr> GetSyntaxErr() { return m_vectorSyntaxErr; } // get the syntax errors
    protected:
        bool LexAnalysis(char *szStr);              // lexical analysis
        bool IsOprSym(char *szStr);                 // is it an operator?
        bool IsBndSym(char *szStr);                 // is it a delimiter?
        bool IsKeyWord(char *szStr);                // is it a keyword?
        bool IsInSymbolTab(char *szStr);            // is it already in the symbol table?
        char *JumpNoMatterChar(char *szStr);        // skip spaces, carriage returns, line feeds and tabs
        void OutSymbolTab(char *szFile);            // write the symbol table to a file
        void SyntaxAnalysis();                      // syntax analysis
        void SyntaxAnalysis_Prog();
        bool SyntaxAnalysis_Mop();
        bool SyntaxAnalysis_Integer();
        bool SyntaxAnalysis_Aop();
        bool SyntaxAnalysis_Lop();
        int SyntaxAnalysis_Id();
        int SyntaxAnalysis_Block();
        int SyntaxAnalysis_Body();
        int SyntaxAnalysis_Factor();
        int SyntaxAnalysis_Term();
        int SyntaxAnalysis_Lexp();
        int SyntaxAnalysis_Exp();
        int SyntaxAnalysis_Statement();
        int SyntaxAnalysis_Const();
        int SyntaxAnalysis_Proc();
        int SyntaxAnalysis_Vardecl();
        int SyntaxAnalysis_Condecl();
    protected:
        int m_iVecotrSymbolSize;                    // symbol table size
        int m_iCurPointer;                          // current pointer
        vector<LexPropertyVS> m_vectorSymbol;       // symbol table
        vector<SyntaxErr> m_vectorSyntaxErr;        // syntax error table
    };

The function bool LexAnalysis(char *szStr) performs lexical analysis on the input string szStr using the cyclic branch method. The recognized symbols are stored in the symbol table m_vectorSymbol, which is a vector. Once lexical analysis has produced the symbol table, the syntax analysis stage begins; it is carried out by the function void SyntaxAnalysis(). The following functions are the recursive subprograms corresponding to the nonterminals:

    bool SyntaxAnalysis_Mop();
    bool SyntaxAnalysis_Integer();
    bool SyntaxAnalysis_Aop();
    bool SyntaxAnalysis_Lop();
    int SyntaxAnalysis_Id();
    int SyntaxAnalysis_Block();
    int SyntaxAnalysis_Body();
    int SyntaxAnalysis_Factor();
    int SyntaxAnalysis_Term();
    int SyntaxAnalysis_Lexp();
    int SyntaxAnalysis_Exp();
    int SyntaxAnalysis_Statement();
    int SyntaxAnalysis_Const();
    int SyntaxAnalysis_Proc();
    int SyntaxAnalysis_Vardecl();
    int SyntaxAnalysis_Condecl();
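
As a rough illustration of the two techniques, the sketch below scans a string with a cyclic branch loop (one loop, one branch per class of leading character) and then checks it against the <exp>, <term> and <factor> rules by recursive descent, with one function per nonterminal. It is not the article's implementation: Kind, Token, scan, ExpParser and isExpression are hypothetical names, only single-character operators are handled, and keywords are not distinguished from identifiers.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token record; the article's own symbol-table entry type is not shown.
enum class Kind { Ident, Integer, Symbol };
struct Token { Kind kind; std::string text; };

// Cyclic branch scanner: one loop, one branch per class of leading character.
std::vector<Token> scan(const std::string& src) {
    std::vector<Token> table;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }        // skip blanks, tabs, line breaks
        std::size_t j = i + 1;
        if (std::isalpha(c)) {                         // <id> -> l {l | d}
            while (j < src.size() && std::isalnum(static_cast<unsigned char>(src[j]))) ++j;
            table.push_back({Kind::Ident, src.substr(i, j - i)});
        } else if (std::isdigit(c)) {                  // <integer> -> d {d}
            while (j < src.size() && std::isdigit(static_cast<unsigned char>(src[j]))) ++j;
            table.push_back({Kind::Integer, src.substr(i, j - i)});
        } else {                                       // single-character operator or delimiter
            table.push_back({Kind::Symbol, std::string(1, src[i])});
        }
        i = j;
    }
    return table;
}

// Recursive descent over the token table, one member function per nonterminal.
class ExpParser {
public:
    explicit ExpParser(std::vector<Token> t) : tok_(std::move(t)), cur_(0) {}
    // True when the whole table forms exactly one <exp>.
    bool parse() { return exp() && cur_ == tok_.size(); }
private:
    // Consumes the current token if it is the given symbol.
    bool accept(const char* sym) {
        if (cur_ < tok_.size() && tok_[cur_].kind == Kind::Symbol && tok_[cur_].text == sym) {
            ++cur_;
            return true;
        }
        return false;
    }
    bool exp() {                                       // <exp> -> [+|-] <term> {<aop> <term>}
        if (!accept("+")) accept("-");
        if (!term()) return false;
        while (accept("+") || accept("-"))
            if (!term()) return false;
        return true;
    }
    bool term() {                                      // <term> -> <factor> {<mop> <factor>}
        if (!factor()) return false;
        while (accept("*") || accept("/"))
            if (!factor()) return false;
        return true;
    }
    bool factor() {                                    // <factor> -> <id> | <integer> | (<exp>)
        if (cur_ < tok_.size() && tok_[cur_].kind != Kind::Symbol) { ++cur_; return true; }
        return accept("(") && exp() && accept(")");
    }
    std::vector<Token> tok_;
    std::size_t cur_;   // index of the current token in the table
};

// Convenience wrapper: scan, then parse.
bool isExpression(const std::string& src) { return ExpParser(scan(src)).parse(); }
```

For example, isExpression("-a + 2 * (b - 3)") accepts the input, while isExpression("a + * 2") rejects it because <term> cannot start with an operator. A full analyzer would add the remaining nonterminals in the same way and record an error instead of just returning false.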

That covers the core design and implementation of lexical and syntax analysis. Next I briefly describe the implementation of the IDE and the communication between the IDE and the analysis core. The IDE and the analysis core communicate through a pipe. The code is as follows:

    DWORD dwThreadId;
    ::CreateThread(0, 0, CompileThread, this, 0, &dwThreadId); // create a thread

Once the thread has been created, its thread function runs:

    // Thread function
    DWORD WINAPI CompileThread(LPVOID pParam)
    {
        CCompileSysView *pView = (CCompileSysView*)pParam;
        pView->GetCompileResult();
        return 0;
    }

The thread function calls the view class's GetCompileResult() function, which collects the output of the analysis core. This function is implemented as follows:

    void CCompileSysView::GetCompileResult()
    {
        SECURITY_ATTRIBUTES sa;
        HANDLE hRead, hWrite;
        CString strFile;
        CString strOut;

        strFile.Format("..\\PL\\pl.exe "); // path of the analysis core program
        // The current file is passed to the analysis core as an argument. The file
        // name may contain spaces, so it is enclosed in double quotation marks.
        strFile = strFile + (char)34 + m_szCurFile + (char)34;

        sa.nLength = sizeof(SECURITY_ATTRIBUTES);
        sa.lpSecurityDescriptor = NULL;
        sa.bInheritHandle = TRUE;
        if (!CreatePipe(&hRead, &hWrite, &sa, 0)) // create the pipe used for communication
        {
            MessageBox("Error on CreatePipe()");
            return;
        }

        STARTUPINFO si;
        PROCESS_INFORMATION pi;
        si.cb = sizeof(STARTUPINFO);
        GetStartupInfo(&si);
        si.hStdError = hWrite;
        si.hStdOutput = hWrite; // redirect the output to the pipe
        si.wShowWindow = SW_HIDE;
        si.dwFlags = STARTF_USESHOWWINDOW | STARTF_USESTDHANDLES;

        // create a process to start the analysis core program
        if (!CreateProcess(NULL, (LPSTR)(LPCTSTR)strFile, NULL, NULL, TRUE, 0, NULL, NULL, &si, &pi))
        {
            MessageBox("Error on CreateProcess()");
            return;
        }
        CloseHandle(hWrite);

        char buffer[4096] = {0};
        DWORD bytesRead;
        while (true)
        {
            if (!ReadFile(hRead, buffer, 4095, &bytesRead, NULL))
                break;
            strOut += buffer;
            m_pWndOutBar->SetColorRichEditText(strOut); // show the output
            Sleep(500);
        }
    }

This completes the design and implementation of the whole analysis system. Now let's see how the system runs. Its running interface looks like this:

After the program starts, set the path of the analysis program. The path is set through the menu item "IDE environment (I)", which displays the following dialog box:

Enter the path of the analyzer in the edit box (by default, the analyzer and the source file are in the same directory). When the settings are complete, type code into the code editing area or click "Open" to open a file, then click "Start" on the toolbar (or press the shortcut key F7) to run the analysis. Afterwards, the lexical analysis result is displayed in the "analysis result display area", and the lexical and syntax analysis messages are displayed in the "output information display area".

Known bugs
For lack of time, I have not been able to track down the following bugs myself. If anyone manages to debug them, please let me know.

  1. pl.exe leaks a large amount of memory. I tried to release the memory in the CCompiler destructor with the following code, but I do not know why it causes an error:
    CCompiler::~CCompiler()
    {
        // The following code releases the memory, but for some unknown reason it causes an error.
        // for (int i = 0; i < m_iVecotrSymbolSize; i++)
        //     delete m_vectorSymbol[i].szStr;
    }
  2. When pl.exe is tested with test3.pas from the test source files, there is no error in the debug build but an error occurs in the release build. I do not know why.
  3. When ide.exe is tested with test3.pas from the test source files, the output information bar shows some information that has already been displayed once more. I do not know why; my guess is that when reading from the pipe I read the earlier information again.
  4. Row and column display in the source code editing area: currently only the row is displayed; the column cannot be displayed.
  5. Workspace bar: right-clicking a node and choosing "Expand" sometimes does not have the expected effect. To get the expected effect, the node must first be clicked with the left button. The reason is that GetSelectedItem() only returns an item that has been selected with a left click. I do not know how to solve this.
  6. When the deepest child node of the workspace is double-clicked, the word corresponding to that node should be brought into the visible area.
  7. A piece of code in the function GetCompileResult() of the CIDEView class does not run in the release build, and causes an error in the debug build. The code is as follows:
    pDoc->SetPathName(strFile, 1);
    pDoc->SetModifiedFlag(0);
    pDoc->OnSaveDocument((LPSTR)(LPCSTR)strFile); // save the file first
    str = pDoc->GetTitle();
    pDoc->SetTitle(str);
    if (str.Right(1) == "*")
    {
        str = str.Left(str.GetLength() - 1);
        pDoc->SetTitle(str);
    }
    UpdateWindow();

    This code saves the file and removes the asterisk that marks the window title as unsaved before the analysis program is started.
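
One possible explanation for bug 3, offered as a guess rather than a confirmed diagnosis: the read loop in GetCompileResult() appends the buffer as a zero-terminated string, and the buffer is cleared only once, so a short read can leave the tail of an earlier, longer read in place, and that tail is then appended again. The sketch below reproduces the effect without the Windows API. FakePipe, drainAsCString and drainByCount are hypothetical names introduced only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a pipe: hands out queued chunks, one per read.
struct FakePipe {
    std::vector<std::string> chunks;
    std::size_t next = 0;

    // Copies the next chunk into buf (at most cap bytes) and reports its length.
    // Returns false once all chunks have been delivered.
    bool read(char* buf, std::size_t cap, std::size_t& bytesRead) {
        if (next >= chunks.size()) { bytesRead = 0; return false; }
        const std::string& c = chunks[next++];
        bytesRead = std::min(cap, c.size());
        std::memcpy(buf, c.data(), bytesRead);
        return true;
    }
};

// Same shape as the loop in the article: the buffer is zeroed once and then
// appended as a C string, so bytes left over from a longer read survive.
std::string drainAsCString(FakePipe pipe) {
    char buffer[64] = {0};
    std::size_t n = 0;
    std::string out;
    while (pipe.read(buffer, sizeof(buffer) - 1, n))
        out += buffer;
    return out;
}

// Appends exactly the number of bytes that each read delivered.
std::string drainByCount(FakePipe pipe) {
    char buffer[64];
    std::size_t n = 0;
    std::string out;
    while (pipe.read(buffer, sizeof(buffer) - 1, n))
        out.append(buffer, n);
    return out;
}
```

With the two chunks "line one\n" and "ok\n", drainAsCString returns the text with the stale tail "e one\n" repeated after "ok\n", while drainByCount returns exactly what was sent. In the Win32 loop, the corresponding change would be to append only bytesRead bytes after each successful ReadFile call.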

Contact information of the author

  • Http://home.pudn.com/ahei
  • Http://AIfan.54sc.com
  • QQ: 8261525 computer game QQ group: 5620663
  • Ahei080210114@hotmail.com
  • Ahei0802@126.com

Latest comments
Well, I have helped you solve the problem in your program :)
The reason is that you did not allow for the space occupied by '\0' when allocating memory for the strings. It is a basic mistake!!
Change every new char[xxx] in the program to new char[xxx + 1]. After that, the memory can be released without errors :)

Also, in the function bool CCompiler::LexAnalysis(char *szStr)
...
szTemp = new char[nLen + 1];
szTemp is never released!!!

With that fixed there is no more leak. (posted by wuhuaqiang at 13:44:00)

I debugged your code. The statements that release the memory must be added; you cannot leave the memory unreleased just because releasing it causes an error :)
The error occurs because the code has buffer overflows. Several arrays in the program are accessed out of bounds, which overwrites memory that does not belong to you. When the memory is released, the system checks for this, so of course an error is reported.
Your coding fundamentals and attention to detail need to improve. (posted by wuhuaqiang at 13:30:00)

I had a quick look through your code. The block that releases the memory should be changed to:
CCompiler::~CCompiler()
{
    // The following code releases the memory, but for some unknown reason it causes an error.
    // for (int i = 0; i < m_iVecotrSymbolSize; i++)
    //     delete [] m_vectorSymbol[i].szStr;
}
For simple types like this, it seems easier to allocate and release with malloc and free. (posted by wuhuaqiang at 13:05:00)
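
The points raised in these comments (one extra byte for the terminating '\0', and array deallocation for memory obtained with new[]) can be sketched as follows. copyLexeme, SymbolEntry and makeEntry are hypothetical names; the article's real symbol-table entry type is not shown, so this is only an illustration under that assumption. The second half uses std::string as an alternative that avoids manual release altogether.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Copies n characters starting at start into a fresh zero-terminated buffer.
// n + 1 bytes are allocated so that the terminating '\0' fits.
std::unique_ptr<char[]> copyLexeme(const char* start, std::size_t n) {
    std::unique_ptr<char[]> buf(new char[n + 1]);
    std::memcpy(buf.get(), start, n);
    buf[n] = '\0';
    return buf;  // unique_ptr<char[]> releases the array with delete[]
}

// Hypothetical symbol-table entry that owns its text through std::string,
// so no release loop is needed in a destructor.
struct SymbolEntry {
    std::string text;
    int kind;
};

// Builds an entry from the n characters starting at start.
SymbolEntry makeEntry(const char* start, std::size_t n, int kind) {
    return SymbolEntry{std::string(start, n), kind};
}
```

For instance, copyLexeme("begin end", 5) yields a buffer holding "begin", and a std::vector<SymbolEntry> filled through makeEntry cleans up after itself when it goes out of scope.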
