C Language Lexical Analyzer

Source: Internet
Author: User

Overview

Lexical analysis is the first phase of compilation. It reads the source program from left to right, one character at a time, scanning the character stream that makes up the program and recognizing words (also called tokens or symbols) according to the language's word-formation rules. The program that performs this task is the lexical analyzer. Lexical analyzers can also be generated automatically with tools such as Lex.
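The left-to-right scan described above can be sketched in a few lines of Java. This is a minimal illustration, not the project's Analyzer class: the class name ScanSketch, the "KIND:text" labels, and the short keyword list are all assumptions made for the example, and every unrecognized character is simply treated as a one-character operator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; ScanSketch and its token labels are hypothetical names.
class ScanSketch {
    // A small subset of C keywords, enough for the illustration.
    static final Set<String> KEYWORDS = Set.of("int", "char", "if", "else", "while", "return");

    static boolean isIdentStart(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    }

    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Scans the input left to right and returns tokens formatted as "KIND:text".
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace only separates tokens
            } else if (isIdentStart(c)) {
                int start = i;
                while (i < src.length() && (isIdentStart(src.charAt(i)) || isDigit(src.charAt(i)))) {
                    i++;
                }
                String word = src.substring(start, i);
                tokens.add((KEYWORDS.contains(word) ? "KEYWORD:" : "IDENTIFIER:") + word);
            } else if (isDigit(c)) {
                int start = i;
                while (i < src.length() && isDigit(src.charAt(i))) {
                    i++;
                }
                tokens.add("INTEGER:" + src.substring(start, i));
            } else if (";,(){}".indexOf(c) >= 0) {
                tokens.add("SEPARATOR:" + c);
                i++;
            } else {
                // Simplification: anything else counts as a one-character operator.
                tokens.add("OPERATOR:" + c);
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [KEYWORD:int, IDENTIFIER:a, OPERATOR:=, INTEGER:10, SEPARATOR:;]
        System.out.println(tokenize("int a = 10;"));
    }
}
```

Each branch consumes the longest run of characters that fits one word-formation rule before moving on, which is why "10" comes out as a single integer rather than two digits.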

This project implements a simple C language lexical analyzer. It is hosted on git.oschina.net; home page: http://git.oschina.net/kinegratii/Lexer

Project Features
    • Supports decimal numbers, octal numbers, identifiers, keywords, operators, separators, and other kinds of lexemes

    • Supports two input methods: importing a file or typing code directly

    • The algorithm and the UI are loosely coupled through a dedicated interface
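One common way to get this kind of loose coupling is a callback interface: the analysis code reports results through the interface and never touches UI classes. The sketch below shows the idea under that assumption. All names (CouplingSketch, TokenListener, WordSplitter, CollectingListener) are hypothetical, and the "analysis" is deliberately trivial; the project's real interface in Analyzer.java may look different.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these names come from the project.
class CouplingSketch {
    // Hypothetical callback: the only thing the analysis code knows about its caller.
    interface TokenListener {
        void onToken(String kind, String text);
        void onFinished(int tokenCount);
    }

    // Analysis side: no dependency on any UI toolkit.
    static class WordSplitter {
        private final TokenListener listener;

        WordSplitter(TokenListener listener) {
            this.listener = listener;
        }

        // Deliberately trivial "analysis": splits on whitespace and reports each word.
        void analyze(String source) {
            int count = 0;
            for (String word : source.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // empty input produces one empty string; skip it
                }
                String kind = Character.isDigit(word.charAt(0)) ? "NUMBER" : "WORD";
                listener.onToken(kind, word);
                count++;
            }
            listener.onFinished(count);
        }
    }

    // Stand-in for a UI: it only collects display lines. A Swing frame could
    // implement the same interface and append to a text area instead.
    static class CollectingListener implements TokenListener {
        final List<String> lines = new ArrayList<>();

        @Override
        public void onToken(String kind, String text) {
            lines.add(kind + " " + text);
        }

        @Override
        public void onFinished(int tokenCount) {
            lines.add("done: " + tokenCount);
        }
    }

    public static void main(String[] args) {
        CollectingListener ui = new CollectingListener();
        new WordSplitter(ui).analyze("x 42 y");
        System.out.println(ui.lines); // prints [WORD x, NUMBER 42, WORD y, done: 3]
    }
}
```

Because the splitter depends only on the interface, it can be exercised without any window, and the UI can be replaced without touching the analysis code.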

Project structure
lexer
 -- com.kinegratii.lexer    Main package
      |--- Analyzer.java        Analyzer and its callback interface
      |--- Lexer.java           Project startup class
      |--- MainFrame.java       UI class
      |--- SoftwareInfo.java    Software information constants
 -- com.kinegratii.token    Tokens
      |--- DoubleToken.java        Floating-point numbers
      |--- DotToken.java           Separators
      |--- IdentifiterToken.java   Identifiers
      |--- IntegerToken.java       Integers
      |--- ReservedToken.java      Keywords
 -- com.kinegatii.utils     Utilities
      |--- BareBonesBrowserLaunch.java  Opens a page in the browser
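The token package has one class per kind of token. As a rough idea of what such a hierarchy can look like, here is a minimal sketch. The names TokenSketch, Token, IntToken, and KeywordToken are hypothetical and are not the project's classes. The integer class relies on the C rule that a literal with a leading 0 followed by more digits is octal; hexadecimal literals are not handled, since the feature list mentions only decimal and octal.

```java
// Illustrative sketch only; these are not the project's token classes.
class TokenSketch {
    // Hypothetical base type: every token keeps the text it was read from.
    abstract static class Token {
        final String text;

        Token(String text) {
            this.text = text;
        }

        abstract String kind();

        @Override
        public String toString() {
            return kind() + "(" + text + ")";
        }
    }

    static final class IntToken extends Token {
        final int value;

        IntToken(String text) {
            super(text);
            // In C, a leading 0 followed by more digits marks an octal literal.
            boolean octal = text.length() > 1 && text.startsWith("0");
            // Throws NumberFormatException for invalid input such as "08".
            this.value = Integer.parseInt(text, octal ? 8 : 10);
        }

        @Override
        String kind() {
            return "INTEGER";
        }
    }

    static final class KeywordToken extends Token {
        KeywordToken(String text) {
            super(text);
        }

        @Override
        String kind() {
            return "KEYWORD";
        }
    }

    public static void main(String[] args) {
        System.out.println(new IntToken("017").value); // prints 15
        System.out.println(new IntToken("17").value);  // prints 17
        System.out.println(new KeywordToken("int"));   // prints KEYWORD(int)
    }
}
```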

Project

Lexical unit sequences

Symbol table

Project development

This project was a practical assignment for a compilers course. The initial code was completed in April 2012 and then went through a few minor changes, up to version 1.2.4 (version numbering at that time was rather casual). Version 1.3.0 is the first overall refactoring.

v1.3.0 2014-09-24

    • Refactored the entire project: packages and classes are now divided by responsibility, and the algorithm and the UI are loosely coupled

    • Removed the language transformation part

The so-called language transformation part consisted of custom rules added to the analyzer, for example that consecutive underscores in an identifier are collapsed into one, so "a__b" becomes "a_b". Such rules are not part of a standard analyzer. They existed to discourage wholesale copying: the transformation rules differed every year, so even someone adapting a previous year's code had to become familiar with the entire project in order to change them.
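The underscore rule used as an example here is simple enough to sketch in one line of Java. This is only an illustration of the rule itself, using String.replaceAll with a regular expression; the class and method names are hypothetical, and the removed project code may have implemented it differently.

```java
// Illustrative sketch only; the names are hypothetical.
class UnderscoreSketch {
    // Collapses every run of two or more underscores into a single underscore.
    static String collapseUnderscores(String identifier) {
        return identifier.replaceAll("_{2,}", "_");
    }

    public static void main(String[] args) {
        System.out.println(collapseUnderscores("a__b"));     // prints a_b
        System.out.println(collapseUnderscores("a___b__c")); // prints a_b_c
    }
}
```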

Follow-up plan
    • Support configuring the analyzer so that a custom language transformation can be plugged in, primarily at the code level. The basic requirement is a configurable, general-purpose interface.

    • The current processing callback interface is still fairly simple; it could expose more of the analyzer's data.
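One possible shape for a configurable, code-level transformation interface is a chain of rules applied to each lexeme in order. The sketch below is speculative: the plan does not specify a design, and the class name RuleChainSketch, the addRule method, and the lowercase rule in the usage example are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Illustrative sketch only; this is not a design taken from the project.
class RuleChainSketch {
    private final List<UnaryOperator<String>> rules = new ArrayList<>();

    // Registers a rule; rules run in the order they were added.
    RuleChainSketch addRule(UnaryOperator<String> rule) {
        rules.add(rule);
        return this;
    }

    // Applies every registered rule to the lexeme. With no rules, the input is returned unchanged.
    String apply(String lexeme) {
        String result = lexeme;
        for (UnaryOperator<String> rule : rules) {
            result = rule.apply(result);
        }
        return result;
    }

    public static void main(String[] args) {
        RuleChainSketch chain = new RuleChainSketch()
                .addRule(s -> s.replaceAll("_{2,}", "_")) // collapse repeated underscores
                .addRule(String::toLowerCase);            // made-up second rule
        System.out.println(chain.apply("A__B")); // prints a_b
    }
}
```

With this shape, changing the yearly rules would mean changing only the list of registered rules, not the scanning code.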
