Parsing techniques and toolkits
1. The Compiler Generator Coco/R
URL: http://www.ssw.uni-linz.ac.at/Coco
Hanspeter Mössenböck, Markus Löberbauer, Albrecht Wöß, University of Linz
Last update: Jan 12, 2010
Documentation | Coco/R for C#, Java, C++, F#, VB.NET, Oberon, other languages | Contributions | Cookbook | Tools | Mailing List | Bugzilla
Coco/R is a compiler generator, which takes an attributed grammar of a source language and generates a scanner and a parser for this language. The scanner works as a deterministic finite automaton. The parser uses recursive descent. LL(1) conflicts can be resolved by a multi-symbol lookahead or by semantic checks. Thus the class of accepted grammars is LL(k) for an arbitrary k.
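The idea of recursive descent with a multi-symbol lookahead can be illustrated with a small hand-written sketch. This is not code generated by Coco/R; the grammar, the `Parser` class, and the `evaluate` helper are all assumptions made for illustration. Both alternatives of `Statement` can start with an identifier, which is an LL(1) conflict; the sketch resolves it by looking at the second token as well.

```python
import re

TOKEN_RE = re.compile(r"(?P<num>\d+)|(?P<ident>[A-Za-z_]\w*)|(?P<sym>\S)")

def tokenize(text):
    """Return (kind, value) pairs; symbols use the symbol itself as kind."""
    tokens = []
    for m in TOKEN_RE.finditer(text):
        kind = m.lastgroup
        tokens.append((m.group() if kind == "sym" else kind, m.group()))
    tokens.append(("eof", ""))
    return tokens

class Parser:
    """Hypothetical grammar (one method per non-terminal):
         Statement = ident "=" Expr | Expr.
         Expr      = Term { ("+" | "-") Term }.
         Term      = Factor { "*" Factor }.
         Factor    = num | ident | "(" Expr ")".
    """
    def __init__(self, text, env=None):
        self.tokens = tokenize(text)
        self.pos = 0
        self.env = {} if env is None else env

    def peek(self, k=0):
        """Kind of the token k positions ahead, clamped to the eof marker."""
        return self.tokens[min(self.pos + k, len(self.tokens) - 1)][0]

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind!r}, found {self.peek()!r}")
        value = self.tokens[self.pos][1]
        self.pos = min(self.pos + 1, len(self.tokens) - 1)
        return value

    def statement(self):
        # LL(1) conflict: both alternatives may start with ident.
        # Resolved by a two-symbol lookahead: ident followed by "=".
        if self.peek() == "ident" and self.peek(1) == "=":
            name = self.expect("ident")
            self.expect("=")
            value = self.env[name] = self.expr()
        else:
            value = self.expr()
        self.expect("eof")
        return value

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.expect(self.peek())
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() == "*":
            self.expect("*")
            value *= self.factor()
        return value

    def factor(self):
        if self.peek() == "num":
            return int(self.expect("num"))
        if self.peek() == "ident":
            name = self.expect("ident")
            if name not in self.env:
                raise NameError(f"undefined variable {name!r}")
            return self.env[name]
        if self.peek() == "(":
            self.expect("(")
            value = self.expr()
            self.expect(")")
            return value
        raise SyntaxError(f"unexpected {self.peek()!r}")

def evaluate(source, env=None):
    """Parse and evaluate one statement; env holds variable values."""
    return Parser(source, env).statement()
```

The values computed while parsing play the role of the attributes in an attributed grammar: each parsing method returns the value of the construct it recognized.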
There are versions of Coco/R for different languages. The latest versions from the University of Linz are those for C#, Java and C++, which can be downloaded from the Coco/R site. An older (non-reentrant) version of Coco/R for C# and Java is also available.
2. ANTLR, ANother Tool for Language Recognition
URL: http://www.antlr.org/
What is ANTLR?
ANTLR, ANother Tool for Language Recognition, is a language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions in a variety of target languages. ANTLR provides excellent support for tree construction, tree walking, translation, error recovery, and error reporting. There are currently about 5,000 ANTLR source downloads a month.
3. The Lemon Parser Generator
URL: http://www.hwaci.com/sw/lemon/
The Lemon program is an LALR(1) parser generator. It takes a context-free grammar and converts it into a subroutine that will parse a file using that grammar.
Lemon is similar to the much more famous programs "YACC" and "bison". But Lemon is not compatible with either YACC or bison. There are several important differences:
- Lemon uses a different grammar syntax which is less prone to programming errors.
- The parser generated by Lemon is both re-entrant and thread-safe.
- Lemon includes the concept of a non-terminal destructor, which makes it much easier to write a parser that does not leak memory.
The complete source code to the Lemon parser generator is contained in two files. The file lemon.c is the parser generator program itself. A separate file lempar.c is the template for the parser subroutine that Lemon generates. Documentation on Lemon is also available.
4. Flex: The Fast Lexical Analyzer
Flex is a tool for generating scanners. A scanner, sometimes called a tokenizer, is a program which recognizes lexical patterns in text. The flex program reads user-specified input files, or its standard input if no file names are given, for a description of a scanner to generate. The description is in the form of pairs of regular expressions and C code, called rules. Flex generates a C source file named "lex.yy.c", which defines the function yylex(). The file "lex.yy.c" can be compiled and linked to produce an executable. When the executable is run, it analyzes its input for occurrences of text matching the regular expressions for each rule. Whenever it finds a match, it executes the corresponding C code.
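The rule mechanism described above (pairs of a regular expression and an action) can be imitated in a few lines of Python. This is only a sketch of the idea, not flex itself and not how the generated lex.yy.c works internally; `make_scanner`, `RULES`, and the token names are hypothetical. It assumes flex-like disambiguation: the longest match wins, and on a tie the rule listed first is chosen. Characters that match no rule are passed through as `("CHAR", c)` tokens, a stand-in for a default rule.

```python
import re

def make_scanner(rules):
    """rules: list of (regex_source, action); action(text) returns a token or None."""
    compiled = [(re.compile(pattern), action) for pattern, action in rules]

    def scan(text):
        tokens, pos = [], 0
        while pos < len(text):
            best_len, best_action = 0, None
            for regex, action in compiled:
                m = regex.match(text, pos)
                # Longest match wins; on a tie the earlier rule is kept.
                if m and m.end() - pos > best_len:
                    best_len, best_action = m.end() - pos, action
            if best_action is None:
                # No rule matched: pass the character through (assumed default).
                tokens.append(("CHAR", text[pos]))
                pos += 1
                continue
            token = best_action(text[pos:pos + best_len])
            if token is not None:      # actions may discard input, e.g. whitespace
                tokens.append(token)
            pos += best_len
        return tokens

    return scan

# Hypothetical rule set: each pattern is paired with the code to run on a match.
RULES = [
    (r"if|else|while", lambda s: ("KEYWORD", s)),
    (r"[A-Za-z_][A-Za-z0-9_]*", lambda s: ("IDENT", s)),
    (r"[0-9]+", lambda s: ("NUMBER", int(s))),
    (r"[ \t\n]+", lambda s: None),
]

scan = make_scanner(RULES)
```

With these rules, `scan("if ifx 42 ;")` classifies `if` as a keyword (tie, first rule wins) but `ifx` as an identifier (longer match wins).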
5. Yacc: Yet Another Compiler-Compiler
Computer program input generally has some structure; in fact, every computer program that does input can be thought of as defining an "input language" which it accepts. An input language may be as complex as a programming language, or as simple as a sequence of numbers. Unfortunately, usual input facilities are limited, difficult to use, and often are lax about checking their inputs for validity.
Yacc provides a general tool for describing the input to a computer program. The Yacc user specifies the structures of his input, together with code to be invoked as each such structure is recognized. Yacc turns such a specification into a subroutine that handles the input process; frequently, it is convenient and appropriate to have most of the flow of control in the user's application handled by this subroutine.
The input subroutine produced by Yacc calls a user-supplied routine to return the next basic input item. Thus, the user can specify his input in terms of individual input characters, or in terms of higher level constructs such as names and numbers. The user-supplied routine may also handle idiomatic features such as comment and continuation conventions, which typically defy easy grammatical specification.
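The division of labor described above can be sketched in Python. This is a hand-written stand-in, not what Yacc generates (Yacc emits a table-driven parser in C); the input format, `make_lexer`, `parse`, and the token names are assumptions made for illustration. The user-supplied routine deals with `#` comments and backslash-newline continuations, so the grammar only sees names, numbers, and line ends, and a user action runs each time a record is recognized.

```python
import re

TOKEN_RE = re.compile(
    r"(?P<NUMBER>\d+)|(?P<NAME>[A-Za-z_]\w*)|(?P<NEWLINE>\n)|(?P<OTHER>[^ \t])"
)

def make_lexer(text):
    """User-supplied routine: returns a function that yields the next input item."""
    # Comment and continuation conventions are handled here, outside the grammar.
    text = re.sub(r"#[^\n]*", "", text)    # '#' starts a comment (assumed convention)
    text = re.sub(r"\\\n", " ", text)      # backslash-newline continues a line
    matches = TOKEN_RE.finditer(text)

    def next_token():
        m = next(matches, None)
        return ("EOF", "") if m is None else (m.lastgroup, m.group())

    return next_token

def parse(next_token, on_record):
    """Input subroutine for the hypothetical grammar:
         input  = { record | NEWLINE } .
         record = NAME { NUMBER } (NEWLINE | EOF) .
    """
    tok = next_token()
    while tok[0] != "EOF":
        if tok[0] == "NEWLINE":            # blank line or end of previous record
            tok = next_token()
            continue
        if tok[0] != "NAME":
            raise SyntaxError(f"expected a name, found {tok[1]!r}")
        name, numbers = tok[1], []
        tok = next_token()
        while tok[0] == "NUMBER":
            numbers.append(int(tok[1]))
            tok = next_token()
        if tok[0] not in ("NEWLINE", "EOF"):
            raise SyntaxError(f"unexpected {tok[1]!r} in record {name!r}")
        on_record(name, numbers)           # user code invoked per recognized structure
```

A caller passes the lexer and an action, for example `parse(make_lexer(source), lambda name, values: print(name, values))`.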
The Lex & YACC page
6. Parsing Techniques - A Practical Guide
URL: http://www.cs.vu.nl/~dick/pt2ed.html
Dick Grune and Ceriel J.H. Jacobs
VU University Amsterdam, Amsterdam, The Netherlands
This is the new 662-page edition of Parsing Techniques - A Practical Guide. Like its predecessor, it treats parsing in its own right, in greater depth than is found in most computer science and linguistics books. It offers a clear, accessible, and thorough discussion of many different parsing techniques with their interrelations and applicabilities, including error recovery techniques. Unlike most books, it treats (almost) all parsing methods, not just the popular ones, as can be seen from its table of contents. Web site additions extend the number of pages to 801.
The new edition features: generalized deterministic parsers, non-canonical parsers, linear-time substring parsing, parsing as intersection, and parallel parsing, in addition to the expanded and updated text of the first edition. And there are hundreds of additional literature summaries!
...