LALR Grammar Analyzer reads EBNF Notation

"Paul Mann" <paul@parsetec.com>
12 Apr 2001 02:41:46 -0400

          From comp.compilers


From: "Paul Mann" <paul@parsetec.com>
Newsgroups: comp.compilers
Date: 12 Apr 2001 02:41:46 -0400
Organization: ParseTEC
Keywords: parse, available
Posted-Date: 12 Apr 2001 02:41:46 EDT

I have released an LALR grammar analyzer which reads EBNF notation.


This is freeware available at:
www.parsetec.com/products.html


Useful features are:


1. It creates an HTML version of your grammar which
allows clicking on a nonterminal to go to its definition.
It has syntax coloring for nonterminals (blue),
keywords, operators, punctuators (black), and other
terminals (red).


2. It creates HTML conflict trace information with links
which you can click on to go to the offending rules
in #1 above.


3. It creates the LALR finite state machine in HTML.
If you click on a state in #2 above, it takes you to the
state in this file.


4. It includes an LALR grammar for ANSI-C, but I do
not know how complete it is.


5. It includes a program that translates YACC grammar
notation into a BNF format for use with this analyzer.


---------------------------------------------------------------------------


The EBNF notation is briefly described on the website
and more completely in the 36-page user's manual
provided in the download.


It is the most concise EBNF notation I have seen,
more concise than ISO EBNF.


For example: [x | y | z] /',' ...


indicates an optional comma-separated list of
x or y or z. All of the following would be valid:


/* nothing */
x
x, x, x
y, z, x, y, z, z, y, x


I would like to see a more concise notation for this
if anyone has one.
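
To make the example concrete, here is a minimal Python sketch of the language that
[x | y | z] /',' ... describes, going only by the description above (an optional,
comma-separated list of x, y or z). The function name matches_optional_list and the
plain-BNF expansion in the comments are illustrative assumptions, not taken from the
analyzer or its manual.

```python
# Assumed plain-BNF expansion of  [x | y | z] /',' ...  (illustrative only):
#
#   OptList -> /* empty */ | List
#   List    -> Item | List ',' Item
#   Item    -> x | y | z

ITEMS = {"x", "y", "z"}


def matches_optional_list(text: str) -> bool:
    """Return True if text is an optional comma-separated list of x, y or z."""
    text = text.strip()
    if not text:
        # The square brackets make the whole list optional.
        return True
    # Every comma-separated part must be exactly one of the allowed items,
    # so empty parts (leading, trailing or doubled commas) are rejected.
    parts = [part.strip() for part in text.split(",")]
    return all(part in ITEMS for part in parts)
```

All four sample inputs listed above are accepted by this check, while inputs such
as "x,," or "x y" are rejected.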


Other EBNF features are:


1. Automatically handling the typedef in C with
dynamic-terminal symbols (e.g. {typedef} ).


2. {keyword} symbol to indicate all the keywords
in a grammar. Allows easier experimentation
with grammars for PL/I, FORTRAN and Visual
Basic (i.e. the named-parameters in VB) which
allow keywords to be used as identifiers.


3. <error> symbol to provide a way to ignore
symbols that are not part of the language, like
ignoring JavaScript within HTML.
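
The typedef problem in item 1 can be illustrated with a small Python sketch. It
shows the general idea of a terminal whose membership changes during the parse: a
name introduced by typedef is classified as a type name from then on. This is the
classic symbol-table feedback approach, used here as a stand-in; how the analyzer's
{typedef} symbol actually works is not described in this post. The names classify,
KEYWORDS and the token kinds are hypothetical, and only the simple form
"typedef <type> <name>;" is handled.

```python
import re

# Identifiers/keywords, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\S")

# Tiny keyword set, just enough for the example.
KEYWORDS = {"typedef", "int", "char", "unsigned"}


def classify(source: str):
    """Return (kind, text) pairs; typedef'd names become TYPE_NAME afterwards."""
    type_names = set()     # grows as typedef declarations are seen
    in_typedef = False
    last_ident = None      # last identifier seen inside the current typedef
    out = []
    for tok in TOKEN_RE.findall(source):
        if tok in KEYWORDS:
            kind = "KEYWORD"
            if tok == "typedef":
                in_typedef = True
                last_ident = None
        elif tok[0].isalpha() or tok[0] == "_":
            kind = "TYPE_NAME" if tok in type_names else "IDENTIFIER"
            last_ident = tok
        else:
            kind = "OTHER"
            if tok == ";" and in_typedef:
                # End of the declaration: the last identifier is the new type.
                if last_ident is not None:
                    type_names.add(last_ident)
                in_typedef = False
        out.append((kind, tok))
    return out
```

For the input "typedef unsigned size; size n;" the first "size" is an ordinary
identifier, and the second one comes back as a TYPE_NAME.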


-----------------------------------------------------------------------


It will determine whether your grammar is ambiguous
or not -- within the LALR(1) context, that is, whether
it has LALR(1) conflicts. This practical approach is
the same one that YACC and many other tools take, and
it works quite well for many programming languages.
It handles the dangling else problem just fine
(see www.parsetec.com/grm/AnsiC_con.html).
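
For readers unfamiliar with the dangling else problem, the following Python sketch
shows the conventional resolution, in which an else binds to the nearest unmatched
if. YACC gets this result by default by preferring shift over reduce; whether the
analyzer resolves the conflict the same way is an assumption here. The sketch is a
tiny recursive-descent parser, not an LALR parser, chosen only because it makes the
same choice in a few lines. The function parse_stmt and the token format are
hypothetical.

```python
def parse_stmt(tokens, pos=0):
    """Parse one statement from a token list; return (tree, next_pos).

    Grammar (conditions and plain statements are single tokens):
        stmt -> 'if' cond stmt
              | 'if' cond stmt 'else' stmt
              | other
    """
    if pos < len(tokens) and tokens[pos] == "if":
        cond = tokens[pos + 1]
        then_branch, pos = parse_stmt(tokens, pos + 2)
        else_branch = None
        # Taking the 'else' as soon as it appears corresponds to "shift":
        # the innermost open 'if' claims it.
        if pos < len(tokens) and tokens[pos] == "else":
            else_branch, pos = parse_stmt(tokens, pos + 1)
        return ("if", cond, then_branch, else_branch), pos
    return tokens[pos], pos + 1
```

Parsing the tokens of "if a if b s1 else s2" with this function attaches the else
branch s2 to the inner "if b", leaving the outer "if a" without an else.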


If there is sufficient interest, I could have the product
output YACC style grammars.


It is only a grammar analyzer. However, a companion
product, the Educational Version of an LALR parser
generator, is also available as freeware on the website.


The parser generator includes C++ source code
for a compiler front end (without intermediate code
generation functions). It reads input, lexes, parses,
builds a symbol table, creates an abstract syntax
tree, and walks the tree for printing or
generating intermediate code.


Feedback is welcome.


Paul Mann
www.parsetec.com

