|Are lalr parse-tables viable? email@example.com (craig) (2004-06-11)|
|Re: Are lalr parse-tables viable? firstname.lastname@example.org (2004-06-12)|
|Re: Are lalr parse-tables viable? email@example.com (A Johnstone) (2004-06-13)|
|Re: Are lalr parse-tables viable? cfc@shell01.TheWorld.com (Chris F Clark) (2004-06-13)|
|Re: Are lalr parse-tables viable? firstname.lastname@example.org (craig) (2004-06-21)|
|Re: Are lalr parse-tables viable? email@example.com (craig) (2004-06-25)|
|Re: Are lalr parse-tables viable? cfc@shell01.TheWorld.com (Chris F Clark) (2004-06-28)|
|From:||firstname.lastname@example.org (Torben Ęgidius Mogensen)|
|Date:||12 Jun 2004 16:18:15 -0400|
|Organization:||Department of Computer Science, University of Copenhagen|
|Posted-Date:||12 Jun 2004 16:18:15 EDT|
"craig" <email@example.com> writes:
> I am building a interpretor for a Basic-type language, and require it be
> able to load one of a number of "dialects" of the language. (Eg. English,
> German, Spanish, etc.)
> I have decided to seperate the parse-table "data" (ie. states, transitions,
> interpretor-code, etc.) for each dialect from the actual parser/interpretor
> itself - but when I generate the states and transition data from my EBNF
> (using a homebrew analyser) it finds an enormous number of states and
> Like in the order of 327 unique states and 16258 transitions between
> these. (It is a little less if I use the non-LR-free version, but then I get
> loads of reduce-reduce conflicts).
> I find this alarmingly huge!
This sounds about right for a complete language. The fairly small
languages I have used in my compiler class in the last few last years
have on average around 90 states in the parse tables.
LR parse tables do tend to be rather large, so some parser generators
use various tricks to compress the table -- the number of states and
transitions are not changed, but the data structure used to represent
them is smaller than the obvious two-dimensional table will be. I
don't know which (if any) methods are used for the parser generator
you use, though.
There are a couple of things you can do yourself to reduce the number
of states and transitions:
- Use operator precedence declarations to resolve ambiguities
(instead of rewriting the grammar) whenever this is possible.
- If you have several operators with the same precedence, combine
these to a single lexer symbol that has the actual operator
identity as an attribute (similar to what you do with variables and
Also, if the different dialects of your language only differ in the
names of keywords, the allowed letters in variable names and similar
lexical details, you need only change the lexer data for each dialect
while sharing a single parse table.
Return to the
Search the comp.compilers archives again.