Related articles |
---|
Q: writing YACC rules Bert.Aerts@esat.kuleuven.ac.be (Bert Aerts) (1998-08-24) |
Examples to illustrate problem Bert.Aerts@esat.kuleuven.ac.be (Bert Aerts) (1998-08-31) |
Re: Examples to illustrate problem cfc@world.std.com (Chris F Clark) (1998-09-05) |
From: | Bert Aerts <Bert.Aerts@esat.kuleuven.ac.be> |
Newsgroups: | comp.compilers |
Date: | 31 Aug 1998 12:20:41 -0400 |
Organization: | KUL |
References: | 98-08-178 |
Keywords: | yacc, parse, comment |
Before I give the example, thanks to all of you for your help. Unfortunately,
so far it has been of little use to me: you all suggest that I
change my grammar, but that is something I can't control. The grammar
was given to me by a colleague I'm working for, and he is not eager to
change it for the sake of a tool. For him it is a matter of principle, and I am
stuck between the principle and the tool.
OK, here's an example.
My parser must be able to recognize lists of variables and labels. When
a line with variables is detected, all variable names are pushed
onto a list; when a label is found, its name is copied into a label
variable that will later serve as the key in a map.
A list of variables looks like:
" a, b, c : model_test ( argument1 ) ;"
but it can also be:
" a : model_test ( argument1 ) ; "
A label stands at the beginning of a mathematical expression and is used
to identify operations:
"label : a = b + c ;"
The YACC rules are:
var_list: TEXT
{ varVector.push ( $1 ) ; }
| TEXT COMMA var_list
{ varVector.push ( $1 ) ; }
| error SEMICOLON { yyerror ("Parse error in rule 'var_list' ! ") ; }
;
declaration: var_list COLON modelName SEMICOLON
{
...
}
label : TEXT COLON
{ labelVar = $1 ; }
| error NEWLINE { yyerror ("Parse error in rule 'label' !") ; }
;
full_expr : label math_expr
{
...
}
...
The problem that shows up here is that if I enter a full expression
(e.g. "lab : a = b + c ; "), the rule var_list gets reduced first, and
then I get a parse error in rule "declaration", because the parser did not
expect an equal sign. Indeed, both rules require the same sequence of
tokens up to the fourth token (the difference between the rules: one
requires an opening bracket there, the other an equal sign).
By then the parser has gone too far to back up. It would have had to look
two tokens further ahead to decide.
Now, this is an example. I've worked my way around this one, so there's
no need for you to give me answers for this one: I've merely shown it to
you to illustrate the limitations of LALR I'm faced with. And there's
nothing I can change about the grammar: so my hopes rest on a way to
construct grammar rules.
Bert
--
mail : Bert Aerts
Katholieke Universiteit Leuven
ESAT-MICAS-group room 91.21
Kardinaal Mercierlaan 94
B-3000 Leuven, Belgium
phone : 0032-(0)16-32 10 76 - fax : 0032-(0)16-32 19 75
E-mail : Bert.Aerts@esat.kuleuven.ac.be
URL : http://www.esat.kuleuven.ac.be/~aerts
[My usual approach is to make the parser parse more than the actual
language, then throw out the wrong stuff semantically. In this case,
I'd have a list_of_names rule for the stuff before the colon, then
decide whether it's a label or a var_list later. LALR parsers do a swell
job on languages they can parse, but if you need unlimited lookahead,
you have to cheat. -John]
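The moderator's suggestion could be sketched in yacc roughly as follows. This is only an illustration under assumptions, not the poster's actual grammar: the rule names name_list and statement and the variables nameCount and firstName are hypothetical, the elided actions stay elided, and it presumes that the '(' or '=' following the first TEXT after the colon is enough to tell modelName and math_expr apart with one token of lookahead.

```yacc
/* Sketch only: name_list, statement, nameCount and firstName are
   hypothetical names, not part of the original grammar. */
name_list : TEXT
              { varVector.push ( $1 ) ; nameCount = 1 ; firstName = $1 ; }
          | TEXT COMMA name_list
              { varVector.push ( $1 ) ; nameCount++ ; firstName = $1 ; }
          ;

statement : name_list COLON modelName SEMICOLON
              {
                /* declaration: the names are already in varVector */
                ...
              }
          | name_list COLON math_expr
              {
                /* label: the grammar accepts a whole list here, so
                   reject it semantically if more than one name was given.
                   The name was also pushed onto varVector and would have
                   to be removed again; how depends on varVector's
                   interface, which the original post does not show. */
                if ( nameCount != 1 )
                    yyerror ( "A label must be a single name !" ) ;
                else
                    labelVar = firstName ;
                ...
              }
          ;
```

Because both alternatives start with the same name_list, the parser no longer has to choose between var_list and label when it sees the colon. The price is that labelVar is only set once math_expr has been parsed, so any action inside math_expr that needs the label would have to read firstName instead.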