Re: Examples to illustrate problem

Chris F Clark <cfc@world.std.com>
5 Sep 1998 01:13:35 -0400

          From comp.compilers

Related articles
Q: writing YACC rules Bert.Aerts@esat.kuleuven.ac.be (Bert Aerts) (1998-08-24)
Examples to illustrate problem Bert.Aerts@esat.kuleuven.ac.be (Bert Aerts) (1998-08-31)
Re: Examples to illustrate problem cfc@world.std.com (Chris F Clark) (1998-09-05)
| List of all articles for this month |

From: Chris F Clark <cfc@world.std.com>
Newsgroups: comp.compilers
Date: 5 Sep 1998 01:13:35 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 98-08-178 98-08-215
Keywords: yacc, parse

Bert Aerts wrote:
> Before I give the example, thanks all for your help - unfortunately,
> at this point it's been of little use to me: you all suggest I would
> change my grammar
                        ^^^^^^^
Hopefully, this is just a confusion in terminology. You do not need
to change the language (which is what I think you mean by grammar)
only the rules which implement the language (i.e. the input to yacc
which describes that language), which is what we mean by grammar.


If you really can't change your (colleague's) yacc rules (at all),
then life is going to be very difficult. If your restriction is that
you must accept the language your colleague has defined, but the yacc
rules which implement it can be tweaked, then changing the grammar
(the rules you feed to yacc to describe that language) is one
solution. It looks like you did that below.


> Now, this is an example. I've worked my way around this one, so there's
> no need for you to give me answers for this one: I've merely shown it to
> you to illustrate the limitations of LALR I'm faced with. And there's
> nothing I can change about the grammar: so my hopes rest on a way to
> construct grammar rules.


In any case, from the grammar snippet you enclosed, it looks like your
colleague has defined a language which requires more than one token of
lookahead to resolve. (By the way, the fact that the grammar needed
more than one token of lookahead should have forced yacc to give a
conflict message. That's what a conflict message is trying to tell
you, that yacc cannot construct a parser for this grammar given its 1
token of lookahead implementation.) There are several ways of
attacking this kind of problem.


1) Get a parser generator which has more than 1 token of lookahead.


2) Use syntactic predicates (or an equivalent feature) to get past
lookahead problems.


3) Relax your grammar so that it accepts some "illegal" programs that
you then filter out semantically. I suspect this is what you did
above. Actually you may not have to allow illegal programs, but you
may not be able to do a precise parse of the language (e.g. dis-
tinguishing which identifiers are really labels) until you have parsed
more context.


4) Do your multiple token lookahead in the lexer and create special
tokens which can guide your parser.


Presuming you want to (or must) continue using yacc (and that your
colleague's language really needs more than one token of lookahead),
choices 3 & 4 are your options. As far as I know there is no
cannonical collection of rules-of-thumb and hacks that implement
choices 3 & 4. However, questions on how to resolve specific grammar
problems have come up in the past and you could search back articles
for tricks and tips. I cannot promise that that search will be easy
or fruitful, as there are a lot of comp.compilers articles. {Okay,
I'm curious John, how many do you have archived?}


To use solution 1 or 2, you have to find a parser generator with those
features, but there are yacc-derivatives or yacc-like parsers in both
categories. BTW, neither 1 nor 2 is a complete panacea, as you still
may have problems that need to be resolved "the hard way", but you may
have fewer of them.


----------------------------------------------------------------------


Another interpretation of your original fail-safe question is that you
want something to browbeat your colleague with so that your colleague
will give you a reasonable yacc grammar to start with. In particular,
you want a set of rules that can guarantee that a grammar will not
have conflicts (aside from the run-it-through-yacc-and-find-out rule).


The answer to that question is yes-and-no. There are rules which can
guarantee that your grammar will not have conflicts. However, the
rules prevent your language from being able to express certain
constructs. For example, one set of rules would not allow you to
define a language where identifiers and labels can be intermixed the
way that your colleague's language does now. Without restricting the
expressibility of your language, there are no rules which guarantee
that the language is [LA]LR(1) except for sets of rules which are
equivalent to run it through yacc and if it gets no conflicts it is
okay. There are even proofs showing that such rules cannot exist.


Hope this helps,
-Chris


*****************************************************************************
Chris Clark Internet : cfc@world.std.com
Compiler Resources, Inc. CompuServe : 74252,1375
3 Proctor Street voice : (508) 435-5016
Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)
------------------------------------------------------------------------------
Web Site in Progress: Web Site : http://world.std.com/~compres
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.