Re: How to get this to an lalr(1) grammar?

pjj@cs.man.ac.uk (Pete Jinks)
Sun, 28 Aug 1994 18:08:41 GMT

          From comp.compilers

Related articles
How to get this to an lalr(1) grammar? Mark_Tarrabain@mindlink.bc.ca (1994-08-22)
Re: How to get this to an lalr(1) grammar? jholland@world.std.com (1994-08-22)
Re: How to get this to an lalr(1) grammar? pjj@cs.man.ac.uk (1994-08-28)
How to get this to an lalr(1) grammar? mph@zdenka.demon.co.uk (1994-08-28)
Re: How to get this to an lalr(1) grammar? mickunas@mickunas.cs.uiuc.edu (1994-09-10)
Re: How to get this to an lalr(1) grammar? dekker@dutiag.twi.tudelft.nl (1994-09-14)
Re: How to get this to an lalr(1) grammar? mark@omnifest.uwm.edu (Mark Hopkins) (1994-10-09)
| List of all articles for this month |
Newsgroups: comp.compilers
From: pjj@cs.man.ac.uk (Pete Jinks)
Keywords: parse, LALR
Organization: Dept of Computer Science, University of Manchester, U.K.
References: 94-08-137
Date: Sun, 28 Aug 1994 18:08:41 GMT

Mark Tarrabain <Mark_Tarrabain@mindlink.bc.ca> wrote:
>I've been trying to figure out how to make a certain grammar
>lalr(1). Or at least create an equivalent lalr(1) grammar that parses the
>same language.


I am trying to make sure I understand this so I can lecture about it - any
suggestions very welcome. I've rewritten the grammar as yacc:


%token V A T D L N B C P
%%
s : c | s t c
c : V n p
a : A | C o
o : | A
t : T | D | a u
u : | T
n : | L b | q
q : N | q a N
b : | B q
p : | P N


>The language and grammar are very evidently LR(2). If my understanding is
>correct, it should be possible to simplify it to LR(1), perhaps even
>LALR(1).


The grammar has to be restructured, because an "a" can occur either:
1) after a "q" as part of a "q" (q = q a N), or
2) after a "q" which is at the end of a "c", at the front of a "t" after a
      recursion in the rule for "s" (s = c | s t c).


As I have no insight into what your grammar is really about (something to do
with parsing natural languages? - I would like to know so I can use it as an
example) I have done it the hard way:
1) rewrite the rule for "s" using right recursion (worrying, but I can't see
what else to do) so that the different "a"s are adjacent:
s : c | c t s
2) expand the rules so that the "q"s and "a"s are in the same grammar rule,
so that we are not having to make a premature decision about whether or not
the "a" is part of the "q":
s : c | ct s
c : c1 c2
ct: c1 t1 | c1 t2 | c2 t1
| V L B q a | V q a | V L B q a T | V q a T /* c2 t2 */
c1: V | V L | V P N | V L P N | V L B q P N | V q P N
c2: V L B q | V q
/* t : t1 | t2 */
t1: T | D
t2: a | a T
c2 is the part of c that ends in a "q", and t2 is the part of t that starts
in an "a". You may want to re-factorise some of the rules.


These rules don't have to be changed:
q : N | q a N
a : A | C o
o : | A


I don't know the details of the theorem that says any LR(n) grammar can be
written as LR(1), but I suspect it only says that it is possible, not that
it is easy or sensible etc. - it is particularly upsetting that actions such
as parse-tree building get smeared all over the grammar as you expend rules.


Does anyone understand how this theorem interacts with another one, that
says that any grammar that uses inherited attributes can be rewritten to
only need synthesized atributes? Will it still be LR(1) or LALR(1)?
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.