Re: GLR question

Joachim Durchholz <joachim_d@gmx.de>
24 Feb 2003 18:05:53 -0500

          From comp.compilers

Related articles
GLR question fgagnon@inouii.com (Francois Gagnon) (2003-02-13)
Re: GLR question haberg@math.su.se (2003-02-21)
Re: GLR question joachim_d@gmx.de (Joachim Durchholz) (2003-02-24)
| List of all articles for this month |

From: Joachim Durchholz <joachim_d@gmx.de>
Newsgroups: comp.compilers
Date: 24 Feb 2003 18:05:53 -0500
Organization: Compilers Central
References: 03-02-088
Keywords: parse, LR(1)
Posted-Date: 24 Feb 2003 18:05:53 EST

Francois Gagnon wrote:
> I just came across some comments about GLR grammar... I have heard
> of LL, LR, LARL and SLR grammars, but never GLR... Can anyone point me
> to a good book describing how GLR differs from the others?


Basically, GLR parsing is just like LR parsing. The parser generator
doesn't report conflicts, it keeps them in the tables, and if it hits
the conflict during parsing, it just keeps the two resulting trees and
continues on both of them (until one of them hits a syntax error, at
which point the tree is simply dropped).
For programming languages, a parse is successful if exactly one parse
tree results. Having no parse tree means a syntax error (not sure how
well the parser can pinpoint the point or error). Having multiple parse
trees means that the grammar is ambiguous, and that particular input
exhibited the ambiguity; depending on the language definition, this may
mean that the grammar is broken, or that the program is invalid, or that
all parses must be kept and processed further for (hopefully)
disambiguation downstream. (Keeping the ambiguities is what's desired in
natural-language processing, and I hear that GLR is quite common there.)


That's just a rough outline. Implementation of the general idea seems to
be mildly hairy, with inadvertent endless loops lurking in some corners
(I recall epsilon productions as being troublesome, there are also some
other issues). There's also the issue of how to organize the data
structures so that no nodes are created more than absolutely necessary
(one doesn't want to duplicate the entire parse tree whenever both
branches of an ambiguity must be followed in parallel).


The article that I read was: Elizabeth Scott, Adrian Johnstone, and
Shamsa Sadaf Hussain, Tomita-Style Generalised LR Parsers, Royal
Holloway, University of London, Department of Computer Science,
TR-00-12. Available online as
http://www.cs.rhul.ac.uk/research/languages/publications/tomita_style_1.ps
. It's a "we discuss the improvements of our approach over existing
ones" paper, so you'll read a lot about problems that others have
avoided for you... I know of no paper that just gives a tried-and-true
algorithm for GLR parsing (does anybody?).


HTH
Joachim


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.