Re: Looking for a parser

Chris F Clark <cfc@world.std.com>
21 Sep 2000 18:10:58 -0400

          From comp.compilers

Related articles
Looking for a parser sergey.z@sapiens.com (Sergey Zamansky) (2000-09-08)
Re: Looking for a parser vadim-cbl@siber.com (Vadim Maslov) (2000-09-13)
Re: Looking for a parser adrian@sartre.cs.rhbnc.ac.uk (A Johnstone) (2000-09-15)
Re: Looking for a parser idbaxter@semdesigns.com (Ira D. Baxter) (2000-09-17)
Re: Looking for a parser cfc@world.std.com (Chris F Clark) (2000-09-21)
Re: Looking for a parser idbaxter@semdesigns.com (Ira D. Baxter) (2000-09-23)
| List of all articles for this month |

From: Chris F Clark <cfc@world.std.com>
Newsgroups: comp.compilers
Date: 21 Sep 2000 18:10:58 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 00-09-026 00-09-105 00-09-128
Keywords: parse, tools

To give this note a little context, I have corresponded off the group
with the original requestor. The reason for the requestor wanting
backtracking is that the grammar they have generates conflicts when
run through Visual Parse++, so they would like a tool to eliminate
those conflicts.


Ira Baxter wrote:
> The DMS Reengineering Toolkit doesn't do backtracking.
> Rather, it forks multiple parsers at points of local ambiguities,
> (GLR parsing), acheiving the same effect more efficiently.
> See http://www.semdesigns.com/Products/DMS/DMSToolkit.html.


GLR parsing is (at least theoretically) more efficient than
backtracking and is essentially equivalent to Earley parsing, which
means it can handle any context free grammar. Those are its strongest
points.


Its weakest point is that GLR parsing is based upon LR parsing tables.
The conflicts in the LR tables are essentially the points where the
multiple parsers are forked. The conflicts haven't been eliminated
merely used as guide points for forking. VP++ also does (or at least
did at one time) generate GLR parsers under its "Natural Language
Parsing" option.


In theory, a GLR parser would not need to report the conflicts to the
user as the resulting GLR parser will resolve (if possible) the
conflicts at run (parser execution) time by parsing the two distinct
possibilities in parallel. (I do not know whether VP++ suppresses the
conflict reports or not.)


One of the caveats is the "if possible". If the grammar is ambiguous,
the GLR parser will create a parse forest (rather than parse tree) for
ambiguous inputs. A related problem occurs in unconstrained
backtracking parsers--although for certain ambiguous grammars an
unconstrained backtracking parser may loop rather generating a forest,
which is even worse behaviour.


There is an alternative that solves the ambiguous grammar problem,
predicated parsers. A predicated parser also backtracks, but only on
constrained subsets of the grammar (that the grammar author
specifically marks). A predicated parser does not generate a parse
forest if the grammar is ambiguous. The predicates declare which of
the ambiguous parses is the prefered one and a predicated parser
always selects that one, generating the desired parse tree.


A predicated parser resolves the conflicts at compile (parser
generation) time, by generating a parser that will backtrack over the
conflicts in the correct order to generate the desired parse. If the
grammar contains ambiguities that are not resolved by predicates, the
parser generator will still produce conflicts. However, if a properly
written predicated generator does produce conflicts for the grammar,
the grammar is unambiguous, specifying exactly one parse for each
legal input.


It is possible to combine the two technologies and produce a
predicated GLR parser generator. However, to my knowledge, no one is
currently working on such a thing. If someone is interested in
building such a tool and wants a starting point, contact me and
perhaps we can work something out. I know the steps required to take
Yacc++ in that direction, but don't have the free time to do it. I
also have other projects which are both more and less ambitious that I
would be willing to collaborate on.


-Chris


*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
3 Proctor Street voice : (508) 435-5016
Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.