Related articles |
---|
Backtracking parsers michael.finney@acm.org (Michael Lee Finney) (2000-05-30) |
Re: Backtracking parsers gneuner@dyn.com (2000-05-31) |
Re: Backtracking parsers adrian@dcs.rhbnc.ac.uk (2000-06-01) |
Re: Backtracking parsers michael.finney@acm.org (Michael Lee Finney) (2000-06-01) |
Re: Backtracking parsers gneuner@dyn.com (2000-06-06) |
Re: Backtracking parsers michael.finney@acm.org (Michael Lee Finney) (2000-06-09) |
Re: Backtracking parsers michael.finney@acm.org (Michael Lee Finney) (2000-06-10) |
Re: Backtracking parsers cfc@world.std.com (Chris F Clark) (2000-06-14) |
Re: Backtracking parsers gneuner@dyn.com (2000-06-20) |
Re: Backtracking parsers michael.finney@acm.org (Michael Lee Finney) (2000-06-20) |
[2 later articles] |
From: | Michael Lee Finney <michael.finney@acm.org> |
Newsgroups: | comp.compilers |
Date: | 1 Jun 2000 18:05:50 -0400 |
Organization: | Compilers Central |
References: | 00-05-111 00-05-119 |
Keywords: | parse |
gneuner@dyn.com says...
> > prod1 := (prod2 | prod3) prod4
> > prod2 := ...
> > prod3 := ...
> > prod4 := ...
> >If you are parsing prod1, have recognized prod2 and then fail to
> >recognize prod4, the usual method of applying backtracking would be to
> >discard the prod2 match and attempt to match prod3 followed by prod4.
> >As far as I know this is the way that backtracking is always done.
> >However, he suggested that what should happen is that, supposing prod3
> >contained alternatives, backtracking would occur at the bottom most,
> >right portion of prod3 with successive backtracks proceeding up the
> >parse tree for prod3.
> No insight, but an observation.
> If prod2 is itself a prefix to one of prod3's alternatives, it would
> make some sense to avoid reevaluating it when possible, but your
> example didn't specify this.
> Aside from the prefix case, both branches of
> prod1 := (prod2 prod4) | (prod3 prod4)
> would have to have been [at least] partially evaluated so that there
> is some prod3 state to be continued
Quite true. However, that is primary an efficiency issue. The issue
is that the two different methods can result in the recognition of
different languages.
The usual method appears to be based on the assumption that the
grammar defines a context free language where backtracking by entirely
discarding prod2 is exactly appropriate. The alternate approach would
be context sensitive, but not in the usual sense according to the
language classifications. Even for context sensitive grammars,
backtracking can be done in the usual manner, the only difference is
how productions are selected for replacement. There doesn't seem to
be much about the most general form of grammar which has no
restrictions on rewriting productions. Perhaps somebody knows of a
good reference that breaks that down a bit (outside of the usual
transformational grammars, attributed grammars, etc.)?
A different way of looking at it is the difference between open and
closed productions. The usual method would result from assuming that
each production is completely independent and completely encapsulated.
The alternate method would result from assuming that (at least some)
productions are macros or inline. That is actually a fairly
reasonable thing to do when you want to break a production up because
of its length, but then it could fail to recognize the grammar if you
break it up using the usual, closed productions.
Return to the
comp.compilers page.
Search the
comp.compilers archives again.