Re: Backtracking parsers

Chris F Clark <cfc@world.std.com>
14 Jun 2000 12:33:36 -0400

          From comp.compilers

Related articles
Backtracking parsers michael.finney@acm.org (Michael Lee Finney) (2000-05-30)
Re: Backtracking parsers gneuner@dyn.com (2000-05-31)
Re: Backtracking parsers adrian@dcs.rhbnc.ac.uk (2000-06-01)
Re: Backtracking parsers michael.finney@acm.org (Michael Lee Finney) (2000-06-01)
Re: Backtracking parsers gneuner@dyn.com (2000-06-06)
Re: Backtracking parsers michael.finney@acm.org (Michael Lee Finney) (2000-06-09)
Re: Backtracking parsers michael.finney@acm.org (Michael Lee Finney) (2000-06-10)
Re: Backtracking parsers cfc@world.std.com (Chris F Clark) (2000-06-14)
Re: Backtracking parsers gneuner@dyn.com (2000-06-20)
Re: Backtracking parsers michael.finney@acm.org (Michael Lee Finney) (2000-06-20)
Re: Backtracking parsers michael.finney@acm.org (Michael Lee Finney) (2000-06-20)
Re: Backtracking parsers michael.finney@acm.org (Michael Lee Finney) (2000-06-30)
| List of all articles for this month |

From: Chris F Clark <cfc@world.std.com>
Newsgroups: comp.compilers
Date: 14 Jun 2000 12:33:36 -0400
Organization: Compilers Central
References: 00-05-111 00-05-119 00-06-015 00-06-032 00-06-037
Keywords: parse

Michael Lee Finney <michael.finney@acm.org> wrote a fascinating article on
inline substitution, in which he talked about backtracking in the following
case:


> p1:= .*! '*' '/'


I'm not quite certain how your ! operator works. More examples might
help enlighten me. It looks like a very useful regexp extension. (Is
it somehow related to the prolog "cut" operator or something from
snobol or icon? Both my prolog and snobol are so rusty I can't
remember anything but the most general nature of them.)


However, there is another approach to controlling backtracking that
semes to cover the same ground as your inline macro substitution. It
is syntactic predicates.


With predicates, the grammar author explicitly specifies which
portions of the grammar the backtracking is to apply to. Predicate
rules are written in two parts, a test and a production to apply if
the test passes. In syntactic predicates, the test is a pattern that
if matched the production is then applied.


With predicates, one can recast your p1 rule as follows:


p1a:= ('*' '/') >> '*' '/'
        | . p1a ;


The first alternative of the p1a rule is a predicate that says if the
lookahead is a '*' and then a '/' then apply the rule '*' '/'. The
second alternative has no predicate and applies when the first rule
fails. It consumes 1 character and then recursively applies p1a to
the remaining string.


The key idea in predicates is that the grammar author specifies the
backtracking directions. Each test in a predicate represents a
backtracking directive to the parser. The parser attempts to match
the predicate test before continuing the parse down that alternative
of the rule and if the test doesn't match the parser does not continue
down that alternative, but attempts to use the next alternative
(checking its predicate test if there is one). Whether the test is
matched or not, the parser always backs up the input to where the test
was started before continuing the parse.


By the way, the test can include references to other rules (and may
even be recursive). One predicate example I have uses that aspect to
parse a**i b**i c**i (three matching delimiters), which is a case
known not to be LR(k) for any k.


The expansion properties of your p2 example don't effect predicates. A
predicate is the same whether it is matched with a recursive rule or
with a regular expression or any other equivalent form. However, you
can get a grammar that works like your p2 case, by simply shortening
the predicate.


p1b:= ('*') >> '*' '/'
        | . p1b ;


In this example, only the '*' is matched before choosing the first
alternative.


Note, the predicates discussed above work identically in each
implementation I am aware of (PCCTS, ANTLR, and Yacc++). There are
slight syntactic differences (PCCTS uses ? as the operator and ANTLR
uses -> if I recall correctly), and in more complex cases there are
features that are unique to the different implementations (although
generally, any problem can be dealt with in the common subset that all
three implementations accept).


We have been experimentally making Yacc++ support inline expansion of
rules. However, they don't effect predicate semantics. A predicate
means the same whether the non-terminals referenced in it are inline
expanded or not. But, they may have a related effect on conflicts
(i.e. expanding a rule inline can cause which conflicts are reported
to change).


-Chris


*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
3 Proctor Street voice : (508) 435-5016
Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.