Mon, 31 Oct 1994 22:32:43 GMT

Related articles |
---|

Q: Recursive Descent w/Backtracking SCHMIDTG@iccgcc.cs.hh.ab.com (1994-10-21) |

Re: Recursive descent parsing / backtracking parrt@everest.ee.umn.edu (Terence Parr) (1994-10-31) |

Newsgroups: | comp.compilers |

From: | Terence Parr <parrt@everest.ee.umn.edu> |

Keywords: | parse, LL(1) |

Organization: | Compilers Central |

References: | 94-10-151 |

Date: | Mon, 31 Oct 1994 22:32:43 GMT |

Greg Schmid writes about backtracking:

*> S -> cAd*

*> A -> a | ab*

*>*

*> It would be able to correctly parse the sentences of this language.*

*>*

*> One pitfall with RDB is that if not implemented properly, the order of*

*> productions can affect whether or not a solution is found. For example,*

*> consider the input string w = cabd with the above grammar. A naive*

*> implementation might fail as follows:*

*>*

*> S -> cAd -> cad (REJECT)*

*> [...]*

*> This type of backtracking over a tree is difficult to implement using*

*> recursive precedures to represent each production, since if we had a*

*> procedure called 'A' which attempts to match either 'a' or 'ab', we*

*> would have to remember that we called 'A' and then call 'A' again*

*> telling it this time to ignore the first alternative.*

Greg, don't worry about this contrived case. Furthermore, backtracking is

needed in only a few cases and mainly for nasty languages like C++. Unless

you have a totally wacko language to parse, don't use a purely backtracking

parser as it has potentially exponential time complexity. If you do want a

purely backtracking parser, there are boatloads out there and you could

write a simple one in a day or two. You could also try a Tomita parser that

makes forests of bottom-up parse trees and then picks the "best" one.

Alternatively, let me make a shameless plug for ANTLR (the parser generator

in PCCTS). ANTLR would handle this one without backtracking because it's

simply LL(2). That is, in rule A, the parser would see input "ab" or "ad"

rather than just input "a"--which would allow it to correctly

(deterministically) predict A->a or A->ab. ANTLR will do backtracking also

if you want, however, it still won't solve your specific example without a

grammar rewrite. Even if we restrict ANTLR to LL(1), it could still solve

your example with a "syntactic predicate" and a rule instantiation:

s : C

( (A D)?

| A B D

)

;

The (...)? is a predicate that indicates you are not sure if the enclosed

construct will match. If it does not, simply try the next viable

alternative. The (A D|A B D) is a simple EBNF subrule (rule without a

label). This selective backtracking is extremely useful in practice; just

ask the NeXT folks doing the combined C/Objective-C/C++ parser with ANTLR.

The PCCTS ftp site is everest.ee.umn.edu in pub/pccts. Our newsgroup

is comp.compilers.tools.pccts.

Best regards,

Terence Parr

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.