RE: Packrat parsers and memoization

"Quinn Tyler Jackson" <quinn_jackson2004@yahoo.ca>
Mon, 8 Jun 2009 07:36:45 -0700

          From comp.compilers

Related articles
Packrat parsers and memoization kay@fiber-space.de (Kay Schluehr) (2009-06-07)
RE: Packrat parsers and memoization quinn_jackson2004@yahoo.ca (Quinn Tyler Jackson) (2009-06-08)
RE: Packrat parsers and memoization quinn_jackson2004@yahoo.ca (Quinn Tyler Jackson) (2009-06-11)
| List of all articles for this month |
From: "Quinn Tyler Jackson" <quinn_jackson2004@yahoo.ca>
Newsgroups: comp.compilers
Date: Mon, 8 Jun 2009 07:36:45 -0700
Organization: Compilers Central
References: 09-06-029
Keywords: parse
Posted-Date: 09 Jun 2009 05:31:55 EDT

Kay Schluehr said:


> I have a question concerning packrat parsing and memoization. Suppose
> one has following grammar
>
> G -> A1 b / A2 c
> A1 -> a
> A2 -> a
>
> and an input string 'a c'. Since the rule on the left of the ordered
> choice fails to parse the string the second one is tried. Now the
> major claim is that packrat parsing avoids exponential complexity in
> backtracking by means of memoization of intermediate results. However
> I fail to see how this helps here


Human beings are amazing producers of pathological counter-examples and
worst-cases. In the above example, "a" will indeed be examined in the input
twice by a top-down backtracking parser, whether or not memoization is used,
assuming no other optimizations.*


Now, consider this variant (which may or may not be in Packrat format):


S -> A3 d / A4 e
G -> A1 b / A2 c
A1 -> a
A2 -> b
A3 -> H
A4 -> H
H -> G+


Given the input:


    acacacacacacacacace


If you double the length of the string, making it:


    acacacacacacacacacacacacacacacacacace


will the time taken to parse more than double?


It has been my experience with memoization that grammars written by humans
for top-down parsing tend to be filled with constructions that result in
linear or near-linear performance. In many cases with the more complex
grammars (such as NLP), turning memoization off results in egregious
non-linear performance.




--
Quinn Tyler Jackson
Chief Scientist,
Thothic Technology Partners, LLC




* Memoization can be optimized beyond the trivial.



Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.