Re: making a production quality parsing tool

"Quinn Tyler Jackson" <qjackson@wave.home.com>
24 Dec 1999 12:16:33 -0500

          From comp.compilers


From: "Quinn Tyler Jackson" <qjackson@wave.home.com>
Newsgroups: comp.compilers
Date: 24 Dec 1999 12:16:33 -0500
Organization: Compilers Central
References: 99-11-121 99-11-167 99-11-178 99-12-004 99-12-068
Keywords: parse, tools

> So you and Chris and presumably Terence all agree about this. But
> *why* is it so hard to bring a parser generator up to production quality?
>
> I presume you're talking about far more than the widely known
> issues of bringing e.g. data tables down to a reasonable size.
>
> What are the issues y'all have run into? Are any of them preventable
> in hindsight?


Well, speaking only for myself and PAISLEI, the biggest sources of
delay have probably been:


1. Example grammars.


A production-quality parser generator must come with enough examples
that anyone wishing to use the system can read them and grok it. It
takes a while to dream up examples that both demonstrate the features
of the generator and are useful in their own right. Because PAISLEI
is a new tool with a small user base, there are no examples floating
about unless I write them.


Example grammars are a parsing tool's primary lemmas, and lemmas take
time to develop and certify.


Preventable? I could have written a yacc work-alike and the example
grammars would abound.


2. Notation.


I have written grammars using PAISLEI much more quickly than with
other tools (2 hours versus 1 week), but only because I invented the
notation and am familiar with the nuances of the system. Everyone
expects a tool to support the notation they are most familiar with,
whether or not that notation is sufficient for the expressiveness of
the tool. LPM, the underlying pattern matching system behind PAISLEI
grammars, has a notation that was designed for efficient lexing by a
DFA, not for efficient understanding by humans. This was necessary
because LPM patterns and PAISLEI grammars, unlike most generated
grammars, are compiled at run time and can be dynamically recompiled,
and compilation takes time.


Notation is a religious issue, and converting from one religion to
another takes time, tact, and diligence.


Preventable? See (1) above.


3. Experimental features.


PAISLEI grammars contain experimental features, such as adaptability,
permutation phrases, multistep matching, integrated symbol tables, and
so on -- and these features confuse those who are experienced with
traditional tools. It takes time to document these features properly
(both with examples and papers), so that general users become aware
of them and their uses. Until they do, PAISLEI appears to be a
notationally quirky LL(k) parser.


Preventable? I'm too adventurous to avoid experimental features.
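
For readers who have not met one of the features named under (3): a
permutation phrase matches a fixed set of elements, each exactly once,
in any order. The Python sketch below only illustrates that general
idea. It is not PAISLEI or LPM notation, and the function name and the
sample keywords are hypothetical.

```python
from itertools import permutations

# Hypothetical element set; any distinct tokens would do.
ELEMENTS = ("static", "const", "volatile")


def match_permutation(tokens, elements=ELEMENTS):
    """Return True if tokens holds each element exactly once, in any order."""
    remaining = set(elements)
    for tok in tokens:
        if tok not in remaining:
            return False  # unknown token, or an element seen twice
        remaining.remove(tok)
    return not remaining  # every element must have been consumed


# A plain context-free grammar would need one alternative per ordering.
orderings = list(permutations(ELEMENTS))

print(match_permutation(["const", "volatile", "static"]))
print(match_permutation(["const", "const", "static"]))
print(len(orderings))
```

With n elements a grammar without permutation phrases needs n!
alternatives to say the same thing, which is why a traditional tool
makes the construct awkward to write by hand.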


4. Rigor.


Every new feature that I add to PAISLEI grammars has to be supported
either by the literature or by expert opinion as to its usefulness.
(Which means my queries to experts asking for literature leads have
become an annoyingly regular occurrence behind the scenes.) I have
some features in the system that are not yet documented, despite my
belief that they are useful, because sufficient support for their
usefulness and correctness has not yet been established. For
instance, LPM pattern adaptability has been in there almost since
LPM's beginnings in late 1993, but it was not until I reviewed
Burshteyn, Christiansen, Boullier, and Shutt's work on grammar
adaptability that I decided to admit to the feature being under the
hood.


Rigor takes time.


Preventable? I could document the features and hope for the best, I
suppose. Bad science, though.
--
Quinn Tyler Jackson
http://www.qtj.net/~quinn/

