Partially handwritten parsers

"Lowell Thomas" <lowell@coasttocoastresearch.com>
Sat, 30 May 2009 12:41:45 -0400

          From comp.compilers

Related articles
Partially handwritten parsers lowell@coasttocoastresearch.com (Lowell Thomas) (2009-05-30)
RE: Partially handwritten parsers quinn_jackson2004@yahoo.ca (Quinn Tyler Jackson) (2009-06-01)
| List of all articles for this month |

From: "Lowell Thomas" <lowell@coasttocoastresearch.com>
Newsgroups: comp.compilers
Date: Sat, 30 May 2009 12:41:45 -0400
Organization: Compilers Central
Keywords: parse, question, comment
Posted-Date: 31 May 2009 11:54:20 EDT

I've read many of the previous comments here about handwritten parsers and
it's a "mixed bag" to quote a couple of posts. The chief criticism seems to
be that you quickly lose sight of exactly what language it is you are
parsing. But many agree that speed favors hand writing. In the tests that I
have run on a**nb**n I see factors of 50-60 speed improvement with the
handwritten version. By itself this isn't very interesting. But what I do
find interesting is that many large grammars are full of phrases that, like
a**nb**n, are simple to parse by hand while still keeping them true to the
CFG defining them--alphanum, utf-8, names and numbers of all shapes and
sizes, etc. So what I am experimenting with is generating parsers where the
normal, recursive-descent procedure at any given syntax tree node can, at
the option of the developer, be replaced with a hand-written function. I
have tests where hand writing only a dozen or so simple phrases in grammars
with 300+ production names has resulted in speed improvement factors of 7-8.


I would be very much interested in your opinions and experiences--
especially any parser generators or references with similar approaches.


Speed was the main reason for looking into this. But as an added
benefit it provides a solution to another sticky little problem. I
like PEG's disambiguating "prioritized choice" rule for a variety of
reasons. But occasionally it leads to a perfectly valid CFG phrase
that can't be parsed with the prioritized choice rule. Hand writing a
parser for this problematic phrase can sneak you past it without a lot
of grammar bloat or other Rube Goldberg gymnastics.


Thanks.
Lowell Thomas
www.coasttocoastresearch.com
lowell@coasttocoastresearch.com
[How often is the parser a significant part of a program's runtime? The
scanner is usually much slower. -John]


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.