|Partially handwritten parsers email@example.com (Lowell Thomas) (2009-05-30)|
|RE: Partially handwritten parsers firstname.lastname@example.org (Quinn Tyler Jackson) (2009-06-01)|
|From:||"Lowell Thomas" <email@example.com>|
|Date:||Sat, 30 May 2009 12:41:45 -0400|
|Keywords:||parse, question, comment|
|Posted-Date:||31 May 2009 11:54:20 EDT|
I've read many of the previous comments here about handwritten parsers and
it's a "mixed bag" to quote a couple of posts. The chief criticism seems to
be that you quickly lose sight of exactly what language it is you are
parsing. But many agree that speed favors hand writing. In the tests that I
have run on a**nb**n I see factors of 50-60 speed improvement with the
handwritten version. By itself this isn't very interesting. But what I do
find interesting is that many large grammars are full of phrases that, like
a**nb**n, are simple to parse by hand while still keeping them true to the
CFG defining them--alphanum, utf-8, names and numbers of all shapes and
sizes, etc. So what I am experimenting with is generating parsers where the
normal, recursive-descent procedure at any given syntax tree node can, at
the option of the developer, be replaced with a hand-written function. I
have tests where hand writing only a dozen or so simple phrases in grammars
with 300+ production names has resulted in speed improvement factors of 7-8.
I would be very much interested in your opinions and experiences--
especially any parser generators or references with similar approaches.
Speed was the main reason for looking into this. But as an added
benefit it provides a solution to another sticky little problem. I
like PEG's disambiguating "prioritized choice" rule for a variety of
reasons. But occasionally it leads to a perfectly valid CFG phrase
that can't be parsed with the prioritized choice rule. Hand writing a
parser for this problematic phrase can sneak you past it without a lot
of grammar bloat or other Rube Goldberg gymnastics.
[How often is the parser a significant part of a program's runtime? The
scanner is usually much slower. -John]
Return to the
Search the comp.compilers archives again.