Re: Has anyone hand-written a scanner/parser module?

Stephen Horne <sh006d3592@blueyonder.co.uk>
Tue, 18 Nov 2008 05:36:07 +0000

          From comp.compilers

From: Stephen Horne <sh006d3592@blueyonder.co.uk>
Newsgroups: comp.compilers
Date: Tue, 18 Nov 2008 05:36:07 +0000
Organization: virginmedia.com
References: 08-11-061
Keywords: parse, tools
Posted-Date: 18 Nov 2008 19:10:59 EST

On Sat, 15 Nov 2008 09:49:38 -0800 (PST),
"tuxisthebirdforme@gmail.com" <tuxisthebirdforme@gmail.com> wrote:


>I know most people anymore use lex/yacc or some derivative of these
>tools to create scanner/parser modules for their compiler projects. I
>was wondering if anyone has developed a scanner or parser that they
>personally hand-wrote? If so, I would like to know what language you
>used and what type of grammar you parsed.


I have written code for deriving FSMs from regular grammars, and I
have written code to derive an LR(1) PDA model from a grammar. The
former gets a little use. The latter is really just a prototype and
learning exercise.


Neither has been developed into a full lex/yacc-like tool. The
FSM-from-regular-grammar code is used in an incomplete DSL, though not
one that does lexical analysis - at this point all I've really got
from it is a bunch of GraphViz dot files.
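
For anyone wondering what dumping an FSM to a dot file involves, here
is a minimal Python sketch. It is not the C++ library described in this
post; the function name fsm_to_dot and the (state, symbol) -> state
dictionary layout are assumptions made purely for illustration.

```python
def fsm_to_dot(transitions, start, accepting, name="fsm"):
    """Render a finite-state machine as GraphViz dot text.

    transitions: dict mapping (state, symbol) -> next state.
    This layout is a hypothetical one chosen for the example.
    """
    lines = [f"digraph {name} {{", "  rankdir=LR;"]
    states = {start} | set(accepting)
    for (src, _sym), dst in transitions.items():
        states.update((src, dst))
    for s in sorted(states):
        # Accepting states are drawn with a double circle.
        shape = "doublecircle" if s in accepting else "circle"
        lines.append(f'  "{s}" [shape={shape}];')
    # A small point node gives the start state an incoming arrow.
    lines.append('  "__start" [shape=point];')
    lines.append(f'  "__start" -> "{start}";')
    for (src, sym), dst in sorted(transitions.items()):
        lines.append(f'  "{src}" -> "{dst}" [label="{sym}"];')
    lines.append("}")
    return "\n".join(lines)


dot_text = fsm_to_dot({("q0", "a"): "q1", ("q1", "b"): "q1"}, "q0", {"q1"})
print(dot_text)
```

The printed text can be saved to a file and rendered with the dot
command-line tool, for example dot -Tpng fsm.dot -o fsm.png.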


These are all part of a C++ digraph handling library.


For the FSMs from regular grammars, I use expressions to build complex
FSMs from simpler FSMs, exploiting methods for eliminating epsilon
transitions and minimizing nondeterminism, and for state model
minimisation. The intermediate representation is an AST, at least for
the current application. I use a home-grown tool similar to treecc to
generate AST node structs and multiple-dispatch operation functions.
However, the AST execution is external to the FSM handling class. The
FSM class provides methods for combining and manipulating FSMs, but
that's driven directly by the AST traversal code.
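
To make the "build complex FSMs from simpler FSMs" idea concrete, here
is a rough Python sketch of one common way to do it: Thompson-style
combinators that glue NFA fragments together with epsilon edges,
followed by epsilon-closure and subset construction to remove the
epsilon transitions and the nondeterminism. This is an assumption about
the general technique, not a description of the C++ FSM class; all
names (NFA, literal, concat, union, star, determinize, accepts) are
made up for the example, and state minimisation is left out.

```python
from itertools import count

_ids = count()  # fresh state numbers shared by all fragments


class NFA:
    """An NFA fragment with one start and one accept state."""
    def __init__(self, start, accept, edges):
        self.start, self.accept = start, accept
        self.edges = edges  # list of (src, symbol or None for epsilon, dst)


def literal(ch):
    s, a = next(_ids), next(_ids)
    return NFA(s, a, [(s, ch, a)])


def concat(x, y):
    return NFA(x.start, y.accept, x.edges + y.edges + [(x.accept, None, y.start)])


def union(x, y):
    s, a = next(_ids), next(_ids)
    glue = [(s, None, x.start), (s, None, y.start),
            (x.accept, None, a), (y.accept, None, a)]
    return NFA(s, a, x.edges + y.edges + glue)


def star(x):
    s, a = next(_ids), next(_ids)
    glue = [(s, None, x.start), (s, None, a),
            (x.accept, None, x.start), (x.accept, None, a)]
    return NFA(s, a, x.edges + glue)


def closure(nfa, states):
    """All states reachable from `states` via epsilon edges alone."""
    seen, stack = set(states), list(states)
    while stack:
        cur = stack.pop()
        for src, sym, dst in nfa.edges:
            if src == cur and sym is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return frozenset(seen)


def determinize(nfa):
    """Subset construction: returns (start, accepting set, transition dict)."""
    start = closure(nfa, {nfa.start})
    table, todo, seen = {}, [start], {start}
    while todo:
        cur = todo.pop()
        symbols = {sym for src, sym, _ in nfa.edges if src in cur and sym is not None}
        for sym in symbols:
            step = {dst for src, s, dst in nfa.edges if src in cur and s == sym}
            nxt = closure(nfa, step)
            table[(cur, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    accepting = {s for s in seen if nfa.accept in s}
    return start, accepting, table


def accepts(dfa, text):
    """Run the deterministic machine over `text`."""
    state, accepting, table = dfa
    for ch in text:
        state = table.get((state, ch))
        if state is None:
            return False
    return state in accepting


# The regular expression (a|b)*c, built bottom-up as an AST walk would.
dfa = determinize(concat(star(union(literal("a"), literal("b"))), literal("c")))
print(accepts(dfa, "abbac"), accepts(dfa, "abca"))
```

Each fragment is meant to be used once in an expression; reusing the
same fragment object in two places would share its states. An AST
traversal would simply call these combinators node by node.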


For the LR(1) stuff, I never actually got around to adding any kind of
front end, or any back end either. Just the internal representations
of the grammar and PDA model, and the code to generate the latter from
the former. The grammar uses a digraph - actually a tree - to
effectively define the set of BNF rules for each nonterminal. Both the
grammar and the parser models are based on the same digraph class.
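
For readers who have not seen the grammar-to-PDA step, here is a
minimal Python sketch of the textbook canonical LR(1) construction:
item-set closure, goto, and a worklist that collects the states and
transitions. It is not the code described above. The grammar is held in
a plain dictionary rather than a digraph, empty right-hand sides are
assumed not to occur, reduce actions and conflict detection are
omitted, and the grammar and all names are made up for the example.

```python
GRAMMAR = {  # hypothetical example grammar: S -> C C ; C -> c C | d
    "S'": [("S",)],
    "S": [("C", "C")],
    "C": [("c", "C"), ("d",)],
}
START, END = "S'", "$"


def first_sets(grammar):
    """FIRST sets by fixpoint; assumes no empty right-hand sides."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, rules in grammar.items():
            for rhs in rules:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def closure(grammar, first, items):
    """Items are (lhs, rhs, dot position, lookahead terminal)."""
    result, todo = set(items), list(items)
    while todo:
        lhs, rhs, dot, look = todo.pop()
        if dot == len(rhs) or rhs[dot] not in grammar:
            continue  # dot at the end, or in front of a terminal
        # Lookaheads for predicted rules: FIRST of what follows, else `look`.
        if dot + 1 < len(rhs):
            nxt = rhs[dot + 1]
            looks = first[nxt] if nxt in grammar else {nxt}
        else:
            looks = {look}
        for prod in grammar[rhs[dot]]:
            for la in looks:
                item = (rhs[dot], prod, 0, la)
                if item not in result:
                    result.add(item)
                    todo.append(item)
    return frozenset(result)


def goto(grammar, first, state, symbol):
    """Advance the dot over `symbol` and close the resulting item set."""
    moved = {(l, r, d + 1, la) for l, r, d, la in state
             if d < len(r) and r[d] == symbol}
    return closure(grammar, first, moved)


def build_states(grammar):
    """Return (item set -> state number, (state number, symbol) -> state number)."""
    first = first_sets(grammar)
    start = closure(grammar, first, {(START, grammar[START][0], 0, END)})
    states, edges, todo = {start: 0}, {}, [start]
    while todo:
        state = todo.pop()
        symbols = {r[d] for _, r, d, _ in state if d < len(r)}
        for sym in sorted(symbols):
            target = goto(grammar, first, state, sym)
            if target not in states:
                states[target] = len(states)
                todo.append(target)
            edges[(states[state], sym)] = states[target]
    return states, edges


states, edges = build_states(GRAMMAR)
print(len(states), "states,", len(edges), "transitions")
```

The states and edges together form a digraph, which is presumably why a
shared digraph class is a natural fit for both the grammar and the
parser model.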


The code for both is a bit of a mess because of the way my digraph
library evolved over time, and because the inheritance layering led to
some data-unhiding hassles as new functionality was added, especially
for the regular grammar stuff.


From this experience, I'd say that building an inheritance hierarchy
for digraphs is something to avoid, at least on agile programming
grounds. In fact writing your own digraph code may be a mistake in
itself. Beyond that, there's little to say.


I once tried to write an LR(1) parser generator in Python, but I never
could get it to work properly. Not Python's fault - more a case of
premature optimisation.

