Re: OOP Parse tree generation?

dara_gallagher@my-deja.com
23 Mar 2000 22:32:57 -0500

          From comp.compilers

Related articles
OOP Parse tree generation? nojeda@my-deja.com (Nicolas) (2000-03-11)
Re: OOP Parse tree generation? qjackson@wave.home.com (Quinn Tyler Jackson) (2000-03-21)
Re: OOP Parse tree generation? floris@vangog.net (Floris 'Tamama' van Gog) (2000-03-23)
Re: OOP Parse tree generation? thp@roam-thp2.cs.ucr.edu (Tom Payne) (2000-03-23)
Re: OOP Parse tree generation? lojedaortiz@interlink.com.ar (Nicolas Ojeda Bar) (2000-03-23)
Re: OOP Parse tree generation? dara_gallagher@my-deja.com (2000-03-23)
Re: OOP Parse tree generation? nelan@home.com (George Nelan) (2000-03-23)
Re: OOP Parse tree generation? danwang+news@cs.princeton.edu (Daniel C. Wang) (2000-03-25)
Re: Re: OOP Parse tree generation? egagnon@j-meg.com (Etienne M. Gagnon) (2000-03-28)
Re: OOP Parse tree generation? cfc@world.std.com (Chris F Clark) (2000-04-03)
Re: OOP Parse tree generation? cfc@world.std.com (Chris F Clark) (2000-04-03)
| List of all articles for this month |

From: dara_gallagher@my-deja.com
Newsgroups: comp.compilers
Date: 23 Mar 2000 22:32:57 -0500
Organization: Deja.com - Before you buy.
References: 00-03-074
Keywords: parse, OOP

"Nicolas" <nojeda@my-deja.com> wrote:
> How would you implement parse tree structures for a language that has
> statements and expressions (a la Pascal), in an object oriented fashion ?


As the moderator and another reply point out, this is a good question
for which there doesn't seem to be any obvious answer. I'll describe
the learning process I went through when recently confronted with this
problem and my attempt to solve it. I was working on a system which
had been written using typical OO style information hiding; i.e. all
of the logic was included in "node" classes which represented the
nodes of the parse tree. This had become extreemly messy. Certain
operations which can be expressed very cleanly in the
procedural/structural style ended up as a large set of operations
spread accross many classes which completely obfuscated the logic of
the operations. In addition certain operations (for example non-local
transformations of the tree) seemed almost inexpressable without
completely undermining the OO design.


However, the limitation which actually prompted the rethink, was that
supporting multiple back-end generators would not fit into the scheme
we were using.


I initially took a philosophical stance. Parse trees represent
explicit structure/data which is slightly incompatible with basic OO
techniques. Basically a representation should allow a separation
between this structure and the operations on the structure. My
initial idea was to use the visitor design pattern which is often
suggested as a means to decouple structure from logic in OOP. I began
to implement this idea but as the other poster pointed out, class
explosion becomes a problem; I "cheated" by expressing the AST using a
functional language called Gofer (similar to ML but side-effect free
and lazy) and hacking a Perl script to generate the classes for me. In
anycase I ended up with a class per AST node with constructors,
accessors and an appropriate acceptVisitor operation.


As I started rewriting the operations using the new style, a number of
negative aspects to this approach became apparent:


1. Writing a visitor class is almost indistiguishable from writing a
large switch statement.


2. We use Java which is dynamically typed. We found that more and more
we were relying heavily on dynamic typing. Our code was littered with
statements like "if (expr instanceof VarName)...". This became so
pervasive that we basically ended up adding predicate operations to
our base classes (for example "isVarName()").


3. While there are variations on the visitor pattern, the one we used
had limitations in that it was imposible to express in-place
transformation of the AST; we ended up adding another method to all
the nodes to support such transformations.


4. While the visitor pattern supposedly shields you from changes to
the structure (i.e. the AST), in our case the abstract syntax of the
language is fixed. I'd imagine that this is often the case so this
feature of the pattern is redundant. In any case I'm actually
sceptical about this. I think that if we extended our language, the
operations would also need rework.


Some positive aspects of the approach became apparent.


1. We found ourselves refactoring the OO heirarchy of the node classes
which led to maintanance gains. OO seems to encourage this more than
if we had been using traditional structural techniques.


2. We also found that this approach encouraged a clean separation
between basic semantic properties of the language and back-end
operations. For example type analysis logic was moved into the node
classes. Another example was that our "CaseStmtNode" had a method
called "getEquivalentIf()" which effectively expressed the semantics
of the language as well as being useful to the back-end code.


That's really all I can think of now; I hope it helps. The code has
better structure now but I'm still not completely happy for the
reasons outlined above.


Cheers,
Dara Gallagher.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.