Re: OOP Parse tree generation?

Chris F Clark <cfc@world.std.com>
3 Apr 2000 11:28:40 -0400

          From comp.compilers

Related articles
[2 earlier articles]
Re: OOP Parse tree generation? floris@vangog.net (Floris 'Tamama' van Gog) (2000-03-23)
Re: OOP Parse tree generation? thp@roam-thp2.cs.ucr.edu (Tom Payne) (2000-03-23)
Re: OOP Parse tree generation? lojedaortiz@interlink.com.ar (Nicolas Ojeda Bar) (2000-03-23)
Re: OOP Parse tree generation? dara_gallagher@my-deja.com (2000-03-23)
Re: OOP Parse tree generation? nelan@home.com (George Nelan) (2000-03-23)
Re: OOP Parse tree generation? danwang+news@cs.princeton.edu (Daniel C. Wang) (2000-03-25)
Re: OOP Parse tree generation? cfc@world.std.com (Chris F Clark) (2000-04-03)
Re: OOP Parse tree generation? cfc@world.std.com (Chris F Clark) (2000-04-03)
| List of all articles for this month |

From: Chris F Clark <cfc@world.std.com>
Newsgroups: comp.compilers
Date: 3 Apr 2000 11:28:40 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 00-03-074 00-03-108
Keywords: OOP, parse

Dara Gallagher wrote an interesting article in this thread where she said:


> I was working on a system which had been written using typical OO
> style information hiding; i.e. all of the logic was included in
> "node" classes which represented the nodes of the parse tree. This
> had become extreemly messy. Certain operations which can be
> expressed very cleanly in the procedural/structural style ended up
> as a large set of operations spread accross many classes which
> completely obfuscated the logic of the operations.
...
> As I started rewriting the operations using the new style, a number of
> negative aspects to this approach became apparent:
>
>1. Writing a visitor class is almost indistiguishable from writing a
> large switch statement.


Yes, the three implementations are equivalent. The operations either
all get spread out in the tree classes, of get spread out a little
less in the visitor classes, or get collected in a large switch
statement.


There are two solutions to the spreading out of functionality. Use
intermediate base classes to factor out common aspects if the
operations are spread out over tree classes. Use the "Indirect
Visitor Pattern" (from Jack Reeves (IIRC) article in either C++ Report
or Dr Dobbs Journal) if the operations are spread out over visitor
classes.


A short synopsis: In the indirect visitor class the actual visitor
functions are not tied to their tree classes. The connection between
the visitor class and the tree class is done by arrays of function
pointers (one array for each visitor set and one entry in each array
for each tree class). The array entry maps the tree class to the
visitor function that applies to it.


By the way, I always think of my virtual functions as repesenting
branches of a case statement. And while it is unfortunate that the
case statement is spread out over different classes (often in
different files). I console myself with the fact at runtime I am
avoiding doing the actual case switch--it is all implicit in the
virtual function call.


> 4. While the visitor pattern supposedly shields you from changes to
> the structure (i.e. the AST), in our case the abstract syntax of the
> language is fixed. . . . In any case I'm actually sceptical about this.


Actually, your skepticism is correct. The visitor pattern is designed
to isolate the tree from changes in the visitors (not vice versa). If
your tree structure is in the process of changing, visitor actually
imposes an overhead to those changes, since you have to update each
visitor class associated with each tree class the is changed. Note,
the indirect visitor pattern improves this situation slightly as the
visitor functions are not tightly bound to the tree classes.


A solution that solves both sets of problems is one where the parser
generator (or a connected tool) generates both the tree classes and
the visitors. Such a tool could allow the user to group the visitor
functions either with their tree classes or all together like branches
of a switch. I think some tools do that (or are close to doing that)
now.


> 3. While there are variations on the visitor pattern, the one we used
> had limitations in that it was imposible to express in-place
> transformation of the AST; we ended up adding another method to all
> the nodes to support such transformations.


Most visitor patterns use either a fixed traversal function or mix the
traversal with either the visitor functions or the tree classes. This
causes the problem that you note here. You need a different traversal
function if you want to modify the tree being traversed. Like many
problems the oo-solution is to separate the parts into different
classes. Divide the problem into a three part pattern--visitor,
iterator, and tree. The visitor class hierarchy includes the
functions doing the visiting. The iterator class hierarchy handles
traversing the tree. The tree class hierarchy defines the data being
operated upon. I list the parts of the pattern in that order, so that
they spell "vit" which is close to the romance language words for
speed (or life) which makes it mnemonic for me. When there is data
that needs to be propagated during the traversal, put it in a set of
"baggage" classes. That makes the mnemonic vit-b (like vitamin-b).


For those of you who might guess, that's the direction Yacc++ is
headed. It seems the next progression to the "Yacc meets C++" work
(from Steve Johnson's paper of the same name) we started in 2.0.


Hope this helps,
- -Chris


*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
3 Proctor Street voice : (508) 435-5016
Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.