Re: OOP Parse tree generation?

"Quinn Tyler Jackson" <>
21 Mar 2000 23:40:54 -0500

          From comp.compilers

Related articles
OOP Parse tree generation? (Nicolas) (2000-03-11)
Re: OOP Parse tree generation? (Quinn Tyler Jackson) (2000-03-21)
Re: OOP Parse tree generation? (Floris 'Tamama' van Gog) (2000-03-23)
Re: OOP Parse tree generation? (Tom Payne) (2000-03-23)
Re: OOP Parse tree generation? (Nicolas Ojeda Bar) (2000-03-23)
Re: OOP Parse tree generation? (2000-03-23)
Re: OOP Parse tree generation? (George Nelan) (2000-03-23)
Re: OOP Parse tree generation? (Daniel C. Wang) (2000-03-25)
[2 later articles]
| List of all articles for this month |

From: "Quinn Tyler Jackson" <>
Newsgroups: comp.compilers
Date: 21 Mar 2000 23:40:54 -0500
Organization: Compilers Central
References: 00-03-074
Keywords: parse, OOP

Nicolas asked:

> How would you implement parse tree structures for a language that has
> statements and expressions (a la Pascal), in an object oriented fashion ?

> [Good question. I've never seen a very satisfactory OOP design for a
> parser. -John]

The biggest problem I have encountered with trying to convert the
artifacts of a parse into an object oriented mass is that of class
explosion. I considered, when implementing LPM, that each clause type
would make a nice class, and each class would have an overridden Match
member function, which would have allowed me to call Match through the
base Clause class, for proper dispatching.

I abandoned that for a switch(Clause.GetType()) statement because the
class explosion would have resulted in too many derived classes.

If you are trying to reduce a structured language to basic classes,
you are left with S-S-I (sequence, selection, and iteration) classes.
You can, through semantic aliases, translate all for statements into
while statements, all switch statements into a chain of if statements,
and so on, to reduce the number of classes required to represent a
program. This would reduce your need for classes, but would require
some serious work on your part to guarantee that your aliases were
semantically equivalent and did not have any side-effects. (q.v. John
C. Reynolds, _Theories of Programming Languages_, Cambridge University
Press (1998), pp. 500, ISBN: 0-521-59414-6, and its discussion on
properly turning "for" into "while" will give you some idea of what
would be involved in this.) One problem with approaching it this way
is that your parse tree would be lossy -- you would lose information
in your tree about what the programmer actually implemented -- all
iterations would become homogenized. Therefore, this approach may not
suit your needs. If you are implementing a pretty-printer or source
reformatter, for instance, this approach would commit the sin of
outputting formatted source code that was not what the programmer
originally implemented.

Another approach might be to reduce your class hierarchy to the major
contenders, such that iteration would be reduced to one class, but the
"while" and "for" information would be retained, so that the
"iteration" class would be a union of "while/for", with some of the
wasted overhead that unions carry with them. This is a compromise,
but is at least not lossy. This is the final design model I used for
LPM's clauses -- all of the numerous clause types are represented in a
single union object CLpmClause. It's as ugly (and spatially
inefficient) as sin -- but it kept the class explosion manageable, and
kept the speed reasonable.

So -- in short ... John's right. It's a good question, still looking
for a satisfactory answer.
Quinn Tyler Jackson

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.