Related articles |
---|
Implementing types in a compiler pjmlp@students.si.fct.unl.pt (1997-01-29) |
Re: Implementing types in a compiler pertsserver!mkgardne@uunet.uu.net (1997-01-30) |
Re: Implementing types in a compiler jlilley@empathy.com (John Lilley) (1997-01-30) |
From: | John Lilley <jlilley@empathy.com> |
Newsgroups: | comp.compilers |
Date: | 30 Jan 1997 22:21:36 -0500 |
Organization: | Nerds for Hire, Inc. |
References: | 97-01-238 |
Keywords: | OOP, types |
Paulo Jose Pinto - LEI wrote:
> more sense to me to create an abstract class Type and then use
> ProcedureType, ConstType and other for all kinds of information. After
> all when we have an union in C, that should be implement in C++ as an
> abstract class and all the cases as descendents of that class. But
> this requires some sort of dynamic_cast so that I know which member
> function I can call, after all doesn't make sense to use a member
> function GetParameter (....) in ConstType.
I'm posting as much to get other's responses as to offer an opinion.
In a C++ parser I am working on, I represent types like you suggest --
using a base class an then creating derived CPPType classes from it.
For various reasons my base class is not abstract (for example, the
base class contains the list of type-modifiers). From the CPPType
base class, I derive some others:
CPPTypeBuiltin -- int, long, char, wchar_t, etc
CPPTypeClass -- class types
CPPTypeEnum -- enums
CPPTypeModifier -- an intermediate base for type-modification
The CPPTypeModifier conceptually modifies another CPPType. There are
specific modifiers derived from it:
CPPTypeIndirection -- reference/pointer
CPPTypeArrayOf
CPPTypePointerToMember
CPPTypeFunctionReturning
CPPTypeFunctionArgument
... and so on ...
This approach is static and address-dependent (e.g., very early
binding). I assume and guarantee that there is no type aliasing. In
other words, if a CPPTypeIndirection describes a pointer to another
type, then it is the *only* CPPTypeIndirection in existence that
modifies that particular type in that particular way. This leads to
compact type representations, trie-fashion.
There are other approaches to this as well. On the other extreme is a
symbolic (very late binding) approach, in which all types are desribed
as LISPish trees of modifiers where the information is mostly symbolic
or name-oriented. This is very flexible, and minimizes the amount of
processing that happens in the parser (which is, in general, a good
thing).
I have found that for C++, the parser needs immediate access to all
the type and declaration information (to parse template
specializations), so I choose the early-binding approach to give me
easily-walked data structures. I wish someone would show me that the
C++ parser doesn't really need all the type information to parse
syntax, but no such advice has been forthcoming :(
I've pondered other, hybrid approaches, such as constructing a
symbolic AST first, and then when symbol table information is needed,
process the AST incrementally and add its information to the symbol
table. It can be argued that the symbol table is merely a cache of
the AST and is there solely for performance reasons. But it's a
fairly dramatic speedup; walking the AST is O(N) and the symbol table
is O(logN) or O(Constant).
I'd like to hear other opinions.
john lilley
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.