|type definitions email@example.com (Justin Hibbits) (2003-05-18)|
|Re: type definitions firstname.lastname@example.org (2003-05-23)|
|Re: type definitions email@example.com (Justin Hibbits) (2003-05-24)|
|Re: type definitions firstname.lastname@example.org (Chris F Clark) (2003-05-29)|
|From:||Chris F Clark <email@example.com>|
|Date:||29 May 2003 03:04:29 -0400|
|Organization:||The World Public Access UNIX, Brookline, MA|
|References:||03-05-151 03-05-171 03-05-193|
|Posted-Date:||29 May 2003 03:04:29 EDT|
Justin Hibbits wrote:
> I'm working on a compiler library, and extensions to yacc to allow
> people to easily write language definitions, and build compilers with
> very little C code in the definition/grammar. But, I'm stuck, and
> can't figure out how to cleanly represent type definitions using a
> yacc-like syntax, so that one only has to write "set_type( TYPE_NAME )"
> in order to set the type, and perform sanity checking (making sure
> types don't conflict, types aren't specified more than allowed, etc).
In my opinion, the "answer" to type problems lies in "unification
algorithms". The tool which best handled that problem was called PSG
(if I recall correctly). One of the authors was Georg Snellting
In any case, what one does is:
At the leaves (i.e. where primitive types can apply) give each leaf a
type that is a union of the possible types that it may be. Some
literals are of only one type (e.g. 1.0e0 may only be a "real" value),
others may be one of a series of types (0 is an integer or "null" for
any type of pointer).
At the branches apply constraints to the types. For example, it is
not possible to assign a pointer value to a real variable. Thus, in a
"real" assignment context, the leaf 0 cannot be a pointer value.
These contraints get "unified" (i.e one finds the intersection of
allowed types) as one proceeds to the root.
Note that some expressions convert one type into another. For
example, the pointer dereference operator converts a pointer type into
the type pointed to.
By the way, the unification operation can be applied working down the
tree too and in some cases that is the appropriate way to progress.
Now, some of you will recognize that this algorithm exactly maps on to
what one can do with attribute grammars. That is precisely correct.
Type is an attribute of the AST being parsed and for each parsing rule
there is an equation that relates the types of the operands to the
type of the result. Some of these equations are inherited and some
are synthesized (and some are mixed). The exact equations depend on
the semantics of the language, of course.
BTW, when I met Georg and asked him about PSG. He pointed out how the
above fact and that unification was exactly equivalent to functional
programming, so that any compiler compiler that implements a
functional programming style of attribute grammar can solve the
Of course, you are probably hoping to write something simpler (than
either of the above possibilities). You can, but only by disallowing
the semantics of some languages. And, that may be okay. A simpler
model that only allows some typing systems to be created may be enough
easier to use that it makes up for its limitations. The actual number
of different typing systems used in programming languages is really
quite small. It is likely (in my opinion) that there are simple ways
of representing large subsets of those choices in concise and easy to
understand forms. All you have to do is find one of them....
Hope this helps,
Chris Clark Internet : firstname.lastname@example.org
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
19 Bronte Way #33M voice : (508) 435-5016
Marlboro, MA 01752 USA fax : (508) 251-2347 (24 hours)
Return to the
Search the comp.compilers archives again.