defining unique symbols (Fergus Henderson)
13 Jul 1997 11:32:43 -0400

          From comp.compilers


From: (Fergus Henderson)
Newsgroups: comp.compilers
Date: 13 Jul 1997 11:32:43 -0400
Organization: Comp Sci, University of Melbourne
Keywords: linker, design, question

I'm implementing a compiler for a language that has some run-time type
identification (RTTI) support. The type system for the language that
I'm implementing has type constructors t0, t1, t2, ... ad infinitum
(e.g. they could be the type constructors for tuples with 0, 1, 2, ...
arguments). To implement the RTTI support, I need to generate an
initialized constant structure called a `type_ctor_info' for each of
these type constructors.

OK, fine so far. Now the catch is that we would like to be able to
compare two different type_ctor_infos by simply comparing their
addresses (rather than say doing a string comparison on their "name"
fields). In order to do that, we need to ensure that there is exactly
one definition of each type_ctor_info in the final executable. This
is a bit difficult to do in the presence of separate compilation,
because we want to be able to use the standard system linker. Oh, and
I forgot to mention that our compiler generates C code.
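A minimal sketch of the requirement, in C since that is what the compiler emits. The struct layout, the `arity' field and every function and variable name here are assumptions made for illustration; the post only says that a type_ctor_info has a "name" field. The two static objects stand in for the copies that two separately compiled modules would each end up with.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical layout: the post only mentions a "name" field. */
struct type_ctor_info {
    const char *name;   /* e.g. "t2" */
    int arity;          /* number of type arguments (assumed field) */
};

/* Fast test: correct only if each type constructor has exactly one
   type_ctor_info in the whole executable. */
bool same_ctor_by_address(const struct type_ctor_info *a,
                          const struct type_ctor_info *b)
{
    return a == b;
}

/* Slow fallback: compare the contents instead of the addresses. */
bool same_ctor_by_name(const struct type_ctor_info *a,
                       const struct type_ctor_info *b)
{
    return a->arity == b->arity && strcmp(a->name, b->name) == 0;
}

/* Two copies of the descriptor for t2, standing in for what modules
   A and B would each contain if the descriptor were duplicated. */
static const struct type_ctor_info module_a_t2 = { "t2", 2 };
static const struct type_ctor_info module_b_t2 = { "t2", 2 };
```

With duplicated copies the address test reports the two descriptors as different even though they describe the same type constructor, which is why exactly one definition must survive linking.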

We can't define each type_ctor_info as `static' in the module in which
it is referenced, since then there might be two different copies of
the type_ctor_info for the same type constructor. We can't define
each one as `extern', since that would lead to multiply defined symbol
errors.

Now, I can see several ways of doing this:

- we could leave them undefined, parse the linker error messages
to figure out which ones we need, generate code defining just
those ones, and then reinvoke the linker to link in the newly
generated code;

- we could put a fixed limit N on the arity of these type constructors,
and put global definitions of t0, t1, ..., tN in the standard
library;
- we could combine the previous two methods (so that the linker
only needs to be reinvoked if the fixed limit is exceeded);
(disadvantages: parsing linker error messages is fragile and
non-portable; quite complex)

- we could store the information about which type constructors
are referenced by each module in a separate file, rather than
in the object file, and do our own pre-link pass on these
files;
- we could use ELF weak aliases:

#define weak_alias(name1, name2) \
asm(".weak " #name2 "; " #name2 "=" #name1)

Each module `foo' that uses a given type constructor `ti'
would include a local definition for it `foo_ti', and
would also define `ti' as a weak alias for `foo_ti'.
With a bit of luck, the linker will pick exactly one of the
definitions of `ti' for us. (Would this work? I haven't tried it.)
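
The weak alias idea could be sketched as follows for one module `foo' and the constructor t2. This uses the GCC/Clang attribute syntax instead of the asm macro above; it assumes an ELF target, and the struct layout and the accessor tuple2_ctor are hypothetical. It shows only what a single module would contain and does not settle whether every linker picks exactly one definition.

```c
/* Hypothetical descriptor layout, not taken from the post. */
struct type_ctor_info {
    const char *name;
    int arity;
};

/* Module foo's private copy of the descriptor for constructor t2. */
static const struct type_ctor_info foo_t2 = { "t2", 2 };

/* Weak alias: within this object file t2 names the same object as
   foo_t2. Because the symbol is weak, other modules can define t2 the
   same way without causing a multiply defined symbol error. */
extern const struct type_ctor_info t2
    __attribute__((weak, alias("foo_t2")));

/* Generated code must refer to the descriptor through the shared name
   t2 only, never through foo_t2, or the addresses could differ. */
const struct type_ctor_info *tuple2_ctor(void)
{
    return &t2;
}
```

The scheme relies on every reference going through the weak name, so that all modules agree on whichever definition the linker keeps.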

Well, all of these methods have significant disadvantages.
Is there a method I missed?

I'm sure I'm not the first to solve this problem. I suppose it is
equivalent to a special case of C++ template instantiation.
How do people normally handle this sort of thing?

Fergus Henderson <>
WWW: <>
PGP: finger fjh@
[Linker development seems to have stopped around 1970, other than
perhaps some hacks for C++ mangled names. Personally, I'd do a
wrapper around the linker that listed the undefined symbols out of all
the modules to be linked and generated the necessary stub. Or on Unix
systems, define them as one-byte common blocks, and the linker will do
the right thing. The "pseudo registers" invented for the IBM
mainframe PL/I in the mid 1960s would solve this nicely, but nobody
else seems to have used them.
Plug: draft chapters of my linker book coming to a web site soon. -John]
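
The stub-generating step of such a linker wrapper might look roughly like this. It assumes the wrapper has already collected the undefined symbol names (for example from `nm -u' on the object files) and that descriptors follow a hypothetical naming scheme type_ctor_info_tN; the function name make_stub and the emitted struct initializer are likewise illustrative only.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical naming scheme: the descriptor for the arity-N type
   constructor is an external symbol named type_ctor_info_tN. */
#define CTOR_PREFIX "type_ctor_info_t"

/* If sym names a descriptor, write a C definition for it into out and
   return 1. Return 0 for any other symbol or if out is too small. */
int make_stub(const char *sym, char *out, size_t out_size)
{
    size_t prefix_len = strlen(CTOR_PREFIX);
    const char *digits;
    const char *p;
    int n;

    if (strncmp(sym, CTOR_PREFIX, prefix_len) != 0)
        return 0;

    /* The rest must be a plain decimal number without leading zeros. */
    digits = sym + prefix_len;
    if (*digits == '\0' || (digits[0] == '0' && digits[1] != '\0'))
        return 0;
    for (p = digits; *p != '\0'; p++)
        if (!isdigit((unsigned char)*p))
            return 0;

    n = snprintf(out, out_size,
                 "const struct type_ctor_info %s = { \"t%s\", %s };\n",
                 sym, digits, digits);
    return n > 0 && (size_t)n < out_size;
}
```

The wrapper would call this once per undefined symbol, write the accepted definitions to a generated C file, compile it, and add the resulting object to the real link.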

