From: | Pieter Schoenmakers <tiggr@ics.ele.tue.nl> |
Newsgroups: | comp.compilers |
Date: | 16 Jul 1997 23:00:42 -0400 |
Organization: | Compilers Central |
References: | 97-07-052 |
Keywords: | OOP, linker |
This problem is identical to that of uniquifying Objective-C selectors
(message identifiers). Several considerations are noteworthy in this
context:
- If you depend on the linker to unique the descriptors, you also depend
on the dynamic linker/loader to do the same, unless you know with
absolute certainty that programs written in the language will never
want to employ dynamic code loading (which you can't know).
Extending the linker could be easy, for instance starting with the GNU
linker, since that is fairly portable. Dynamic loading, however, is
rather hairy: very machine dependent and highly unportable.
This approach is used by the Objective-C runtime library as
implemented by NeXT, for their nextstep/openstep (and now rhapsody)
operating environments. The linker is a modified Mach-O linker which
has a notion of an __OBJC segment, with various special section types.
One of these is used to store the selector strings, which are uniqued
by the linker. The dynamic linker provides similar functionality. The
net result is that selectors are equal if and only if they have the
same address, even in the context of dynamic loading.
- An approach used by the GNU Objective-C runtime library is to unique
not the selectors themselves, but an identifying number inside each
selector: the compiler uniques the selectors per module, i.e. per
object file. Every object file comes with a (compiler-generated)
constructor (thanks C++ for getting this time consumer into all
linkers); this constructor registers the module with the runtime
library. The runtime assigns each distinct selector a unique number,
and selectors are now equal if
    sel_a->unique_id == sel_b->unique_id
On dynamic loading, the constructors of the new modules are invoked
and the modules/selectors are added to the runtime information just
like the main-program selector information.
This system is only two memory references slower than address
equivalence testing. An advantage (in the context of selectors) is
that all selectors are numbered contiguously: 0, 1, ...
- If the language will end up having a considerable standard library, you
will want to make that a shared library or shared object. Unless a
machine's shared library implementation is braindead, you'll have to
take new versions of these shared libraries into account. This problem
can be similar to that of dynamic loading, or simpler.
- You do not want to have to do too much in the linking phase.
Even the tiniest program suffers from long linking times if many (or
large) libraries have to be scrutinized. This undesirably lengthens
the edit-compile-link-run-crash cycle.
- You do not want to have to do too much at run time.
For (very) small programs (`ls' is my favourite example when I think of
small programs: it must just list files; it must not take any time) the
startup time of a (GNU Objective-C) runtime library is significant, and
startup times in the range of centiseconds (or beyond) are unacceptable
for small programs.
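
A minimal sketch of the per-module registration scheme from the second
point above, in C. It is an illustration only, not the GNU runtime's
actual code: the names struct selector, register_module, intern_name and
sel_equal, the fixed-size table, and the linear search are all
assumptions made to keep the example short (a real runtime would use a
hash table and would call the registration from a compiler-generated
constructor).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical selector record; every module carries its own copies. */
struct selector {
    const char *name;
    unsigned unique_id;   /* filled in by the runtime at registration */
};

#define MAX_SELECTORS 1024   /* arbitrary limit for this sketch */

/* Runtime-wide table of names seen so far; the index is the unique id. */
static const char *known_names[MAX_SELECTORS];
static unsigned known_count = 0;

/* Return the id for name, assigning the next free id on first sight. */
static unsigned intern_name(const char *name)
{
    for (unsigned i = 0; i < known_count; i++)
        if (strcmp(known_names[i], name) == 0)
            return i;
    assert(known_count < MAX_SELECTORS);
    known_names[known_count] = name;
    return known_count++;
}

/* Called once per module, at startup or when the module is loaded. */
void register_module(struct selector *sels, size_t n)
{
    for (size_t i = 0; i < n; i++)
        sels[i].unique_id = intern_name(sels[i].name);
}

/* Selector equality: compare the ids, not the addresses. */
int sel_equal(const struct selector *a, const struct selector *b)
{
    return a->unique_id == b->unique_id;
}
```

Two modules that each contain their own "draw" selector end up with the
same unique_id after both have called register_module, so sel_equal
holds across modules even though the two records live at different
addresses. Since ids are handed out in order of first registration,
they form the contiguous range 0, 1, ...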
With TOM, an OO language I am developing, a program called the
resolver is run before the linker. It creates (and uniques, where
applicable) the selectors, argument and return type descriptions,
string constants, class descriptions, class hierarchy and method
dispatch structures, etc. The input to the resolver consists of the
`information' files generated by the compiler for each object file.
When run for static resolution, the resolver does everything necessary
for the program to run, i.e. all descriptions and structures needed by
the runtime library are generated by the resolver. When run for
dynamic resolution (this includes dynamic loading) the resolver only
generates the descriptions; the actual structures are built (partly
lazily) by the runtime library, at run time.
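
To give a rough idea of what uniquing before the link step can look
like, here is a small C sketch. It does not reproduce the TOM resolver
or its file formats: the functions unique_names and emit_selector_table,
the array name selector_names, and the assumption that the selector
names from all `information' files have already been collected into one
array are hypothetical. The sketch sorts the names, drops duplicates,
and writes the result as C source for the C compiler to compile.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings. */
static int cmp_names(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort names in place and drop duplicates; returns the unique count. */
size_t unique_names(const char **names, size_t n)
{
    assert(names != NULL || n == 0);
    if (n == 0)
        return 0;
    qsort(names, n, sizeof *names, cmp_names);
    size_t out = 1;
    for (size_t i = 1; i < n; i++)
        if (strcmp(names[i], names[out - 1]) != 0)
            names[out++] = names[i];
    return out;
}

/* Write the uniqued table as C source (expects at least one name). */
void emit_selector_table(FILE *out, const char **names, size_t n)
{
    size_t count = unique_names(names, n);
    fprintf(out, "const char *const selector_names[%zu] = {\n", count);
    for (size_t i = 0; i < count; i++)
        fprintf(out, "    \"%s\",\n", names[i]);
    fprintf(out, "};\n");
}
```

Because every selector name then occurs exactly once in the generated
table, the position in that table (or the address of the entry) can
serve as the selector's identity without any work at program startup.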
When resolving statically, the CPU time used by the resolver (and by
the C compiler, which must compile the resolver's output of several
hundred kB or more) is considerable, which is undesirable during
program development. On the other hand, the overhead of the runtime
library is negligible, making this setup suitable for small programs
in a production environment.
When resolving dynamically, the run time of the resolver decreases
considerably (down to a few seconds, depending on the size of the
program and the libraries used). Since the overhead at run time is
only on the order of a few centiseconds, dynamic resolution is very
useful during program development. For large programs (i.e. programs
running more than a second), this run time overhead is of course
insignificant.
When TOM starts using shared libraries, dynamic resolution will
probably become mandatory, making TOM unsuitable for writing an ls
replacement. But then again, it isn't meant to be used for that (it
is meant as an Objective-C replacement really).
More on TOM at http://tom.ics.ele.tue.nl:8080/. --Tiggr