Newsgroups: comp.compilers
From: stephen@estragon.uchicago.edu (Stephen P Spackman)
Keywords: optimize, linker
Organization: University of Chicago CILS
References: 92-05-123 92-06-003
Date: Mon, 1 Jun 1992 20:27:53 GMT
juul@diku.dk (Anders Juul Munch) writes:
|Type safety as an OS feature? An OS with builtin type safety would be a
|one-language OS. There can be no canonical type description format,
|because the concept of type is much too wide for that. One view of what
|types are is that a type is a set of values. The criterion for dividing
|the set of all values into type subsets is entirely in the hands of the
|language designer, who might choose non-distinct types (as is the case
|with OO typings) and who might choose to use abstract types containing no
|information on physical structure.
This turns out to be untrue. The criterion for data structure (note:
*structure*, not *type*) viability turns out to be that the structure is
readily decodable, and describing structures in terms of the code
required to do a canonical traversal of the representation works out just
great (you can see this idea starting to surface in recent work on tagless
garbage collection, but to be really useful you need to go to higher-order
functions so you can abstract out all of the "semantic" work and leave
just the raw traversal strategy). [At this point I should say "see my
thesis" but it's a pretty awful thesis.]
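A minimal sketch of the traversal idea, written in Python rather than in any intermediate code discussed here. All names (`leaf`, `record`, `sequence`, `collect`, `count_leaves`) are hypothetical and chosen for illustration only. Each structure is described by nothing more than a function that walks its representation, and the "semantic" work is passed in as a visitor:

```python
from typing import Any, Callable

# A visitor does the "semantic" work on each leaf it is shown.
Visitor = Callable[[str, Any], None]
# A descriptor is pure traversal strategy: it walks one representation.
Descriptor = Callable[[Visitor, Any], None]


def leaf(kind: str) -> Descriptor:
    """Describe an atomic value tagged with a kind name."""
    def walk(visit: Visitor, value: Any) -> None:
        visit(kind, value)
    return walk


def record(*fields: Descriptor) -> Descriptor:
    """Describe a fixed-length tuple, one descriptor per field."""
    def walk(visit: Visitor, value: Any) -> None:
        for describe, item in zip(fields, value):
            describe(visit, item)
    return walk


def sequence(element: Descriptor) -> Descriptor:
    """Describe a variable-length run of identically shaped elements."""
    def walk(visit: Visitor, value: Any) -> None:
        for item in value:
            element(visit, item)
    return walk


# Structure descriptions built only from traversal combinators.
point = record(leaf("int"), leaf("int"))
polyline = record(leaf("str"), sequence(point))


def collect(describe: Descriptor, value: Any) -> list:
    """One possible semantics: list every leaf in canonical order."""
    seen = []
    describe(lambda kind, v: seen.append((kind, v)), value)
    return seen


def count_leaves(describe: Descriptor, value: Any) -> int:
    """Another semantics over the same traversal: count the leaves."""
    total = 0

    def visit(kind: str, v: Any) -> None:
        nonlocal total
        total += 1

    describe(visit, value)
    return total
```

The same descriptor serves both `collect` and `count_leaves`; a garbage collector, a marshaller or a representation translator would be further visitors over the unchanged traversal.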
The operating system needs to provide for representation translation, and
the hooks (but not the semantics) for type safety. After all, checking
preconditions/postconditions entails runtime overhead and is NOT required
to ensure correctness in the presence of static type safety.
All of which is to say, if I can mandate an intermediate code I can use
*that* as my type-implementation-description language; and if I use just a
suitable subset, I can still get all the safety guarantees I want.
| Using a high-level language for the lowest level implies that everyone
|will use that language, because other HLLs would be an order of magnitude
|slower, unless they are very much like the first language.
Not if the system is well-constructed. Compilation DOES exist. But as it
happens this is NOT what I'm suggesting: I'm suggesting a common virtual
machine at what is now considered the intermediate code level.
After all, what could be more logical than to move code generation into
the operating system's CPU-driver? An operation is requested, now we have
to set up the appropriate hardware state to carry it out....
|>Once you have a uniform virtual machine with strong typing, the problem
|>goes away. (It is specifically the absence of strong typing in C and in
|>Unix communication channels that makes Unix source-level portability not
|>quite work in this regard - that and the fact that C is far from a good
|>intermediate code).
|
|C may have its problems used as intermediate code, but do you really think
|ML would be better? The higher the level you make the lowest, the more
|careful you have to be not to inhibit alternative coding strategies.
ML would be better, but it's still a source language. An intermediate
language wants to (among other things) have a usable graph-structured
binary representation, and provide for explicitly disjunctive
implementations.
(I think we're just having trouble with "high" vs. "low" again. Strong
typing, yes, functional values, yes, but not "high level" - not in the
sense of having a lot of stuff built in....)
andrew@rentec.com (Andrew Mullhaupt) writes:
|Your uniform virtual machine may entail some performance annoyances, as
|far as my limited knowledge reveals. If the uniform virtual machine has no
|"endian-ness", (which I presume is included in the word "uniform") but the
|underlying processors do have, then some implementations of the virtual
|machine have to bite the bullet and swap things, while others do not.
|Quite clearly, the chips on which this burden falls may be at a
|disadvantage which their manufacturers may view in financial terms.
There are several responses to this, at different levels. The first is
that endian-ness (on the byte level anyway) is chimeric: by negating
addresses and biasing them differently (oh, and running strings downwards
in memory on one side) you can provide direct interoperability - you can
persuade yourself of this in a couple of minutes with a pencil and paper,
I think. Of far more concern is that floating point semantics (such as
they are) get even more washed out - but honestly I think it's reasonable
enough to bind FP code to a specific processor architecture anyway (since
there's not really another option!).
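The pencil-and-paper argument can also be checked mechanically. The following is a rough Python sketch under stated assumptions: a 16-byte toy memory, and the hypothetical helpers `mirror` and `mirrored_address`, none of which come from the original discussion. It lays data out the little-endian way, views the same bytes with negated addresses, and reads them back as big-endian:

```python
import struct

SIZE = 16  # size of the toy memory, an arbitrary assumption


def mirror(memory: bytes) -> bytes:
    """View the same bytes with every address a replaced by SIZE - 1 - a."""
    return bytes(memory[::-1])


def mirrored_address(addr: int, width: int, size: int = SIZE) -> int:
    """Start address of a width-byte object in the mirrored view.

    The address is negated and biased by the object's width, so the
    bias differs between bytes, halfwords and words.
    """
    return size - width - addr


# Lay out data the little-endian way.
le_mem = bytearray(SIZE)
struct.pack_into("<I", le_mem, 4, 0x11223344)
struct.pack_into("<H", le_mem, 10, 0xBEEF)
le_mem[12:15] = b"abc"

# The mirrored view of the very same bytes reads correctly as big-endian.
be_mem = mirror(le_mem)
word = struct.unpack_from(">I", be_mem, mirrored_address(4, 4))[0]
half = struct.unpack_from(">H", be_mem, mirrored_address(10, 2))[0]

# Strings come out reversed: they run downwards in the mirrored view.
start = mirrored_address(12, 3)
text = be_mem[start:start + 3][::-1]
```

No byte is ever swapped here; only the addressing changes, which is the sense in which byte-level endianness is a matter of convention.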
Actually the BEST way of handling endianness (not to mention word size,
character set, graphic bitmap conventions and all the other goodies) is to
note that it USUALLY doesn't matter which convention is used, so long as
all the support code is generated to match. IF it doesn't matter, use "abstract"
representations that get bound to the local convention and translated
LAZILY during term transit (network byte order is a BAD solution since
networks have a strong mostly-homogeneous tendency resulting from
corporate discounting policies! Best to assume architectural hit until
proven inappropriate).
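One way to picture translation on demand, as opposed to a fixed network byte order, is a message that carries a tag naming the sender's convention, so the receiver converts only on a mismatch. This is a minimal Python sketch; the one-byte tag format and the names `send` and `receive` are assumptions made for illustration, not a protocol from the discussion:

```python
import struct
import sys

# struct prefix for this machine's own byte order.
NATIVE = "<" if sys.byteorder == "little" else ">"


def send(values, order=NATIVE):
    """Emit 32-bit unsigned values in the sender's own order, plus a tag."""
    tag = b"L" if order == "<" else b"B"
    return tag + struct.pack(f"{order}{len(values)}I", *values)


def receive(message, local=NATIVE):
    """Return (payload in local order, whether translation was needed)."""
    order = "<" if message[:1] == b"L" else ">"
    payload = message[1:]
    if order == local:
        # Same convention on both ends: pass the bytes through untouched.
        return payload, False
    # Conventions differ: translate now, and only now.
    count = len(payload) // 4
    values = struct.unpack(f"{order}{count}I", payload)
    return struct.pack(f"{local}{count}I", *values), True
```

On a mostly homogeneous network the first branch is the common case, so neither end pays for a conversion that a fixed wire format would force on both.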
In the cases where it DOES matter algorithmically (e.g. when bit-banging
or writing explicit packing code), you can either tolerate the
performance hit and fix a representation (requiring a simulation of the
chosen architecture in some cases), or, more reasonably, have the front
end produce a DISJUNCTIVE translation, and let the optimiser pick
whichever representation is cheaper at actual generation time - which will
usually be the native one, of course.
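A disjunctive translation can be pictured as a set of equivalent code variants, each assuming a different representation, from which the code generator picks per target. The sketch below is a toy Python model only: the `Alternative` class, the instruction names and the cost figure `SWAP_COST` are all invented for illustration.

```python
from dataclasses import dataclass

SWAP_COST = 3  # assumed extra cost per memory access in a foreign byte order


@dataclass(frozen=True)
class Alternative:
    byte_order: str  # representation this variant assumes: "little" or "big"
    ops: tuple       # toy instruction names, one unit of cost each


def cost(alt: Alternative, target_order: str) -> int:
    """Estimate the cost of running one variant on a given target."""
    base = len(alt.ops)
    if alt.byte_order == target_order:
        return base
    # A foreign representation must be simulated on every memory access.
    accesses = sum(1 for op in alt.ops if op in ("load32", "store32"))
    return base + SWAP_COST * accesses


def select(alternatives, target_order: str) -> Alternative:
    """Pick the cheapest variant at actual generation time."""
    return min(alternatives, key=lambda alt: cost(alt, target_order))


# The front end emits both variants of the same packing routine.
pack_le = Alternative("little", ("load32", "shift", "or", "store32"))
pack_be = Alternative("big", ("load32", "shift", "or", "store32"))
```

With these made-up costs, `select([pack_le, pack_be], "big")` returns the big-endian variant and the little-endian target gets the other one, matching the expectation that the native representation usually wins.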
Ideally, the second alternative would itself be derived automatically by a
(portable) transformation engine, but that's getting much more ambitious.
--
stephen p spackman Center for Information and Language Studies
stephen@estragon.uchicago.edu University of Chicago
[The common intermediate code that is translated into object code for any
target is traditionally known as UNCOL, an idea that has been tried many
times starting about 1959. Every attempt has failed. OSF is trying again
with ANDF, which may work but history is against them. Perhaps a more
encouraging example is the IBM Sys/38 and AS/400 design, which has a
common rather high-level virtual machine code which is compiled on the fly
to object code, but only on CPUs designed to support the particular
virtual machine. -John]