From: pardo@cs.washington.edu (David Keppel)
Newsgroups: comp.compilers
Date: 19 May 1996 23:19:11 -0400
Organization: Computer Science & Engineering, U of Washington, Seattle
References: 96-05-061 96-05-119
Keywords: Java, UNCOL
The moderator opines:
>[If you're willing to live with the cost of interpretation and a somewhat
>constrained environment, there have been lots of quite effective
>machine-independent byte code environments ...]
The Deutsch-Schiffman Smalltalk virtual machine implementation provides
the flexibility of a dynamically-typed, garbage-collected, strongly-typed,
bounds-checked, buzzword-compliant system, at about 10% the performance
of C (which has only "some" GC and buzzword compliance). Self, while
less widely-used than the D-S ST-80 VM, does more and costs less (that
is, it implements a "harder" VM and runs it faster). Both of these
systems use dynamic compilation, and both run much faster than
conventional interpreters. Franz's Oberon implementation performs
load-time native-code (n-code) generation with modest overhead and reasonable code
quality. The ST-80 and SELF systems also allow you to "freeze" program
images for later use, provide heterogeneous process migration, etc.,
again, features that are unavailable under the conventional C model.
And, lest we forget, shell scripts, perl programs, Word Macros (even
viruses :^), PostScript(tm), the VM for System/38 and AS/400, and so on
are all examples of successful virtual machine (VM) codes.
Thus, the UNCOL lesson is that you *can* use a VM, but the performance
won't be as good as with direct n-code generation in which you can
optimize across both the source level constructs and the machine-level
constructs. One interesting question, then, is what kind of
performance hit is required in order to get various features? Another
is how much performance are you willing to give up? For example, I
have lots of C programs that I'd gladly run at "only" 90% the speed if,
in return, I got portability. On the other hand, there are some
applications where I really do care about the last 10%. Like arguments
might be made for robustness, etc., namely, that dramatically better
robustness of many applications is possible if you're willing to give
up some performance.
Another difficult VM problem is ensuring that applications will work
correctly when they are started on a new (untried) system. For
example, a vendor might test their programs on SPARC/Solaris and
x86/DOS but still be hesitant to promise, without testing them, that
their products will work on next year's hot new architecture or
operating system. One solution to that problem is to define a VM very
precisely, so that it will behave the same on all machines (same data
type sizes and rounding/overflow characteristics, etc.), but the
greater degree of specification makes it harder to map the VM
efficiently onto many systems. This is the approach taken by Colusa's
Omniware. (I think, BTW, we're in much better shape than we were 25
years ago, since most of today's machines use 2's complement arithmetic
and 8-bit bytes, and memory is relatively cheap.) Another solution is
"crippleware", which uses the VM format but only runs on "approved"
systems.
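
As a rough illustration of what "define a VM very precisely" means for
arithmetic, here is a minimal sketch in Python. It is not taken from
Omniware or any other real VM; the names wrap_i32, vm_add_i32 and
vm_div_i32 are hypothetical. The point is only that overflow, rounding
direction and the divide-by-zero case are all pinned down, so every host
gives the same answer.

```python
MASK32 = 0xFFFFFFFF  # assumed VM word size: 32 bits, two's complement


def wrap_i32(value: int) -> int:
    """Reduce any Python int to a signed 32-bit two's-complement value."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value


def vm_add_i32(a: int, b: int) -> int:
    """Addition is specified to wrap on overflow, never to trap."""
    return wrap_i32(a + b)


def vm_div_i32(a: int, b: int) -> int:
    """Division is specified to truncate toward zero on every host."""
    if b == 0:
        # The (hypothetical) spec makes this a defined trap, not
        # host-dependent behavior.
        raise ZeroDivisionError("vm trap: divide by zero")
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    return wrap_i32(quotient)
```

The efficiency cost shows up on any host whose native word size or
division rounding differs from the specification: there, every
operation needs the extra masking or sign fix-up shown above.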
To summarize: efficiency and robustness are both major considerations.
UNCOL says that you can't get both portability and high efficiency;
experience says that you can get both portability and modest
efficiency. However, "the world" is still debating the cost of array
range checks instead of using them by default and turning them off
where profiles show they're a performance problem. I suspect that
this, rather than pure technological difficulty, is the reason why ANDF
is suffering. Likewise, robustness across "untested" platforms is a
serious problem, both from a technological standpoint and from the
standpoint of product reputation.
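
The "range checks on by default, off where the profile says so" policy
mentioned above can be sketched as follows. This is a toy model in
Python, not any real system's mechanism; RangeCheckPolicy, the site
labels and the threshold are all hypothetical names chosen for the
example.

```python
from collections import Counter


class RangeCheckPolicy:
    """Range checks are on everywhere until a profile justifies removal."""

    def __init__(self):
        self.profile = Counter()  # checks executed, per access site
        self.disabled = set()     # sites where checking was turned off

    def load(self, array, index, site):
        if site not in self.disabled:
            self.profile[site] += 1
            if not 0 <= index < len(array):
                raise IndexError(f"{site}: index {index} out of range")
        # In this Python model the host list still guards the access;
        # in a compiled system the disabled path would be a raw load.
        return array[index]

    def disable_hot_sites(self, threshold):
        """Turn checks off only at sites that executed >= threshold checks."""
        hot = {site for site, n in self.profile.items() if n >= threshold}
        self.disabled |= hot
        return hot
```

A run would exercise the program with checks enabled, call
disable_hot_sites() with some threshold, and keep checks everywhere
else, so rarely executed code stays protected at essentially no cost.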
Some references:
%A William Berg
%A Marshall Cline
%A Mike Girou
%T Lessons Learned from the OS/400 OO Project
%J Communications of the ACM
%V 38
%N 10
%D October 1995
%P 54-64
%A Harvey Bratman
%T An Alternate Form of the ``UNCOL Diagram''
%J Communications of the ACM
%V 4
%N 3
%D March 1961
%P 142
%A M. E. Conway
%T A Proposal For An UNCOL
%J Communications of the ACM (CACM)
%V 1
%N 10
%D October 1958
%P 5-8
%A Craig Chambers
%T The Design and Implementation of the SELF Compiler,
an Optimizing Compiler for Object-Oriented Programming Languages
%R Ph.D. dissertation
%I Stanford University, Department of Computer Science
%D March 1992
%A S. H. Dahlby
%A G. G. Henry
%A D. N. Reynolds
%A P. T. Taylor
%T The IBM System/38: A High-Level Machine
%J IBM System/38: Technical Developments
%P 47-50
%D 1978
%O Reprinted in [Siewiorek:82]
%A Peter Deutsch
%A Alan M. Schiffman
%T Efficient Implementation of the Smalltalk-80 System
%J 11th Annual Symposium on Principles of Programming Languages
(POPL-11)
%D January 1984
%P 297-302
%A Michael Franz
%T Technological Steps toward a Software Component Industry
%E Jurg Gutknecht
%B Programming Languages and System Architectures
Springer Lecture Notes in Computer Science No. 782
%P 259-281
%D 1994
%A M. I. Halpern
%T Machine Independence
%J Communications of the ACM
%D 1965
%N 8
%P 782
%A Urs Hölzle
%A Craig Chambers
%A David Ungar
%T Optimizing Dynamically-Typed Object-Oriented Languages With
Polymorphic Inline Caches
%R Proceedings of the European Conference on Object-Oriented
Programming (ECOOP)
%E P. America
%I Springer-Verlag
%C Geneva, Switzerland
%P 21-38
%K olit self ooplas ecoop91
%D July 1991
%A Urs Hölzle
%D 1994
%T Adaptive Optimization for SELF: Reconciling High Performance with
Exploratory Programming
%I Stanford University, Computer Science Department
%K cache
%R Ph. D. Thesis
%E Daniel P. Siewiorek
%E C. Gordon Bell
%E Allen Newell
%B Computer Structures: Readings and Examples
%D 1982
%I McGraw-Hill, Incorporated
%A Charles M. Shub
%T Native Code Process-Originated Migration in a Heterogeneous
Environment
%J Proceedings of the 1990 Computer Science Conference
%D February 1990
%P 266-270
%A J. Strong
%A J. Wegstein
%A A. Tritter
%A J. Olsztyn
%A O. Mock
%A T. Steel
%T The Problem of Programming Communication with Changing Machines; A
Proposed Solution
%J Communications of the ACM (CACM)
%V 1
%N 8, 9
%P 12-18, 9-15
%D 1958
%A Marvin Theimer
%A Barry Hayes
%T Heterogeneous Process Migration by Recompilation
%D March 1992
%R CSL-92-3
%I Xerox PARC
%C Palo Alto, California
%Q Colusa Software
%T Omniware: A Universal Substrate for Mobile Code
%R Colusa Software White Paper, available as of May 1996 via WWW URL
http://www.cs.washington.edu/homes/pardo/rtcg.d/papers.d/colusa-omniware.ps.gz
%D 1995
%P 13
;-D on ( Virtuous Machines ) Pardo
[From what I've seen, UNCOL-ish systems work OK if you have a single source
language, e.g. Smalltalk, or a single target, e.g. multiple compilers sharing
a back end. It's when you try NxM that you get heat death. Re Java, don't
forget that Java promises that you can statically verify that a Java bytecode
module won't do any pointer nasties, something that C and C++ certainly
don't. Static verification lets the interpreter run faster since it needn't
do those checks at runtime. -John]
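
The idea in the note above, checking a bytecode module once before it
runs so the interpreter can skip per-instruction checks, can be sketched
for a toy stack machine. This is far simpler than a real bytecode
verifier (no types, no branches), and the opcode set and the names
verify and run_verified are assumptions made up for the example.

```python
PUSH, ADD, MUL, POP = "push", "add", "mul", "pop"

# opcode -> (values popped, values pushed)
STACK_EFFECT = {PUSH: (0, 1), ADD: (2, 1), MUL: (2, 1), POP: (1, 0)}


def verify(program):
    """Check the whole program once; return its maximum stack depth."""
    depth = max_depth = 0
    for pc, (op, *args) in enumerate(program):
        if op not in STACK_EFFECT:
            raise ValueError(f"pc {pc}: unknown opcode {op!r}")
        if len(args) != (1 if op == PUSH else 0):
            raise ValueError(f"pc {pc}: wrong operand count for {op}")
        pops, pushes = STACK_EFFECT[op]
        if depth < pops:
            raise ValueError(f"pc {pc}: stack underflow")
        depth += pushes - pops
        max_depth = max(max_depth, depth)
    return max_depth


def run_verified(program):
    """Interpret a program that already passed verify(): no checks here."""
    stack = []
    for op, *args in program:
        if op == PUSH:
            stack.append(args[0])
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:  # POP, the only remaining opcode after verification
            stack.pop()
    return stack
```

For example, the program [(PUSH, 2), (PUSH, 3), (ADD,), (PUSH, 4),
(MUL,)] passes verify() and can then be run without any underflow
tests in the inner loop, while [(PUSH, 1), (ADD,)] is rejected before
it executes a single instruction.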