Newsgroups: comp.compilers
From: pardo@june.cs.washington.edu (David Keppel)
Keywords: interpreter, threaded code
Organization: Computer Science & Engineering, U. of Washington, Seattle
References: <3035@redstar.cs.qmw.ac.uk> <1991Mar31.180635.5944@cs.rochester.edu> <1991Apr22.003548.14803@iecc.cambridge.ma.us>
Date: Tue, 23 Apr 91 02:06:21 GMT
ssinghani@viewlogic.com (Sunder Singhani) writes:
>[Our threaded code isn't fast enough. What's faster?]
As far as I know, threaded code gives the fastest primitives-per-second
dispatch rate on a variety of architectures. The general techniques for
making things faster (that I know of!) are to reduce the dispatch rate
without changing the work that gets done (or to use hardware, but we'll
ignore that for the moment).
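For concreteness, here is a minimal sketch of direct-threaded dispatch.
It assumes GCC's ``labels as values'' extension, and the v-instruction
set (push1/add/halt) is just illustrative:

    /* Direct-threaded dispatch sketch using GCC's "labels as values"
       extension.  Each v-instruction is the address of its handler, so
       NEXT jumps straight to the next handler; there is no central
       fetch/decode switch. */
    #include <stdio.h>

    int main(void)
    {
        static void *program[] = { &&op_push1, &&op_push1, &&op_add, &&op_halt };
        void **ip = program;            /* virtual instruction pointer */
        int stack[16], *sp = stack;     /* tiny operand stack */

    #define NEXT goto **ip++
        NEXT;

    op_push1: *sp++ = 1;            NEXT;
    op_add:   sp--; sp[-1] += *sp;  NEXT;
    op_halt:  printf("%d\n", sp[-1]); return 0;   /* prints 2 */
    }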
* Use a different v-machine instruction set
The overhead of interpreting is almost nothing in generic PostScript
imaging code because all the time is spent in non-interpreted
primitives. If you can characterize your key operations (perhaps
info in [Davidson & Fraser ??, Fraser, Myers & Wendt 84] can help
you analyze for common operation sequences instead of the more usual
time spent in routines), then you can re-code your virtual instruction
set so that frequently-performed operations become primitives.
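As a made-up illustration (the opcodes are hypothetical, not taken from
any of the systems cited below), folding a frequent two-instruction
sequence into a single ``superinstruction'' halves the dispatch count
for that sequence:

    /* Suppose profiling shows the v-code sequence PUSHC n; ADD
       dominates.  An ADDC superinstruction does the same work with
       one dispatch instead of two. */
    enum { OP_PUSHC, OP_ADD, OP_ADDC, OP_HALT };

    int run(const int *code)
    {
        int stack[16], *sp = stack;
        for (;;) {
            switch (*code++) {
            case OP_PUSHC: *sp++ = *code++;       break;
            case OP_ADD:   sp--; sp[-1] += *sp;   break;
            case OP_ADDC:  sp[-1] += *code++;     break;  /* fused PUSHC+ADD */
            case OP_HALT:  return sp[-1];
            }
        }
    }

    /* e.g. the program { OP_PUSHC, 2, OP_ADDC, 3, OP_HALT } returns 5 */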
* Dynamic compilation to native machine code
This is what is done in ParcPlace Systems' Smalltalk-80
implementation [Deutsch & Schiffman 84] and in Insignia Solutions'
8086 interpreter.
Dynamic compilation suffers from the need to do compilation at
runtime: a compiler that produces better code will take longer to
run and the compile time contributes to the overall program runtime.
Also, program text isn't shared, even if multiple instances are
running simultaneously.
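The control structure is roughly ``translate on first use, then reuse
the cached native code''. A sketch of that cache follows; emit_native()
stands in for the real, machine-specific code generator, and all the
names are mine:

    /* Translate-on-first-use cache sketch.  A v-code method is
       translated to native code the first time it runs; later calls
       reuse the cached native entry point. */
    #include <stdint.h>

    typedef int (*native_fn)(void);

    struct tcache_entry {
        const unsigned char *vcode;   /* key: address of the v-code    */
        native_fn            ncode;   /* value: translated native code */
    };

    #define TCACHE_SIZE 1024
    static struct tcache_entry tcache[TCACHE_SIZE];

    native_fn emit_native(const unsigned char *vcode);  /* hypothetical */

    int execute(const unsigned char *vcode)
    {
        uintptr_t i = ((uintptr_t)vcode >> 4) % TCACHE_SIZE;
        if (tcache[i].vcode != vcode) {       /* miss: pay compile cost now */
            tcache[i].vcode = vcode;
            tcache[i].ncode = emit_native(vcode);
        }
        return tcache[i].ncode();             /* run cached native code */
    }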
* Native-coding key routines
If you believe that your program spends 80% of its time in a few key
routines, then compiling just those routines -- statically, by adding
them to the primitive set or as library routines, or dynamically --
can improve performance substantially [Pittman 87].
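One common shape for this (the names below are illustrative, not
Pittman's design) is a primitive table that the interpreter indexes, so
a hot routine compiled as ordinary C sits behind a single v-instruction:

    /* A natively coded key routine exposed through a primitive table.
       The v-machine reaches it with one CALLPRIM dispatch instead of
       interpreting its body. */
    #include <stdint.h>
    #include <string.h>

    typedef void (*prim_fn)(intptr_t **sp);

    /* Say profiling showed string comparison dominates: code it in C
       and make it primitive number 0. */
    static void prim_strcmp(intptr_t **sp)
    {
        const char *a = (const char *)(*sp)[-2];
        const char *b = (const char *)(*sp)[-1];
        (*sp)[-2] = strcmp(a, b);
        *sp -= 1;                   /* pop one operand, leave the result */
    }

    static prim_fn primitives[] = { prim_strcmp /* , ... */ };

    /* In the dispatch loop:
         case OP_CALLPRIM: primitives[*code++](&sp); break;
    */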
Five citations follow:
%A Robert Bedichek
%T Some Efficient Architecture Simulation Techniques
%J Winter '90 USENIX Conference
%D 26 October, 1989
%W Robert Bedichek.
%W Pardo has a copy.
%X Describes a simulator that uses threaded-code techniques to emulate
a Motorola 88000. Each 88k instruction is executed in about 20 host
(68020) instructions. Discusses techniques used to get the cost down
from the several thousand host instructions per simulated instruction
seen in many other simulators.
%A Jack W. Davidson
%A Chris W. Fraser
%T Eliminating Redundant Object Code
%J 9th Annual Symposium on Principles of Programming Languages
(POPL 9)
%P 128-132
%A Peter Deutsch
%A Alan M. Schiffman
%T Efficient Implementation of the Smalltalk-80 System
%J 11th Annual Symposium on Principles of Programming Languages
(POPL 11)
%D January 1984
%P 297-302
%X Dynamic translation of p-code to n-code (native code).
Reasons for not using straight p-code or straight n-code:
* p-code is smaller than n-code (<= 5X).
* The debugger can debug p-code, improving portability.
* Native code is faster (<= 2X). Reasons include
special fetch/decode/dispatch hardware;
p-machine and n-machine may be very different, e.g.,
stack machine vs. register-oriented.
* Threaded code does reduce the cost of p-code fetch/decode.
Does not help with operand decoding.
Does not allow optimizations to span more than one instruction.
[pardo: that's not technically true, but each optimized
instruction must exist in addition to the unoptimized version.
That leads to exponential blowup of the p-code. Example: delayed
branch and non-delayed branch versions of Robert Bedichek's 88k
simulator.]
The system characteristics:
* The time to translate to n-code via macro expansion is about the
same as the execute time to interpret the p-code.
* (pg 300:) Self-modifying code (SMC) is deprecated but used in a
``well-confined'' way. Could indirect at more cost. Use SMC on the
machines where it works, indirection where SMC doesn't.
* Performance is compared to a ``straightforward'' interpreter.
What's that?
%A Christopher W. Fraser
%A Eugene W. Myers
%A Alan L. Wendt
%T Analyzing and Compressing Assembly Code
%J SIGPLAN '84 Symposium on Compiler Construction (CCC 84)
%P 117-121
%A Thomas Pittman
%T Two-Level Hybrid Interpreter/Native Code Execution for Combined
Space-Time Program Efficiency
%D 1987
%J ACM SIGPLAN '87 Symposium on Interpreters and Interpretive
Techniques
%P 150-152
%X Talks about native code execution vs. various kinds of interpreting
and encoding of key routines in assembly.
Hope this helps!
;-D on ( This is all open to interpretation ) Pardo