|Run time optimizations firstname.lastname@example.org (Sanjay Jinturkar) (1993-04-20)|
|Re: Run time optimizations email@example.com (1993-04-22)|
|Re: Run time optimizations firstname.lastname@example.org (1993-04-23)|
|Re: Run time optimizations email@example.com (1993-04-23)|
|Re: Run time optimizations firstname.lastname@example.org (1993-04-24)|
|Re: Run time optimizations email@example.com (1993-04-28)|
|From:||firstname.lastname@example.org (Jeremy Fitzhardinge)|
|Organization:||Softway Pty Ltd|
|Date:||Wed, 28 Apr 1993 11:23:27 GMT|
email@example.com (Paul Haahr) writes:
>[runtime code gen for bitblt was a win on the 68020, no better than
>interpreted on 68040, probably due to cache effects]
>Anyway, I don't want to disparage the approach of run-time code
>generation, but do want to remind people that as hardware changes,
>engineering trade-offs change.
Good points. However, it does depend on what you are generating code for.
I spent quite a lot of time playing with Byron Rakitzis' pico
implementation, in particular on an optimiser pass.
Pico was originally written at Bell Labs. It took as input a C-like
language that describes a set of transformations to be performed on an
image. Transformations can involve simple arithmetic, logical ops, polar
or rectangular coords, trig operations, conditionals, etc. It compiled
the user input into native machine code and ran it. The Bell
implementation generated Vax and WD32000 code I think; Byron's more
limited implementation generates Sparc and Mips code.
Byron's original code was a literal translation into assembly with no
attempt at optimisation. I added strength reduction, constant folding,
loop-invariant code motion and simple peephole optimisation. After I'd
finished, there was no way an interpretive version was going to get within
an order of magnitude of the compiled code on a SPARCstation 1. There
were the same sorts of tradeoffs as in your case, but because there were at
least 512x512 operations (for an image of that size) there was a high payoff
for reducing loop overhead, as well as operation time per pixel.
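
To make two of those passes concrete, here is a minimal sketch of constant folding and strength reduction over a tiny expression tree. It is not pico's code: the tuple representation, the OPS table and the names fold, reduce_strength and evaluate are all assumptions made up for illustration, and the multiply-to-shift rewrite assumes integer operands. Loop-invariant code motion and the peephole pass are not shown.

```python
import operator

# Hypothetical expression tree, NOT pico's real representation:
#   ("num", 3), ("var", "x"), or (op, left, right) with op a key of OPS.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "<<": operator.lshift}

def fold(node):
    """Constant folding: evaluate subtrees whose operands are both literals."""
    if node[0] in ("num", "var"):
        return node
    op, left, right = node[0], fold(node[1]), fold(node[2])
    if left[0] == "num" and right[0] == "num":
        return ("num", OPS[op](left[1], right[1]))
    return (op, left, right)

def reduce_strength(node):
    """Strength reduction: rewrite e * 2**k as e << k (integers assumed)."""
    if node[0] in ("num", "var"):
        return node
    op, left, right = node[0], reduce_strength(node[1]), reduce_strength(node[2])
    if op == "*" and right[0] == "num":
        n = right[1]
        if isinstance(n, int) and n > 0 and (n & (n - 1)) == 0:
            return ("<<", left, ("num", n.bit_length() - 1))
    return (op, left, right)

def evaluate(node, env):
    """Reference evaluator, used only to check the rewrites preserve meaning."""
    if node[0] == "num":
        return node[1]
    if node[0] == "var":
        return env[node[1]]
    return OPS[node[0]](evaluate(node[1], env), evaluate(node[2], env))

# x * (4 * 2) + (10 - 3)
expr = ("+",
        ("*", ("var", "x"), ("*", ("num", 4), ("num", 2))),
        ("-", ("num", 10), ("num", 3)))
optimised = reduce_strength(fold(expr))
```

Here optimised comes out as (x << 3) + 7. Each rewrite saves only a little work per evaluation, but that saving is multiplied by every pixel in the image.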
For blit-type operations, there are relatively few variations, so it can
pay just to have one function per operation, each of which encompasses all
the loops. Pico, by its nature, can't do that, so either you interpret
the user input or compile it. The runtime compiler code is not as good
as, say, gcc, but it generates quickly and its results are always going to
be faster than a gcc-compiled interpreter. (Just to blur the distinction,
I added a "portable option" that would generate C source as output, and
pass it to gcc, then map the resulting .o into the address space and run
it. This would have worked modulo bugs in Sun's runtime linking
In conclusion, runtime code generation pays off when there are too many
possibilities at runtime to encode into the compiled source.
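
As a rough illustration of the choice between interpreting the user input and compiling it, the sketch below does both for one user-supplied expression. Everything in it is an assumption for illustration (the tuple tree, and the names interpret, to_source and compile_transform); it emits Python source and hands it to Python's built-in compile and exec, which is closer in spirit to the "portable option" above than to generating native code.

```python
import operator

# Hypothetical user "program" as a tuple tree, NOT pico's real input format:
#   ("num", n), ("var", "x" or "y"), or (op, left, right).
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "^": operator.xor}

def interpret(node, x, y):
    """Walk the tree for one pixel; dispatch cost is paid again per pixel."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "var":
        return x if node[1] == "x" else y
    return OPS[kind](interpret(node[1], x, y), interpret(node[2], x, y))

def to_source(node):
    """Translate the tree into a fully parenthesised source expression."""
    kind = node[0]
    if kind == "num":
        return repr(int(node[1]))
    if kind == "var":
        if node[1] not in ("x", "y"):
            raise ValueError("unknown variable: %r" % (node[1],))
        return node[1]
    if kind not in OPS:
        raise ValueError("unknown operator: %r" % (kind,))
    return f"({to_source(node[1])} {kind} {to_source(node[2])})"

def compile_transform(node):
    """Generate source once, with both pixel loops inside, and compile it."""
    body = to_source(node)
    src = (
        "def transform(width, height):\n"
        f"    return [[{body} for x in range(width)] for y in range(height)]\n"
    )
    namespace = {}
    exec(compile(src, "<generated>", "exec"), namespace)
    return namespace["transform"]

program = ("^", ("var", "x"), ("*", ("var", "y"), ("num", 2)))
transform = compile_transform(program)
image = transform(4, 3)
interpreted = [[interpret(program, x, y) for x in range(4)] for y in range(3)]
```

Both paths produce the same image. The difference is that the generated function decodes the expression once, while the interpreter walks the tree again for every pixel.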