Re: optimizing compiler against average assembly programmer.


From: Jerry Leichter <leichter@smarts.com>
Newsgroups: comp.compilers,comp.lang.asm.x86
Date: 9 Jul 1997 23:18:04 -0400
Organization: System Management ARTS
References: 97-06-071 97-06-082 97-06-110 97-06-140 97-07-023
Keywords: optimize, architecture

| Here's what's wrong with the superoptimizer:
|
| (1) As the title above suggests, the SO produces the *shortest*
| possible program. This is not always the same thing as the
| *fastest* possible program.


The superoptimizer performs a fairly dumb exhaustive search over all
possible code sequences and chooses the one that is "best" according to
some measure. The published example uses "fewest bytes" to define
"best". If, given a code sequence, you can determine how long it takes
to execute, you can trivially modify the superoptimizer to find the code
sequence that runs fastest instead.
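
To make the structure of that search concrete, here is a minimal sketch
in Python. The four-op accumulator machine is invented for this
illustration; the real superoptimizer enumerated actual machine
instructions and verified candidates against the target function.

    from itertools import product

    # Hypothetical one-accumulator instruction set, invented for this
    # sketch; each op maps the accumulator to a new value.
    OPS = {
        "neg": lambda a: -a,
        "inc": lambda a: a + 1,
        "dec": lambda a: a - 1,
        "dbl": lambda a: a * 2,
    }

    def run(seq, x):
        for op in seq:
            x = OPS[op](x)
        return x

    def superoptimize(spec, max_len, inputs=range(-8, 9)):
        # Enumerate sequences shortest-first; the first correct one is
        # therefore the shortest ("best" = fewest instructions here).
        # "Correct" means agreeing with spec on all test inputs.
        for length in range(max_len + 1):
            for seq in product(OPS, repeat=length):
                if all(run(seq, x) == spec(x) for x in inputs):
                    return seq
        return None

    # Example: the shortest sequence computing f(x) = -2x - 1.
    print(superoptimize(lambda x: -2 * x - 1, max_len=4))
    # -> ('neg', 'dbl', 'dec')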


For most widely-used machines, execution models that will give you exact
timing measurements are available. For typical RISC machines, the
models depend more on cache interactions than on details of the
instructions used; for CISC machines, instruction timing can vary widely. In
either case, even experienced assembly language programmers are forced
to use heuristics - the interactions are just too complex. (In some
real-time applications, where *exact* timing information is needed, the
only alternative is to stick to simple machines and simple memory
architectures. The result is that the average performance is much less
than could be attained by more aggressive design - but the average case
is also the worst case, and it's the worst case that's the concern.)
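
In the toy sketch above, switching from "shortest" to "fastest" is just
a different cost function over the same enumeration. The per-op cycle
counts below are invented for illustration; a real timing model would
also have to account for the interactions described above.

    # Invented per-op "cycle" costs - a stand-in for a real timing model.
    # (Reuses OPS, run, and product from the sketch above.)
    CYCLES = {"neg": 1, "inc": 1, "dec": 1, "dbl": 2}

    def superoptimize_by(spec, max_len, cost, inputs=range(-8, 9)):
        # Same dumb enumeration, but keep the cheapest correct sequence
        # under an arbitrary cost measure, not merely the shortest one.
        best = None
        for length in range(max_len + 1):
            for seq in product(OPS, repeat=length):
                if all(run(seq, x) == spec(x) for x in inputs):
                    if best is None or cost(seq) < cost(best):
                        best = seq
        return best

    fastest = superoptimize_by(lambda x: -2 * x - 1, max_len=4,
                               cost=lambda s: sum(CYCLES[op] for op in s))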


If, for some odd reason, you wanted the program that had the fewest
possible number of 1 bits in its machine-code representation, the
superoptimizer could find that for you, too.
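
In the toy sketch, that too is nothing more than another cost function.
The one-byte encodings are made up for the sake of the example.

    # Hypothetical one-byte encodings, invented for illustration.
    ENCODING = {"neg": 0x1F, "inc": 0x40, "dec": 0x48, "dbl": 0xE0}

    def cost_ones(seq):
        # Total number of 1 bits across the encoded instruction bytes.
        return sum(bin(ENCODING[op]).count("1") for op in seq)

    # e.g. superoptimize_by(lambda x: -2 * x - 1, max_len=4, cost=cost_ones)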


| (2) Optimization is provably NP-Complete and that's how the SO runs.
| In the example above, the sample function they computed required
| 40 hours to compile to three machine instructions.... True
| optimization is an intractable problem.


It's intractable for humans, too. So what? In few cases do we really
care to find the *best possible* program; we typically just want to get
some large improvements. Computers doing dumb searches can do that.
Computers using more clever searches can do better. (As I recall, the
superoptimizer didn't really try to be clever.)


In any case, NP-completeness is a statement about asymptotics. For a
fixed size input and any reasonable measure of "best program", the
computation tree is bounded. It's large, but so what?
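
A back-of-envelope illustration, with assumed numbers (no real
instruction set has exactly these): with N candidate instructions and
sequences of length at most L, the tree has at most N + N^2 + ... + N^L
nodes.

    # Size bound on the search tree; N and L are assumed for illustration.
    N, L = 1000, 3
    print(sum(N**k for k in range(1, L + 1)))  # 1001001000: huge, but finite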


Computers using moderately intelligent brute force play chess (at the
moment) better than any human being. Do they play "optimal" chess?
Highly unlikely - they certainly aren't *trying*.


If the goal, as in the superoptimizer, is to find the optimal solution,
then, yes, the costs are impractically high for all but small examples.
Then again, humans would have trouble consistently finding the optimal
solution even for small examples!


If the goal is to find *very good* solutions for moderately small pieces
of code (where a "solution" is defined strictly in terms of inputs and
outputs, and "moderately small" these days is probably "a few tens of
instructions in the obvious implementation"), then it would be quite
straightforward to write programs that beat all humans most of the time,
and almost all humans all of the time. The reason such programs aren't
used is economic and psychological: People are usually unwilling to run
compilers for tens of hours, but they are willing to pay programmers to
work for weeks on similar pieces of code. One of these days, that will
change. (People don't do design-rule checking for IC's any more - they
let machines run for hours, days, even weeks doing that. It's not much
different.)


| ...BTW, shortly after the article appeared (discussing the "signum"
| function, as I recall), I rewrote the signum function in a different
| fashion, and it only took me about 18 hours to come up with my solution
| (which was just as good as, but obviously no better than, the
| superoptimizer's).


Since the article was written in 1987, readily-available hardware has
gotten at least 100 times faster. The same program would now run in
about 25 minutes. Has your brain sped up to match? :-)
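
(For readers who don't recall the example: signum(x) is -1, 0, or 1 as
x is negative, zero, or positive. Below is a sketch of a common
branch-free formulation in Python - the well-known comparison trick,
not necessarily the sequence either the 1987 article or its readers
came up with.)

    def signum(x):
        # Comparisons yield 0 or 1, so the difference is -1, 0, or 1.
        # (A common branch-free trick; the article's machine-code
        # sequence differed in detail.)
        return (x > 0) - (x < 0)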


-- Jerry

