Superoptimizer (was: optimizing compiler against assembly programmer.)


          From comp.compilers

Related articles
optimizing compiler against average assembly programmer. gclind01@starbase.spd.louisville.edu (1997-06-15)
Re: optimizing compiler against average assembly programmer. gclind01@starbase.spd.louisville.edu (1997-06-30)
Re: optimizing compiler against average assembly programmer. debray@cs.arizona.edu (1997-06-30)
Re: optimizing compiler against average assembly programmer. rhyde@cs.ucr.edu (1997-07-04)
Superoptimizer (was: optimizing compiler against assembly programmer.) pardo@cs.washington.edu (1997-07-13)
Re: Superoptimizer (was: optimizing compiler against assembly programm qua@microunity.com (1997-07-16)

From: pardo@cs.washington.edu (David Keppel)
Newsgroups: comp.compilers,comp.lang.asm.x86
Date: 13 Jul 1997 11:43:40 -0400
Organization: Computer Science & Engineering, U of Washington, Seattle
References: 97-06-071 97-06-110 97-06-140 97-07-023
Keywords: performance, bibliography

Randy Hyde <rhyde@cs.ucr.edu> wrote:
>[Superoptimizers aren't practical because
> (a) it's shortest not fastest
> (b) it's NP-complete
> (c) they use a small subset of machine instructions.]


Massalin's SO used "shortest", but other metrics will do if you can
constrain them not to produce longer and longer trials. Henry Baker
suggested using actual running time [Baker92].


Massalin's SO also used pruning heuristics. Thus, while the
worst-case complexity is "bad", the observed complexity is much
better. For example, a sequence is pruned if it contains a
subsequence that has already been shown to have a shorter equivalent
(no, I don't know how the matching is done). I don't know what other
pruning techniques are possible (note the original poster mentioned
this in the context of chess computers, which tend to use pruning).
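To make the flavor of that pruned search concrete, here's a toy sketch
in Python (a hypothetical four-instruction accumulator machine, nothing
like Massalin's actual 68k setup, which generated and executed real
machine code): breadth-first enumeration by length, where any candidate
whose behavior on the probe inputs is already achieved by an
equally-short-or-shorter sequence gets pruned, since substituting the
shorter equivalent would always win.

```python
# Toy superoptimizer sketch: hypothetical 4-op accumulator machine,
# breadth-first search with shorter-equivalent pruning.
OPS = {
    "neg": lambda x: -x,        # negate accumulator
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "dbl": lambda x: x * 2,
}
TESTS = [-7, -1, 0, 1, 2, 13]   # probe inputs (the probabilistic check)

def run(seq, x):
    for op in seq:
        x = OPS[op](x)
    return x

def signature(seq):
    return tuple(run(seq, x) for x in TESTS)

def search(target, max_len=4):
    """Shortest op sequence matching `target` on the probe inputs."""
    want = tuple(target(x) for x in TESTS)
    frontier = [()]
    seen = {signature(())}       # behaviors already reachable
    for _ in range(max_len):
        extended = []
        for seq in frontier:
            for op in OPS:
                cand = seq + (op,)
                sig = signature(cand)
                if sig == want:
                    return cand
                if sig in seen:  # prune: a sequence this short or
                    continue     # shorter already computes this
                seen.add(sig)
                extended.append(cand)
        frontier = extended
    return None

# e.g. synthesize bitwise NOT (~x == -x - 1) from the ops above:
# search(lambda x: ~x) -> ("neg", "dec")
```

Note the pruning set `seen` is keyed on probe-input behavior only, so it
is itself probabilistic, just like the article's equivalence filter.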


Massalin's SO also optionally took the input sequence and first tried
to shrink "parts" of it, so that by the time you searched for a fully
equivalent whole sequence you were starting from a shorter one.
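The shrink pass can be sketched the same way (again a hypothetical toy
ISA in Python, not Massalin's actual code): slide a window over the
input sequence and swap in any strictly shorter sequence that matches
the window's behavior on the probe inputs.

```python
# Sketch of the "shrink parts first" pass over a toy accumulator ISA.
from itertools import product

OPS = {"neg": lambda x: -x, "inc": lambda x: x + 1,
       "dec": lambda x: x - 1, "dbl": lambda x: x * 2}
TESTS = [-7, -1, 0, 1, 2, 13]

def run(seq, x):
    for op in seq:
        x = OPS[op](x)
    return x

def sig(seq):
    return tuple(run(seq, x) for x in TESTS)

def shrink_once(seq):
    """Replace the first window a strictly shorter sequence matches."""
    for width in range(len(seq), 1, -1):      # widest windows first
        for i in range(len(seq) - width + 1):
            window = seq[i:i + width]
            for n in range(width):            # strictly shorter trials
                for rep in product(OPS, repeat=n):
                    if sig(rep) == sig(window):
                        return seq[:i] + rep + seq[i + width:]
    return seq

# e.g. shrink_once(("inc", "inc", "neg", "neg")) -> ("inc", "inc")
```

Repeating this to a fixed point gives the full search a shorter
starting sequence, which bounds the lengths it has to enumerate.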


The basic probabilistic testing mechanism involves generating a
sequence as machine code and then testing it by executing it. Henry
ran SO on (at least) the Motorola 68020, Western Electric WE32000, and
Intel 80386. It ran about 50K trials/sec on a 20MHz(?) 68K. If a
sequence passed this test (running it on a few inputs), it then needed
to be checked for full equivalence, but everything that passed the
probabilistic tester was equivalent -- no false hits, so the cost of
full equivalence testing was negligible. Henry did it by hand and
even that wasn't the performance bottleneck!
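In Python pseudocode (with callables standing in for executing
generated machine code), the filter is just: run each candidate on a
handful of inputs and keep the survivors; only those few ever need the
expensive full-equivalence check. The candidate pool below is a made-up
example, not anything from Massalin's runs.

```python
# Sketch of the probabilistic filter. Candidates are Python callables
# here, standing in for generated machine-code sequences.
import random

def probabilistic_filter(target, candidates, trials=8, seed=1):
    rng = random.Random(seed)
    # a few fixed corner cases, plus random probes
    inputs = [0, 1, -1, 2**31 - 1, -2**31]
    inputs += [rng.randrange(-2**31, 2**31) for _ in range(trials)]
    return [f for f in candidates
            if all(f(x) == target(x) for x in inputs)]

# hypothetical candidate pool for absolute value
candidates = [
    lambda x: x,                            # wrong: fails on negatives
    lambda x: -x,                           # wrong: fails on positives
    lambda x: (x ^ (x >> 31)) - (x >> 31),  # branchless abs idiom
    lambda x: max(x, -x),
]
survivors = probabilistic_filter(abs, candidates)  # the two correct ones
```

With even a few well-chosen probe inputs, almost every wrong candidate
dies here, which is why the (manual) full check was never the bottleneck.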


Massalin's SO used a narrow subset of the 68k but Granlund & Kenner's
SO [GrKe92] used conditional branches and most other instructions
except indirect branches.


Superoptimizers tend to rediscover known tricks but do so
automatically. They're not used widely but that may reflect their
mysterious nature and lack of automatic profile application as much as
it reflects their poor performance -- either long running time or
inability to perform useful transformations.


;-D on ( Soup Or Optimizer? ) Pardo


%A Henry G. Baker
%T Precise Instruction Scheduling without a Precise Machine Model
%J Computer Architecture News
%V 19
%N 6
%D December 1992


%A Torbj\:orn Granlund
%A Richard Kenner
%T Eliminating Branches using a Superoptimizer and the GNU C Compiler
%D June 1992
%J Proceedings of the ACM SIGPLAN Conference on Programming Language
Design and Implementation
%P 341-352


%A Henry Massalin
%T Superoptimizer: A Look at the Smallest Program
%J Proceedings of the Second International Symposium on Architectural
Support for Programming Languages and Operating Systems (ASPLOS II)
%D 5-8 October 1987
%P 122-127



