Related articles
strange compiler optimizations qjackson@wave.home.com (Quinn Tyler Jackson) (1998-08-10)
Re: strange compiler optimizations mtimmerm@microstar.no-spam.com (Matt Timmermans) (1998-08-13)
Re: strange compiler optimizations clark@quarry.zk3.dec.com (Chris Clark USG) (1998-08-19)
From: Chris Clark USG <clark@quarry.zk3.dec.com>
Newsgroups: comp.compilers
Date: 19 Aug 1998 16:17:51 -0400
Organization: Digital Equipment Corporation - Marlboro, MA
References: 98-08-042 98-08-083
Keywords: optimize, storage, architecture
Quinn Tyler Jackson wrote...
>Why on earth did OPTIMIZE FOR SPEED do so poorly with PLPM and OPTIMIZE FOR
>SIZE do so well, where it behaved as expected for LPM?
To which, Matt Timmermans <mtimmerm@microstar.no-spam.com> replied:
> Most likely, optimizing for size allowed some important inner loop, and all
> the functions it calls, to remain in the cache while executing.
And, actually, this is a very important effect. On many modern
systems, there are numerous layers of caches and alignment concerns,
many of which can totally swamp all reasonable optimization efforts when
the code or data just happens to violate them. It is both the real
boon and the real bane of benchmarksmanship. I can recall several
illustrative examples; I'll highlight just a few.
First, when I was working on the "acc" (uopt) optimizer for DEC, we
had one SPEC benchmark that happened to be favorably aligned for our
compiler. That kept our performance competitive with the better-staffed
GEM optimizer. Anyway, that particular benchmark was
significant enough that we tiptoed very carefully when making
improvements to the optimizations which affected that particular test.
It was worthwhile crippling optimizations to keep the alignment.
Fortunately, we eventually added enough improvements that the
cumulative effect preserved the alignment and optimized the code
sequence at the same time.
Second, Kuck and Associates also exists partially because of the
alignment effect. One of their original tools tuned certain array
operations to block them naturally for different processors' caches. The
speedups were phenomenal for certain benchmarks and helped certain
vendors list their machines as faster performers. Of course, over
time, all the vendors supported KAP versions of their products and the
benchmark suite was adjusted to compensate.
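
To make the effect concrete, here is a minimal sketch, in C, of the
kind of cache blocking (loop tiling) applied to array operations; the
matrix size N, the block size B, and the function names are made-up
illustrations for this post, not anything taken from KAP itself:

    #include <stddef.h>

    #define N 1024      /* hypothetical matrix dimension */
    #define B 64        /* hypothetical block size, tuned to the cache */

    /* Naive version: walks whole rows and columns, so once the
       matrices exceed the cache, most accesses miss. */
    void matmul_naive(double C[N][N], double A[N][N], double B_mat[N][N])
    {
        for (size_t i = 0; i < N; i++)
            for (size_t j = 0; j < N; j++)
                for (size_t k = 0; k < N; k++)
                    C[i][j] += A[i][k] * B_mat[k][j];
    }

    /* Blocked version: the same arithmetic, restructured so each
       B x B tile is reused while it is still resident in the cache. */
    void matmul_blocked(double C[N][N], double A[N][N], double B_mat[N][N])
    {
        for (size_t ii = 0; ii < N; ii += B)
            for (size_t jj = 0; jj < N; jj += B)
                for (size_t kk = 0; kk < N; kk += B)
                    for (size_t i = ii; i < ii + B; i++)
                        for (size_t j = jj; j < jj + B; j++)
                            for (size_t k = kk; k < kk + B; k++)
                                C[i][j] += A[i][k] * B_mat[k][j];
    }

The blocked version performs exactly the same floating-point work; only
the order of the memory accesses changes, which is the kind of
restructuring that made the KAP numbers look so dramatic on
cache-sensitive benchmarks.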
Third, most vendors have benchmarking (aka performance and tuning)
groups that work with large customers. For tools like Parametric
Technology's Pro/ENGINEER or the Oracle database engine, it is worthwhile for the
vendors to devote engineering resources to figuring out how to twist
and tune the compiler and operating system (and even hardware system)
to get maximum performance (or best price/performance). Such projects
can run for weeks or months trying each possible configuration of
compiler switches.
The end result is that most benchmark numbers are highly suspect, and
any reputable benchmarker will recommend that you benchmark any given
system against your own representative data. The same rule of thumb
applies to tuning, and is partially why most tuning experts recommend
profiling the complete system (not just a prototype) before making
adjustments to your code. The optimization which causes a cache miss
(or collision) in one version of the system may not have the same
effect in a slightly modified system.
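
As a concrete illustration of how much the memory system can dominate
such measurements, a toy program along the following lines (the array
sizes and the use of clock() are my own assumptions, not drawn from any
of the systems discussed above) will usually show the same arithmetic
running several times slower when the traversal order fights the cache:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define ROWS 4096
    #define COLS 4096   /* hypothetical sizes, large enough to exceed cache */

    static double a[ROWS][COLS];

    int main(void)
    {
        clock_t t0, t1;
        double sum = 0.0;

        /* Fill the array so the compiler cannot prove the sums trivial. */
        for (size_t i = 0; i < ROWS; i++)
            for (size_t j = 0; j < COLS; j++)
                a[i][j] = (double)(i + j);

        /* Row-major traversal: consecutive accesses share cache lines. */
        t0 = clock();
        for (size_t i = 0; i < ROWS; i++)
            for (size_t j = 0; j < COLS; j++)
                sum += a[i][j];
        t1 = clock();
        printf("row order:    %.2f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

        /* Column-major traversal: each access touches a different cache
           line, so the identical work typically runs much slower. */
        t0 = clock();
        for (size_t j = 0; j < COLS; j++)
            for (size_t i = 0; i < ROWS; i++)
                sum += a[i][j];
        t1 = clock();
        printf("column order: %.2f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

        return sum > 0.0 ? 0 : 1; /* keep the loops from being discarded */
    }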
Good algorithms remain good algorithms, but the constants can vary
widely.
-Chris Clark
************************************************************************
Compiler Resources, Inc. email: compres@world.std.com
3 Proctor St. http://world.std.com/~compres
Hopkinton, MA 01748 phone: (508) 435-5016
USA 24hr fax: (508) 435-4847
--