Related articles:
[2 earlier articles]
Re: Optimizing Across && And || preston@tera.com (1995-02-18)
Re: Optimizing Across && And || bill@amber.ssd.csd.harris.com (1995-02-19)
Re: Optimizing Across && And || bart@cs.uoregon.edu (1995-02-21)
Re: Optimizing Across && And || whalley@fork.cs.fsu.edu (David Whalley) (1995-02-22)
Re: Optimizing Across && And || bart@cs.uoregon.edu (1995-02-23)
Re: Optimizing Across && And || bill@amber.ssd.csd.harris.com (1995-02-24)
Re: Optimizing Across && And || glew@ichips.intel.com (1995-02-27)
Re: Optimizing Across && And || bill@amber.ssd.csd.harris.com (1995-02-28)
Re: Optimizing Across && And || bill@amber.ssd.csd.harris.com (1995-02-28)
Re: Optimizing Across && And || ryg@summit.novell.com (1995-03-03)
Optimizer phase ordering assmann@informatik.uni-karlsruhe.de (1995-03-03)
Re: Optimizing Across && And || leichter@zodiac.rutgers.edu (1995-03-07)
Re: Optimizing Across && And || preston@tera.com (1995-03-08)
[6 later articles]
Newsgroups: comp.compilers
From: glew@ichips.intel.com (Andy Glew)
Keywords: C, optimize, design
Organization: Intel Corp., Hillsboro, Oregon
References: 95-02-110 95-02-184
Date: Mon, 27 Feb 1995 05:50:30 GMT
From: bill@amber.ssd.csd.harris.com (Bill Leonard)
> preston@tera.com (Preston Briggs) writes:
> > Writing optimizers is expensive and time consuming.
> > Everything I tested was fairly old (had to be, since they were
> > released products). How could I expect them to incorporate
> > recent research results?
> That last reason is the primary one for us. To say it a different way:
> The main difficulty with the published research is that it often
> requires a major redesign of one's optimizer (or entire back end).
> Unfortunately, this means that most research can only be applied if one
> is designing a new compiler or optimizer, which doesn't happen very
> often.
> Having once been responsible for (and still somewhat peripherally involved
> with) the optimizer of a production compiler, I can say that mostly it is
> this factor that keeps us from using published research results. We try
> very hard to make our C compiler perform as much optimization as possible,
> and we do read (and understand) published compiler papers.
> In fact, when we designed our compiler back end, we based it on the best
> published research we could find at the time. But of course now that is
> over 10 years old. Fortunately, it seems we made a good choice, because
> our compilers seem to do a very good job and are fairly easily enhanced and
> retargeted.
You sound like every other compiler writer I have ever dealt with!!!!
:-) (I'm a computer architect who is always trying to get the
compilers to be state of the art, to use the latest optimizations.)
Of course, you can try to design a backend framework that is
extensible. But it always seems to fall by the wayside.
One thing I have tried to encourage compiler folk to do is to make the
back-end recursive or reentrant - have every optimization read in and
write out the same intermediate form, along with arbitrarily
extensible (typed) annotations.
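
A rough C++ sketch of what that interface might look like - all the names
and types here are hypothetical, made up for illustration rather than
taken from any real compiler:

#include <any>
#include <map>
#include <memory>
#include <string>
#include <typeindex>
#include <utility>
#include <vector>

// Every pass reads and writes the same IR; any analysis can hang a typed
// annotation off a node without the IR definition knowing about it.
struct IRNode {
    std::string op;                                 // e.g. "add", "load", "branch"
    std::vector<std::shared_ptr<IRNode>> operands;
    std::map<std::type_index, std::any> notes;      // extensible typed annotations

    template <class A> void annotate(A a) { notes[typeid(A)] = std::move(a); }
    template <class A> A* annotation() {
        auto it = notes.find(typeid(A));
        return it == notes.end() ? nullptr : std::any_cast<A>(&it->second);
    }
};

// Every optimization has the same shape: IR in, IR out.
struct Pass {
    virtual ~Pass() = default;
    virtual std::shared_ptr<IRNode> run(std::shared_ptr<IRNode> ir) = 0;
};

The point of the typed annotation map is that a new analysis can attach
its results without every other pass having to be recompiled against a
changed IR definition.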
Ideally, the optimizations would not be invoked in a "phase order",
but would instead be invoked with a particular scope (subtree, trace,
hyperblock). Some optimizations might place constraints on the scopes
they accept - truly global optimizations might require full program
scope.
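
Continuing the same hypothetical sketch, the scope could be an explicit
argument to every pass, with each pass declaring the smallest scope it is
willing to work on:

#include <vector>

struct IRNode;   // the shared IR node type from the sketch above

// Passes are invoked over an explicit scope, not in a fixed phase order.
enum class Scope { Subtree, Trace, Hyperblock, Procedure, WholeProgram };

struct ScopedPass {
    virtual ~ScopedPass() = default;
    virtual Scope minimumScope() const = 0;   // e.g. WholeProgram for truly
                                              // global optimizations
    virtual void run(IRNode& root, Scope offered) = 0;
};

// The driver only hands a pass a region at least as large as it requires.
inline bool accepts(const ScopedPass& p, Scope offered) {
    return static_cast<int>(offered) >= static_cast<int>(p.minimumScope());
}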
Optimizations should not be done in place, but should be done in a
manner such that they can be easily backed out. Thus, a particular
optimization module suggests an optimization, but the framework may
evaluate several different optimizations, and only take one.
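
One hypothetical way to structure that in C++ - again, names invented for
illustration - is to have each pass return a proposal carrying both the
rewrite and its undo, and let the framework choose among the proposals:

#include <functional>
#include <optional>
#include <vector>

struct IRNode;   // the shared IR node type from the earlier sketch

// A pass does not rewrite the IR in place; it describes the rewrite it
// would like to make, plus an estimate of the benefit.
struct Proposal {
    double estimatedGain;                 // the pass's own benefit estimate
    std::function<void(IRNode&)> apply;   // performs the rewrite
    std::function<void(IRNode&)> undo;    // backs it out again later if needed
};

struct SuggestingPass {
    virtual ~SuggestingPass() = default;
    virtual std::optional<Proposal> suggest(IRNode& region) = 0;
};

// The framework, not the pass, decides which suggestion (if any) wins.
inline void pickOne(IRNode& region, const std::vector<SuggestingPass*>& passes) {
    std::optional<Proposal> best;
    for (auto* p : passes) {
        auto prop = p->suggest(region);
        if (prop && (!best || prop->estimatedGain > best->estimatedGain))
            best = std::move(prop);
    }
    if (best) best->apply(region);
}

Deciding centrally also means competing suggestions can all be evaluated
against the same cost model before any of them touches the IR.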
Optimizations should be continuation structured. At various places,
optimization modules may execute "call backs" to the framework, with
queries like "is this code frequently executed?" - the framework
should have the option of saving a continuation, and changing its mind
if, as a result of other decisions, the answer changes.
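
Full continuations are hard to sketch briefly, but the essential
bookkeeping - remembering which answers a decision depended on, so the
decision can be redone if an answer changes - might look something like
this (hypothetical names again):

#include <functional>
#include <string>
#include <vector>

struct Query {
    std::string what;     // e.g. "hot(block 12)"
    bool answerGiven;     // what the framework said at the time
};

struct Decision {
    std::vector<Query> assumptions;   // the answers this decision relied on
    std::function<void()> redo;       // re-run the pass if any of them change
};

struct Framework {
    std::vector<Decision> pending;

    // Callback a pass can make: "is this code frequently executed?"
    // The answer is recorded against the decision that asked for it.
    bool isHot(const std::string& block, Decision& d) {
        bool hot = true;   // stand-in: consult current profile/frequency estimates
        d.assumptions.push_back({"hot(" + block + ")", hot});
        return hot;
    }

    // Called when later decisions change the estimates: any decision whose
    // recorded answers no longer hold gets re-run.
    void revisit(const std::function<bool(const Query&)>& stillTrue) {
        for (auto& d : pending)
            for (auto& q : d.assumptions)
                if (!stillTrue(q)) { d.redo(); break; }
    }
};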
Optimizations should be coroutine oriented, over and above the
continuation technique described above. Coroutining is one approach
to solving the compile time constraint - a particular optimization may
be invoked, allowed to grind for a pre-determined time, and then
control passed to a second optimization module. But the first module
should be re-invokable, from where it left off, if it turns out
that there is slack time.
(So often we are in the mode "Do the best optimizations you can in
time XXXX", rather than "Do the best possible
optimizations". Combinatorial explosion, y'know.)
Actually, it just occurred to me that what I have described is really
a transaction processing paradigm - structure optimizations as
transactions, perform optimistic concurrency control, commit them if
assumptions haven't changed, back them out if they have.
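
A toy version of that transaction discipline, with a version number per
piece of IR or analysis state standing in for the read-set validation of
optimistic concurrency control (all names hypothetical):

#include <functional>
#include <map>
#include <string>

// Each optimization is a transaction: it records the version of everything
// it read, and commits only if those versions are still current.
struct OptTransaction {
    std::map<std::string, unsigned> readSet;   // item -> version when read
    std::function<void()> apply;               // the rewrite itself
};

struct VersionedState {
    std::map<std::string, unsigned> version;   // current version of each item

    unsigned read(OptTransaction& tx, const std::string& item) {
        unsigned v = version[item];
        tx.readSet[item] = v;                  // remember the assumption
        return v;
    }
    void write(const std::string& item) { ++version[item]; }

    // Validate-then-commit, exactly as in optimistic concurrency control:
    // if any assumption has changed, the transaction is simply never applied.
    bool commit(OptTransaction& tx) {
        for (auto& [item, ver] : tx.readSet)
            if (version[item] != ver) return false;   // an assumption changed
        tx.apply();
        return true;
    }
};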
Point is, individual optimizing modules should not control the policy -
the framework should.
Simple phase ordering, as seen in most compilers, is of course just an
extremely naive framework.
Setting up such a framework does take a lot of work, and has some cost
in compile time. The problem is, at any point in time such a general
framework will probably only deliver results similar to an ad-hoc
backend structure - where "ad-hoc" means that the structure has been
carefully crafted to support the current state of the art. It is only
over time that the flexible structure would show its advantage, by
being able to incorporate new optimizations more easily.
> Preston's comment about "None of these things really matter" is somewhat
> true, though, I'd say. The big performance wins come from exploiting
> parallelism (vector processing, parallel processing, or pipelining) or
> from reducing memory traffic, say through cache management. If you have
> the choice of spending multiple man-years on an optimization that gets
> you a 5% improvement versus an optimization that gets you 50%, which
> will you choose?
If you are comparing a 5% optimization that helps a wide variety of
real user programs (e.g. one that helps a wide variety of Visual C++
programs), versus a 50% optimization that helps one particular SPEC
benchmark, but nothing else:
(1) Personally, I'd say the 5% optimization.
(2) Professionally, I'd say both.
(3) Realistically, the 50% optimization wins - at least until the next
release of the SPEC benchmarks, which will have been carefully designed
not to be vulnerable to that optimization.
--
Andy "Krazy" Glew, glew@ichips.intel.com, Intel,
M/S JF1-19, 5200 NE Elam Young Pkwy, Hillsboro, Oregon 97124-6497.
Place URGENT in email subject line for mail filter prioritization.