From: "Joachim Durchholz" <joachim.durchholz@halstenbach.com.or.de>
Newsgroups: comp.compilers
Date: 14 Apr 2000 23:48:31 -0400
Organization: Compilers Central
References: 00-04-019
Keywords: optimize, architecture, comment
I wrote:
> > I once read something about compilers that tried out equivalent
> > machine code sequences and measured actual performance to decide
> > which worked best. This could be used as a basis for a genetic
> > algorithm. Of course, [...] it wouldn't work for cross-compilation
> > anyway.
Brock <rbw3@dana.ucc.nau.edu> wrote:
> Why wouldn't it work for cross-compilation? It seems that the
> performance measurement could be done just as well using a table
> of measured performance values instead of doing the actual tests,
> thereby making it quite possible.
There is no such thing as a table of measured performance values for
modern microprocessors.
I know of two reasons (there may be others; I'm not really a guru on
this).
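To make concrete what "measuring actual performance" in the quoted idea amounts to, here is a toy Python sketch; the three functions are illustrative stand-ins for equivalent code sequences, not anything a real compiler emits:

```python
import timeit

# Three semantically equivalent ways to sum the squares 0..n-1.  Which is
# fastest can only be settled by measuring on the machine at hand -- the
# kind of measurement a genetic algorithm would use as its fitness function.
def sum_squares_loop(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

def sum_squares_genexp(n):
    return sum(i * i for i in range(n))

def sum_squares_formula(n):
    return (n - 1) * n * (2 * n - 1) // 6  # closed form

candidates = [sum_squares_loop, sum_squares_genexp, sum_squares_formula]
timings = {f.__name__: timeit.timeit(lambda f=f: f(10_000), number=100)
           for f in candidates}
best = min(timings, key=timings.get)  # the "survivor" selection would keep
print(best)
```

The table-of-values proposal would replace the `timeit` calls with lookups, and the two points below are about why such a lookup table cannot exist for modern processors.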
1) Modern microprocessors execute instructions in pipelines, i.e. they
keep several instructions in flight at the same time, each in a
different phase of execution.
This causes problems with conditional jumps: the processor can stop
filling the pipeline, or it can speculatively assume that the branch
will be taken and flush its pipeline if the ALU detects that the
branch was not taken after all, or it can execute both branches in
parallel until it detects which one will be taken. If you want
to predict the time that a processor will take to execute a sequence of
instructions, you have to model all these pipeline stages. That is a
difficult task even before you account for the fact that most
microprocessor vendors don't publish the necessary information, and that
details may change from revision to revision.
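As a rough illustration of why a stage-by-stage model is needed, here is a deliberately naive cycle counter for a short in-order pipeline. The stage count and mispredict penalty are made-up numbers for the sketch, not any real CPU's:

```python
# A deliberately naive cycle model of a short in-order pipeline.
# STAGES and MISPREDICT_PENALTY are invented numbers, not any real CPU's.
STAGES = 5
MISPREDICT_PENALTY = STAGES - 1  # work in flight that gets flushed

def cycles(trace):
    """trace: list of ('op',) or ('branch', predicted_correctly) tuples."""
    total = STAGES - 1  # cycles to fill the pipeline initially
    for instr in trace:
        total += 1  # one instruction completes per cycle when the pipe is full
        if instr[0] == "branch" and not instr[1]:
            total += MISPREDICT_PENALTY  # refill the pipeline after a flush
    return total

# Identical instruction mix, different prediction outcomes -> different time.
good = [("op",)] * 8 + [("branch", True)] * 2
bad = [("op",)] * 8 + [("branch", False)] * 2
print(cycles(good), cycles(bad))  # 14 vs 22 with these toy numbers
```

The point of the sketch: a flat cost-per-instruction table assigns both traces the same time, because the difference comes entirely from pipeline state, not from the instructions themselves.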
The story is even worse. For some years now, microprocessors have kept
a record of which branches were taken or not taken in the past, and
speculatively execute one branch or the other depending on what's
recorded. Details of this branch-history cache are usually not
published, and I wouldn't be surprised if Intel changed them
occasionally from one stepping to the next.
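The effect of branch history can be observed empirically. A sketch in Python follows; the classic demonstration is in C, and in an interpreted language the interpreter's own overhead may drown out the hardware effect, so treat this purely as an illustration of the measurement approach:

```python
import random
import timeit

def count_below(data, threshold):
    # the conditional inside this loop is the branch whose
    # predictability matters to the hardware branch predictor
    n = 0
    for x in data:
        if x < threshold:
            n += 1
    return n

random.seed(0)
values = [random.randrange(256) for _ in range(100_000)]
sorted_values = sorted(values)  # branch outcome becomes highly predictable

# Both calls do identical work on identical data; on real hardware the
# sorted version often runs measurably faster because the recorded branch
# history matches -- exactly the effect a static timing table cannot capture.
t_random = timeit.timeit(lambda: count_below(values, 128), number=10)
t_sorted = timeit.timeit(lambda: count_below(sorted_values, 128), number=10)
print(count_below(values, 128) == count_below(sorted_values, 128))
```

No particular timing outcome is asserted here: which version wins, and by how much, depends on the processor and its stepping, which is precisely the original point.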
2) You'd have to consider resource contention. For example, early
Pentium processors had two types of pipelines operating in parallel:
one pipeline for integer-only arithmetic, and one that could do
both integer and floating-point. Floating-point instructions took much
longer than integer ones, so it was recommended practice to order the
instructions so that you had one floating-point instruction per three to
five non-floating-point ones. Computing the optimal sequence of
instructions quickly gets you into a combinatorial explosion of
cases to consider, unless you find an ingenious algorithm that's
guaranteed to find the fastest instruction sequence (and you'd have to
invent a new algorithm for every processor model).
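The combinatorial explosion is easy to see by brute force: even a handful of instructions with only a couple of dependencies admits a large number of legal orderings. A toy model, with invented instruction labels:

```python
from itertools import permutations

# Toy model: instructions are labels; deps maps an instruction to the set
# of instructions that must come earlier (its data dependencies).
def legal_schedules(instrs, deps):
    count = 0
    for order in permutations(instrs):
        pos = {ins: i for i, ins in enumerate(order)}
        if all(pos[d] < pos[i] for i in instrs for d in deps.get(i, ())):
            count += 1
    return count

# With no dependencies, n instructions have n! orderings; each added
# dependency only prunes part of that factorially growing space.
instrs = ["a", "b", "c", "d", "e", "f"]
deps = {"c": {"a"}, "f": {"d"}}
print(legal_schedules(instrs, deps))  # 720 permutations, 180 legal
```

A real scheduler would also need a cost for each ordering, which is where the pipeline-modeling problems from point 1 come back in.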
Regards,
Joachim
[It's surely possible to model the performance of a microprocessor, but
I tend to agree that it's become more trouble than it's worth, and I'd
guess that you could get a reasonable estimate of which of several
possible code sequences was fastest with a simple inexact model. -John]