From: | Ivan Godard <ivan@ootbcomp.com> |
Newsgroups: | comp.compilers |
Date: | Sun, 15 Jun 2014 11:19:45 -0700 |
Organization: | A noiseless patient Spider |
References: | 14-06-003 14-06-004 |
Keywords: | optimize, architecture, comment |
Posted-Date: | 15 Jun 2014 19:21:47 EDT |
On 6/15/2014 8:43 AM, Kaz Kylheku wrote:
> On 2014-06-15, idatarm@gmail.com <idatarm@gmail.com> wrote:
>> Abstract: We describe an algorithm for optimizing compilers for low
>> power design. The algorithm can be applied to almost any c compiler
>> and in particular we target gcc compiler. The algorithm works by
>> modifying optimization table lookup of gcc. This works in theory. We
>
<snip>
< An excellent critique omitted. >
Only one addition: the amounts of active power (switching gates) and
passive power (leakage) vary by fab process, but as a rule the
proportion of the passive part increases with smaller feature sizes.
Below roughly the 28nm node, actual execution of ops increasingly comes
for free.
However, leakage is (roughly) proportional to area (or, equivalently, gate
count), so by far the greatest power saving available derives from
reducing area/count for a given amount of computation. In a modern OOO
chip, the great bulk of the area/count is devoted to figuring out what
you are going to do next, getting everything ready to do it, and then
dealing with what you have done. The amount spent in actually doing it
(in an adder, for example) is near trivial. So changing op A into op B
doesn't win anything, because the hardware surrounding A or B dominates
the power, and even if B uses less active power than A the hardware to
do A is still there leaking away.
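To put rough numbers on that argument, here is a back-of-the-envelope sketch. Every figure in it (leakage wattage, per-op energies, op count, run time) is an assumption made up for illustration and not measured from any real chip or process; the function name and the split into "leakage", "overhead" and "functional unit" terms are likewise hypothetical.

```python
# Toy energy model. All numbers are illustrative assumptions,
# not measurements of any real chip or fab process.

def energy_joules(leak_watts, seconds, n_ops, overhead_j, unit_j):
    """Leakage over the whole run, plus per-op switching energy.

    overhead_j: assumed switching energy per op in the surrounding
                machinery (fetch, decode, rename, retire).
    unit_j:     assumed switching energy per op in the functional
                unit itself (the adder, say).
    """
    return leak_watts * seconds + n_ops * (overhead_j + unit_j)

LEAK_WATTS = 1.0       # assumed leakage of the whole core
SECONDS = 1.0e-3       # assumed run time, unchanged by the substitution
N_OPS = 1_000_000      # assumed number of ops executed
OVERHEAD_J = 50.0e-12  # assumed per-op cost of the surrounding hardware
OP_A_J = 2.0e-12       # assumed cost of op A in its functional unit
OP_B_J = 1.0e-12       # assumed cost of op B: half of A

before = energy_joules(LEAK_WATTS, SECONDS, N_OPS, OVERHEAD_J, OP_A_J)
after = energy_joules(LEAK_WATTS, SECONDS, N_OPS, OVERHEAD_J, OP_B_J)
saving = (before - after) / before

print(f"before = {before:.6e} J")
print(f"after  = {after:.6e} J")
print(f"saving = {saving:.4%}")
```

With these made-up inputs, halving the functional-unit energy of every op changes the total by well under one percent, because the leakage and overhead terms are untouched by the substitution. Different assumed numbers will move the result, but the structure of the sum is the point.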
This is why so much has shifted to in-order architectures: they have
much less area/power required around the functional units. The problem
with them is that they cannot achieve the ILP of an OOO architecture;
all that cruft around the adders really does do something, it's just
expensive. Exchanging ops won't help; you have to do something
architectural, like our Mill, to get the ILP of OOO and the area/power
of a DSP.
Ivan
[Are we on the verge of reinventing VLIW? -John]