Re: Reordering of functions

George Neuner <gneuner2@comcast.net>
Mon, 25 Feb 2008 17:15:36 -0500


From: George Neuner <gneuner2@comcast.net>
Newsgroups: comp.compilers
Date: Mon, 25 Feb 2008 17:15:36 -0500
Organization: Compilers Central
References: 08-02-051
Keywords: optimize
Posted-Date: 25 Feb 2008 19:39:33 EST

On Mon, 18 Feb 2008 17:22:52 +0100, Tim Frink <plfriko@yahoo.de>
wrote:


>I've a question about the influence of compiler optimizations that
>reorder functions on the system performance.


>Assume a modern processor with all state-of-the art features like
>prefetching, branch prediction and a superscalar pipeline. Further
>assume that all caches are disabled. Will the program runtime change
>when just the order of functions is changed (without any other code
>transformation)?


>I'm of the opinion that a reordering of functions should have little
>influence on the program execution, maybe due to some prefetch effects
>but these should be marginal. Of course, with caches this situation
>would look different.


It strikes me that your hypothetical processor is a pretty good match
for a modern DSP. Most DSP designs have neither branch prediction nor
traditional code/data caches, because they use clock-matched internal
SRAM; if any external RAM is present, it is typically clock-matched as
well.


Some things DSPs typically do have that many traditional CPUs do not
are recognizable loop start/end instructions, a small prefetch code
cache dedicated to (suitably sized) loop bodies, near/far call
instructions, banked memory, and multiple buses that can fetch code
and (typically) several data operands from separate memory banks
simultaneously.


It is very common in DSP development to deliberately place code and
data to take maximum advantage of simultaneous fetch, and to pack
related functions together for faster near calls. On VLIW designs,
overall code size may also change due to different instruction
packing.
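To make that concrete, here is a rough sketch of the kind of placement
I mean. The section names, the memory map behind them, and the
GCC-style attributes are illustrative assumptions only; a real DSP
toolchain will have its own names and its own way of mapping sections
to physical banks in the linker command file.

/* Illustrative only: the sections ".bank_a", ".bank_b" and ".fast_text",
 * and the assumption that they map to separate physical memories, are
 * made up for this example, not taken from any particular toolchain.  */

#define N 256

/* Put the two operand arrays of a MAC-style kernel in different
 * (assumed) data banks so both can be fetched in the same cycle.      */
__attribute__((section(".bank_a"))) static int coeff[N];
__attribute__((section(".bank_b"))) static int sample[N];

/* Keep the hot kernel and its caller in the same (assumed) fast
 * program-RAM section so the call between them stays a near call.     */
__attribute__((section(".fast_text")))
static int dot(const int *a, const int *b, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++)   /* good candidate for a hardware loop */
        acc += a[i] * b[i];       /* one operand from each bank         */
    return acc;
}

__attribute__((section(".fast_text")))
int filter_step(void)
{
    return dot(coeff, sample, N); /* near call within .fast_text        */
}

The linker command file then maps .bank_a, .bank_b and .fast_text onto
the separate physical memories; that mapping is where the benefit comes
from, not the source-level order of the functions themselves.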


I've done a fair bit of DSP programming, and apart from the
aforementioned memory bank and near/far call issues, I can't really
say that I've seen any noticeable impact on speed just from reordering
of functions.


George

