Re: why not inline all functions?

Joerg Schoen <f81@ix.urz.uni-heidelberg.de>
11 Jun 1998 16:11:47 -0400

          From comp.compilers


From: Joerg Schoen <f81@ix.urz.uni-heidelberg.de>
Newsgroups: comp.compilers
Date: 11 Jun 1998 16:11:47 -0400
Organization: University of Heidelberg, Germany
References: 98-06-032
Keywords: performance, practice

Mark Sanvitale <sanvitam@std.teradyne.com> wrote:
: Functions are great for making written code (C, C++, etc.) more
: readable and structured, however, they do not seem to make much sense
: when you get down to the raw machine code which actually is executed
: by a processor.


: As far as my understanding of the matter goes, the most basic way to
: slow down a processor is to make it execute an instruction besides the
: one immediately following the current instruction, thus, why not make
: a compiler which turns every function into an inline function? This
: would save you the overhead inherent in a traditional function call
: (push everything defining the current state of the processor on the


In my experience "pushing the current state on the stack" refers only
to variables that reside in processor registers. If the function is
small, it will be inlined (at a sufficiently high optimization level)
and no pushes are necessary. If the function is big and a real call
is made, it is more likely that a fair amount of time is spent in the
function and that the processor's registers are used therein for
performance. Thus it is reasonable to "free" them for reuse in the
function by saving them on the stack.
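
To make the small-function case concrete, here is a minimal sketch in C.
The names square, sum_of_squares and sum_of_squares_inlined are made up
for illustration, and whether a given compiler really inlines the helper
at a given optimization level is an assumption, not a guarantee. The
second function is what the loop looks like once the call has been
expanded by hand: no call, and so nothing to save around it.

```c
#include <stddef.h>

/* Small helper: a plausible inlining candidate (hypothetical example). */
static inline double square(double x)
{
    return x * x;
}

/* Written with a call; an optimizer may expand square() in place. */
double sum_of_squares(const double *v, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += square(v[i]);
    return s;
}

/* The same loop with the call expanded by hand. */
double sum_of_squares_inlined(const double *v, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += v[i] * v[i];
    return s;
}
```

Both versions compute the same result; the only difference is whether
a call (and any register saving around it) can occur inside the loop.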


: stack, make fresh copies of the parameters for the function, and,
: afterwards, pop things off the stack to return the processor to the
: pre-function state, not to mention losing the chance to take advantage
: of any instruction prefetching the processor might do).


: The output of such a compiler would be larger binary files (since
: every call to a function would expand to the entire function body)
: however the execution time for such a program should be improved
: (relative to a non-inlining compiler) by a factor proportional to the
: number of function calls in the program.


No, that's not true. Consider a long function, or a short one with a
loop that is executed a number of times. The cost of calling the
function is then negligible compared to the time spent in the
function itself.
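
A rough way to put numbers on this, as a sketch only: the function name
call_overhead_fraction and the cycle counts in the note below are
invented for illustration, not measurements. It computes the share of
one call's total time that is pure call overhead, which is also the
most that inlining that call could save.

```c
/* Fraction of the total time of one call that is spent on call overhead.
   Both arguments are hypothetical cycle counts supplied by the caller. */
double call_overhead_fraction(double overhead_cycles, double body_cycles)
{
    return overhead_cycles / (overhead_cycles + body_cycles);
}
```

With an assumed overhead of 10 cycles, a body of 10 cycles gives a
fraction of 0.5, but a body of 9990 cycles gives only 0.001, so the
gain depends on the size of the function bodies and not merely on the
number of calls.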


: Now, an "inline everything" scheme might run into some roadblocks when
: it comes to external functions which are resolved at link time and the
: notion of dynamic linking is not compatible with such a method.


I know that some compilers have a "ucode" format that is different from
the usual object file format (which is used in libraries). As far as
I understand, compilers can do much more with the ucode format at the
linking stage; I think they can also do inlining there.


: Still, I think compilers should try to inline every function they can
: without depending on the programmer to specify a function as "inline"
: (C++).


As our moderator pointed out, you have to consider the cost of
loading new instructions into the cache. If the function is a separate
piece of code, it will be in the cache after the first call and will
probably stay there. That improves performance compared to the inlined
case, where every call site has its own copy of the code and all of
those copies have to be loaded into the cache.
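
The size side of this argument can be sketched with a simple model. The
function names and the byte counts in the note below are assumptions
for illustration; real code sizes depend on the compiler and the
target, and inlined copies are often shrunk by later optimization.

```c
#include <stddef.h>

/* Code bytes when the body exists once and each site holds only a call. */
size_t footprint_out_of_line(size_t body_bytes, size_t call_bytes, size_t sites)
{
    return body_bytes + sites * call_bytes;
}

/* Code bytes when every call site carries its own copy of the body. */
size_t footprint_inlined(size_t body_bytes, size_t sites)
{
    return sites * body_bytes;
}
```

For an assumed 400-byte body, 5-byte call instruction and 20 call
sites, the out-of-line version occupies 500 bytes of code and the
inlined version 8000 bytes, all of which competes for the same
instruction cache.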


                Joerg Schoen
E-mail: Joerg.Schoen AT tc DOT pci DOT uni-heidelberg DOT de
Web-Page: http://www.pci.uni-heidelberg.de/tc/usr/joerg
--

