Re: When to do inline expansion

jdean@bergen.cs.washington.edu (Jeffrey Dean)
Tue, 21 Sep 1993 16:19:40 GMT

          From comp.compilers

Related articles
When to do inline expansion jhall@whale.WPI.EDU (1993-09-14)
Re: When to do inline expansion zstern@adobe.com (1993-09-20)
Re: When to do inline expansion salomon@silver.cs.umanitoba.ca (1993-09-20)
Re: When to do inline expansion davidm@questor.rational.com (1993-09-20)
Re: When to do inline expansion jfc@athena.mit.edu (1993-09-21)
Re: When to do inline expansion jgmorris+@cs.cmu.edu (1993-09-21)
Re: When to do inline expansion jdean@bergen.cs.washington.edu (1993-09-21)
Re: When to do inline expansion salomon@silver.cs.umanitoba.ca (1993-09-22)
Re: When to do inline expansion preston@dawn.cs.rice.edu (1993-09-22)
Re: When to do inline expansion cliffc@rice.edu (1993-09-22)
Re: When to do inline expansion rfg@netcom.com (1993-09-25)
Re: When to do inline expansion ssimmons@convex.com (1993-09-27)

Newsgroups: comp.compilers
From: jdean@bergen.cs.washington.edu (Jeffrey Dean)
Keywords: optimize
Organization: University of Washington
References: 93-09-063 93-09-069
Date: Tue, 21 Sep 1993 16:19:40 GMT

John Clinton Hall writes
> How long should a function be (in number of statements) for it to be a
> reasonable speed optimization to perform inline expansion?


zstern@adobe.com (Zalman Stern) writes:
|> It depends a lot on your goals and compiler technology.


Craig Chambers and I have a tech-report describing a technique for
making inlining decisions that is more effective than a simple
source-length heuristic. It's available via anonymous ftp from
cs.washington.edu in tr/1993/05/UW-CSE-93-05-05.PS.Z. Although the
work was done in the context of a pure object-oriented language
(Self), the techniques we describe should be applicable to C and C++.


The basic idea of our approach is to experimentally inline routines
which look profitable (based on a source-length heuristic, but with a
higher threshold). As compilation of the inlined code progresses,
optimizations that were enabled due to static information propagated
from the call site are recorded, and the execution time and code space
savings due to these optimizations are estimated. When the compiler
has finished processing the inlined code, it estimates the space cost
of inlining the routine and decides whether the routine was profitable
to inline, based on the collected inlining experiment data:


    a) estimated execution time if the routine was inlined,
    b) estimated execution time if the routine was not inlined,
    c) space cost for the inlined routine,
    d) space cost for the call if the routine was not inlined.
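The decision rule built on these four estimates can be sketched roughly as follows. This is only an illustration of the cost/benefit comparison described above; the function name, the ratio test, and the threshold are my assumptions, not details from the tech report:

```python
# Hypothetical sketch of an inlining decision based on the four
# estimates (a-d) collected during the inlining experiment.
# The threshold and the ratio test are illustrative assumptions.

def should_inline(time_inlined, time_call, space_inlined, space_call,
                  max_space_per_time=10.0):
    """Decide whether inlining pays off, given:
    a) time_inlined  - estimated execution time if the routine is inlined
    b) time_call     - estimated execution time if it is not
    c) space_inlined - code space cost of the inlined routine
    d) space_call    - code space cost of the call sequence"""
    time_saved = time_call - time_inlined      # benefit: (b) - (a)
    space_added = space_inlined - space_call   # cost:    (c) - (d)
    if time_saved <= 0:
        return False        # inlining does not speed anything up
    if space_added <= 0:
        return True         # smaller *and* faster: always inline
    # Otherwise accept only if the space cost per unit of time saved
    # stays below a tunable threshold.
    return space_added / time_saved <= max_space_per_time

print(should_inline(5.0, 20.0, 40.0, 8.0))    # large time win
print(should_inline(18.0, 20.0, 200.0, 8.0))  # tiny win, big space cost
```

The point is that both axes are measured on the routine as it actually optimizes in context, rather than guessed from its source length.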


This data more accurately portrays the execution time and code space
costs of inlining, allowing better inlining decisions to be made. The
decision to inline or not still needs to be based on what you are
looking for: optimum performance, sparing no compile-time cost, or
quick turnaround with some sacrifice in potential performance.
However, having the data described above identifies the tradeoffs
associated with inlining or not inlining a particular routine more
clearly than a source-level inlining heuristic.


The final key idea of our approach is that this information is saved
in a persistent database. We've developed a technique called type
group analysis which categorizes the static information that could be
exploited from the call site; this categorization is used to
construct the database key. Future inlining decisions for calls with
similar amounts of static information are then made by consulting the
information in this database (a-d, above), using a key constructed
from the static information available at the call site.
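A minimal sketch of such a database might look like the following. The key construction here (one coarse category per argument) is my own simplification; the report's type group analysis is considerably more refined, and the field names are illustrative:

```python
# Illustrative sketch of a persistent inlining database keyed by the
# static information available at a call site. Categories like
# 'const' and 'unknown' are assumed stand-ins for the categories
# that type group analysis would actually produce.

inlining_db = {}  # (routine, static_info_key) -> experiment data (a-d)

def static_info_key(arg_infos):
    # Build a canonical, hashable key from per-argument categories,
    # e.g. {'step': 'const'} -> (('step', 'const'),)
    return tuple(sorted(arg_infos.items()))

def record_experiment(routine, arg_infos, data):
    # Store the results of one inlining experiment for later reuse.
    inlining_db[(routine, static_info_key(arg_infos))] = data

def lookup(routine, arg_infos):
    # A later compilation with similar static information consults
    # the database instead of re-running the experiment.
    return inlining_db.get((routine, static_info_key(arg_infos)))

record_experiment('to:By:Do:', {'step': 'const'},
                  {'a': 3, 'b': 20, 'c': 12, 'd': 8})
print(lookup('to:By:Do:', {'step': 'const'}))
```

Keying on categories of static information, rather than on exact call sites, is what lets one experiment inform decisions at many different calls.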


The approach was particularly effective at identifying general
purpose routines which looked like poor inlining candidates under a
source-length heuristic, but were actually used in restricted
circumstances that made inlining them much more favorable. An
example of such a routine is the to:By:Do: routine (which implements a
for loop with a step): initially this looks like a poor candidate for
inlining due to its length (it has one branch for a positive step, one
branch for a negative step, and error handling code if the routine is
called with a step of 0). When to:By:Do: is called with a step
argument which is known to be a constant (which is usually the case),
though, constant folding and dead code elimination remove two of the
three cases, and the routine becomes very compact and definitely
desirable to inline.
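The effect of the constant step can be mimicked in a small sketch. The function below stands in for the compiler's constant folding and dead code elimination over the three cases of to:By:Do:; it is a hypothetical model, not the report's actual machinery:

```python
# Hypothetical model of why to:By:Do: shrinks under a constant step:
# with the step known at compile time, two of its three branches are
# dead code and fold away. 'specialize_to_by_do' plays the role of
# constant folding + dead code elimination here.

def specialize_to_by_do(step):
    """Return the branches of to:By:Do: that survive specialization.
    step=None models a step that is unknown at compile time."""
    if step is None:
        # Unknown step: all three branches must be kept.
        return ["positive-step loop", "negative-step loop",
                "zero-step error"]
    if step > 0:
        return ["positive-step loop"]   # other two branches are dead
    if step < 0:
        return ["negative-step loop"]
    return ["zero-step error"]

print(len(specialize_to_by_do(None)))  # all three cases survive
print(len(specialize_to_by_do(1)))     # one compact case remains
```

This is exactly the situation a source-length heuristic misjudges: the routine's full source is long, but the code that actually survives inlining at a typical call site is tiny.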


We hope to have a revised version of the tech. report available within
the next month, including better performance numbers and a more
in-depth discussion of how the technique could be applied to other
languages.


    -Jeff


Jeffrey Dean (jdean@cs.washington.edu) Graduate Student
Dept. of Computer Science and Engineering University of Washington
--

