Re: the Evil Effects of Inlining

mcg@ichips.intel.com (Steven McGeady)
Mon, 6 May 91 17:55:16 PDT

          From comp.compilers


Newsgroups: comp.compilers
From: mcg@ichips.intel.com (Steven McGeady)
Keywords: design, optimize
Organization: Compilers Central
References: <1991May1.035622.25021@daffy.cs.wisc.edu> <1991May2.180508.17100@rice.edu>
Date: Mon, 6 May 91 17:55:16 PDT

I've just read the thread on inlining (through 5 May 91), and have a few
comments to add, as an implementor:


  - respondents don't seem to be making a distinction between inlining as a
      programmatic, user-specified extension, and inlining as a transparent,
      compiler-implemented optimization.


      While closely related, I feel these two types of inlining must be
      addressed separately:


        - user-specified inlining is only as good as the user's understanding
          of his or her program. In situations where the user has a deep
          understanding of the performance behaviour of the program under
          study, user-directed inlining can be a powerful tool. When I
          wrote 'inline', a stand-alone C-to-C inliner, I carefully
          studied several algorithms, including 'compress'. Careful
          profiling followed by inlining resulted in a 10% performance
          improvement, even in this carefully-optimized program.


        - heuristic inlining is only as good as the heuristic (duh). Our
          research shows that we haven't yet found a good heuristic
          that works without profiling feedback. We've tried to synthesize
          a heuristic from call-graph, register-pressure, and size information,
          without repeatable success (i.e. over a broad selection of programs).
          Heuristics that include profiling input (a weighted dynamic call tree)
          can repeatably produce improvements in most programs, without causing
          serious regressions. (Our compiler does global (inter-module)
          inlining with a two-pass model.)
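
To make the profile-driven case concrete, here is a minimal C sketch of one
way an arc-selection pass could use a weighted dynamic call graph. It is an
illustration only, not the heuristic described above: the struct call_arc
layout, the function names, and both thresholds are hypothetical.

```c
#include <stddef.h>

/* One call-graph arc, annotated with data from a profiling run.
 * All field names and units here are hypothetical. */
struct call_arc {
    const char   *caller;
    const char   *callee;
    unsigned long dyn_calls;    /* times the arc was traversed while profiling */
    unsigned      callee_size;  /* callee body size, in arbitrary units */
};

/* Inline an arc only if it is hot (carries at least min_share_pct percent
 * of all profiled calls) and the callee is small enough (<= max_size). */
int should_inline(const struct call_arc *arc, unsigned long total_calls,
                  unsigned min_share_pct, unsigned max_size)
{
    if (total_calls == 0 || arc->callee_size > max_size)
        return 0;
    /* dyn_calls / total_calls >= min_share_pct / 100, in integer arithmetic */
    return (unsigned long long)arc->dyn_calls * 100u
           >= (unsigned long long)min_share_pct * total_calls;
}

/* Mark the arcs to inline in chosen[] (1 = inline); return how many. */
size_t select_arcs(const struct call_arc *arcs, size_t n,
                   unsigned min_share_pct, unsigned max_size, int *chosen)
{
    unsigned long total = 0;
    size_t picked = 0;

    for (size_t i = 0; i < n; i++)
        total += arcs[i].dyn_calls;

    for (size_t i = 0; i < n; i++) {
        chosen[i] = should_inline(&arcs[i], total, min_share_pct, max_size);
        picked += (size_t)chosen[i];
    }
    return picked;
}
```

With a 5% share threshold and a size limit of 100, an arc taken 900 times out
of 1000 into a 20-unit callee would be selected, while a cold arc or a
500-unit callee would not. A real heuristic would presumably also weigh
register pressure and total code growth.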


        Unfortunately, users often think they know more about their programs than
        they actually do, and many don't have the tools, or are too lazy, to
        measure their programs. Many inlining decisions users make are just plain
        wrong. Heuristic inliners like gcc's make the user's task easier: try it
        both ways, and pick the faster. This doesn't validate the practice of
        inlining; it merely provides commentary on the effectiveness of gcc's
        heuristic (which is: not particularly).
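
For the user-specified case, the transformation itself is simple. The C
sketch below shows a small helper called from a hot loop, and the
hand-expanded form that a source-level inliner could produce. It assumes a
C99 compiler for `static inline`, which is only a hint the compiler may
ignore. hash_step and the other names are made up for illustration and are
not taken from 'inline' or 'compress'.

```c
#include <stddef.h>
#include <stdint.h>

/* Small helper called once per input byte: a candidate for inlining
 * if profiling shows that the call overhead matters. */
static inline uint32_t hash_step(uint32_t h, unsigned char c)
{
    return (h * 31u) + c;
}

/* Hot loop written with the helper. The compiler is free, but not
 * obliged, to expand hash_step at the call site. */
uint32_t hash_bytes(const unsigned char *buf, size_t len)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++)
        h = hash_step(h, buf[i]);
    return h;
}

/* The same loop with the call replaced by the helper's body, as a
 * source-to-source inliner might emit it. */
uint32_t hash_bytes_expanded(const unsigned char *buf, size_t len)
{
    uint32_t h = 0;
    for (size_t i = 0; i < len; i++)
        h = (h * 31u) + buf[i];
    return h;
}
```

Both functions compute the same result. Whether the expanded form is
actually faster on a given machine is exactly the thing that has to be
measured rather than assumed.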


  - several respondents have noted that good interprocedural dataflow analysis
      can yield better results. In theory, I agree (on processors where calls
      are relatively cheap). However, true REF/DEF dataflow information can
      quickly become intractable (or at least very difficult) to compute when
      attempted across an entire large program (for C, when tracking
      all points-to information). So if global DFA is limited to a procedure,
      inlining frequently-traversed arcs on the call-graph can dramatically
      improve the overall effectiveness of DFA-based optimizations.


  - along the same lines as the last point, inlining can also expose many other
      worthwhile optimizations that can't profitably be done on an intermodule
      basis. In particular, until call tailoring becomes a reality (including
      debug support!), I consider the utility of some classes of inlining to be
      high when guided by profile information.
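
As a small illustration of the last two points, consider a helper whose
second argument happens to be a constant at one call site. The C sketch
below is hypothetical (the function names are invented, and what a given
compiler actually does with it will vary), but it shows the kind of fact
that a procedure-at-a-time dataflow analysis cannot see until the call is
inlined.

```c
/* Analysed on its own, this procedure must handle any value of scale,
 * so the test and the general multiply both have to stay. */
static int scale_value(int x, int scale)
{
    if (scale == 0)
        return 0;
    return x * scale;
}

/* Before inlining: an intraprocedural analysis of this caller sees only
 * an opaque call, even though the second argument is a constant. */
int quad_with_call(int x)
{
    return scale_value(x, 4);
}

/* After inlining the call and running ordinary intraprocedural passes:
 * scale == 4 is now a known constant, the scale == 0 test folds away,
 * and the multiply may be strength-reduced (e.g. to a shift). */
int quad_after_inlining(int x)
{
    return x * 4;
}
```

The two callers return the same value for every input that does not
overflow. The second form is simply what local constant propagation and
dead-code elimination can reach once the callee's body is visible.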


Summary:
- 'inlining' means two different things
- user-inlining is effective only for sophisticated users
- compiler heuristic inlining is currently hampered by poor heuristics
- profile information considered essential for inlining heuristics
- intermodule global DFA considered difficult to intractable
- intelligent profile-driven inlining is a Good Thing


S. McGeady
i960 Software Architecture Group
Intel Corp.
--

