Memoization of pure-functional procedures

Ray Dillinger <bear@sonic.net>
23 May 2002 01:45:29 -0400

          From comp.compilers

Related articles
Memoization of pure-functional procedures bear@sonic.net (Ray Dillinger) (2002-05-23)
Re: Memoization of pure-functional procedures bmd@cs.kuleuven.ac.be (Bart Demoen) (2002-05-27)
Re: Memoization of pure-functional procedures bobduff@shell01.TheWorld.com (Robert A Duff) (2002-05-31)
Re: Memoization of pure-functional procedures joachim_d@gmx.de (Joachim Durchholz) (2002-06-02)
| List of all articles for this month |

From: Ray Dillinger <bear@sonic.net>
Newsgroups: comp.compilers
Date: 23 May 2002 01:45:29 -0400
Organization: Compilers Central
Keywords: functional, optimize, comment
Posted-Date: 23 May 2002 01:45:29 EDT

I'm experimenting with an intermediate-code analyzer that attempts to
identify sections of code which are purely functional, ie, blocks
which have no side effects and do not rely on any values other than
their parameters to compute their result.


As currently written, it identifies pure-functional "blocks" of
maximal size in the intermediate representation. It's a
tree-traversing process; a procedure call is considered side-
effecting unless there is a pure-functional block which encompasses
all of the the procedure's code, so you traverse the call tree with a
depth-first traversal and build from the leaves toward the roots.
Self-Recursive procedures can be identified as pure-functional, but a
different method has to be used: basically if you can prove
pure-functionality for all but the self calls, you can assume it for
the self calls. Sets of mutually-recursive procedures my
implementation is still stumbling on but I'm pretty sure I know what
do do about them - it's just a generalization of the self-recursive
case.


I'm working on this because I think I can get some good optimizations
if the compiler memoizes procedures known to be pure-functional. In
other words, if there are no side effects, and no dependence on
information outside the call frame, then you can emit a version of the
function that caches the last N argument-result pairs and does a
lookup instead of a computation if it sees a set of arguments it's got
in its cache (where N is probably determined by a few iterations of
actually running an instrumented version of the program and analyzing
the results).


This should allow dramatic time and/or space savings on a
moderate-sized class of recursive procedures with branching (such as
the "classic" implementation of the Fibonacci function) and iterative
procedures that can get into loops where they call a function each
time but only change the arguments occasionally, as well as nabbing a
lot of multiple call idioms where functional procedures get called
with the same arguments from different contexts.


But I can't find mention of memoization applied as an optimization
methodology in any of the compiler books I own, so I'm worried that I
may be reinventing a very old and not- very-useful wheel or chasing
down a blind alley. The closest concept I have good sources on is
common subexpression elimination, which seems never to think of a
non-inlined procedure call as a subexpression. Any pointers anyone can
give me would be appreciated.


Bear
[PL/I let you declare procedures as REDUCIBLE, which meant that they
were pure functions, and Fortran functions were supposed to be pure
functions (although most compilers didn't treat them that way due to
all of the bad code that people write.) But I think that was so that
the compiler could do common subexpression and dead code optimization
across function calls, not memoization. -John]


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.