Re: How to implement lexical closures?

George Neuner <gneuner2@comcast.net>
Sat, 15 May 2010 03:52:50 -0400

          From comp.compilers

Related articles
[3 earlier articles]
Re: How to implement lexical closures? cr88192@hotmail.com (BGB / cr88192) (2010-05-09)
Re: How to implement lexical closures? torbenm@diku.dk (2010-05-10)
Re: How to implement lexical closures? grom358@gmail.com (grom) (2010-05-11)
Re: How to implement lexical closures? cr88192@hotmail.com (BGB / cr88192) (2010-05-12)
Re: How to implement lexical closures? gene.ressler@gmail.com (Gene) (2010-05-12)
Re: How to implement lexical closures? gah@ugcs.caltech.edu (glen herrmannsfeldt) (2010-05-13)
Re: How to implement lexical closures? gneuner2@comcast.net (George Neuner) (2010-05-15)
Re: How to implement lexical closures? gneuner2@comcast.net (George Neuner) (2010-05-15)
Re: How to implement lexical closures? gah@ugcs.caltech.edu (glen herrmannsfeldt) (2010-05-16)
Re: How to implement lexical closures? gneuner2@comcast.net (George Neuner) (2010-05-17)
Re: How to implement lexical closures? cfc@shell01.TheWorld.com (Chris F Clark) (2010-05-17)
Re: PL/I, was How to implement lexical closures? genew@ocis.net (Gene Wirchenko) (2010-05-17)
Re: How to implement lexical closures? gah@ugcs.caltech.edu (glen herrmannsfeldt) (2010-05-19)
[1 later articles]
| List of all articles for this month |

From: George Neuner <gneuner2@comcast.net>
Newsgroups: comp.compilers
Date: Sat, 15 May 2010 03:52:50 -0400
Organization: A noiseless patient Spider
References: 10-05-031 10-05-072 10-05-076
Keywords: storage, symbols, comment
Posted-Date: 15 May 2010 10:05:24 EDT

On Thu, 13 May 2010 19:52:58 +0000 (UTC), glen herrmannsfeldt
<gah@ugcs.caltech.edu> wrote:


>Gene <gene.ressler@gmail.com> wrote:
>(snip)
>
>> A closure is a pair consisting of code and an activation record. The
>> activation record is an array of slots for values of parameters (and
>> local variables if your language has them). In lanuages like yours
>> with lexical nesting, there will also be a frame pointer to the
>> activation record of the lexically enclosing scope.


>I never actually tried writing code like that, so I am not so sure I
>understand closures. I do know the problem that comes up in PL/I
>with ENTRY variables. (Similar to C function pointers, but PL/I
>allows internal procedures. In the case of recursion, for example,
>you get the activation record at the time of the call. Because of
>this complication, Fortran (as of 2003, and I believe 2008) doesn't
>allow the use of an internal procedure name as an actual argument.


PL/I entry variables were a kind of closure, but I don't recall
offhand whether they could be used outside of the call chain that
created them.




>But maybe I am still not so sure. As you say, "consisting of code
>and an activation record." Is "pointer to code" good enough for that
>definition, or is it more than that?


As Gene said, technically a closure is simply a pairing of a function
with a lexically defined runtime environment. However a closure can
be a logical compiler construct which has no identifiable "object" in
the program.


A simple function pointer, as in C, is an example of a "degenerate"
closure: one for which the environment is empty. C closures (function
pointers) are 1st class, but they are not "full" because they carry no
environment.


Closures start to become interesting when you have nested functions
which need to access non-local, non-global stack variables. In the
simplest case, the environment is the stack of activation frames.
Non-local variables are scoped lexically, but functions might be
recursively invoked, so there needs to be a way for inner functions to
locate the activation frames of outer functions. In a language with
nested functions, a closure is the function code plus its method of
finding non-local stack variables: either a lexical scope pointer in
each function's activation frame or through a separate array of scope
pointers called a "display".




Closures become most interesting when they persist after the code
which created them has returned: that is, where functions can create
new functions that can be invoked in a call chain separate from the
one that defined them. These are known as "1st class" or "full"
closures.


To accomplish this, the non-local variables which make up the custom
execution environment of the function must persist. One way to do
this is simply to retain the stack of activation frames that existed
when the closed-over function was created and to restore that stack
prior to executing it. You can implement this by stack copying or by
allocating activation frames in a linked list on the heap.


Persistent stack implementations, however, are wasteful if there are
(or may be) many stack variables that are not part of the state of the
closure ... this can happen if closures are created at the bottom of a
deep call chain. Keeping the whole stack or any significant portion
of it can quickly get expensive if you create many closures. If you
allocate frames in a list on the heap, you also ignore the hardware
stack which the CPU is optimized to use for call/return. It also
introduces the need to GC your stack.


So another way to implement persistent closures is to analyze the
function and extract the non-local variables into a heap allocated
structure. The structure is allocated in the outer lexical scope at
which the first state variable is defined in the source. The function
which is closed over is modified to take a pointer to the state
structure as a parameter and to look for its non-local variables in
the structure. Any outer functions that access the state variables
must also be modified to look in the structure, however for them the
pointer to the structure will be on the stack until the function which
allocated it returns. This implementation is more efficient in
general call/return because it can utilize the native CPU stack, but
it requires significant analysis and code rewriting by the compiler.




There is also an in-between case where the closure persists beyond the
function that creates it but does not persist beyond all the functions
of the call chain. In this case, you can optimize away the creation
of the closure by a process called variously "lambda lifting" or
"closure conversion" (mentioned previously by Andy Johnson). To
"convert" a closure, you lift all the non-local state variables into
the innermost lexical scope that encloses all uses of the function,
rewrite the function so that references to the non-local variables are
references to parameters instead. Then rewrite all the call sites to
pass the state variables as parameters. This transforms the closure
into a simple stack function. Again, this requires some significant
analysis and code rewriting by the compiler.






>I have never seen that PL/I ENTRY variable refered to as a closure.


AFAIK, the term "closure" began with Lisp and came into popular use
through university teaching of Lisp and related branch languages like
Scheme, ML, etc. A number of early imperative languages had
closure-like constructs, but they referred to them by different names
- "function object", "procedure pointer", etc.


Objects in OO languages are isomorphic to closures in that they
encapsulate state and bind it to functions. The main difference is
that most OO languages require a template for objects (the class
definition) whereas closures are typically ad hoc constructs.




>An even more interesting feature in PL/I is the use of
>a LABEL variable pointing to a label in a different procedure
>or a different activation level in the same procedure.


This is not a "closure" but a "continuation" ... a related but
different construct which is the basis for co-routines and/or
cooperative threading.


George
[PL/I doesn't do garbage collection, so you can't use an entry
outside the block where its variables were allocated. Looks to
me like this came straight out of Algol60. -John]


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.