Re: How to implement lexical closures?

"BGB / cr88192" <>
Sun, 9 May 2010 17:39:32 -0700

          From comp.compilers

Related articles
How to implement lexical closures? (grom) (2010-05-06)
Re: How to implement lexical closures? (andy johnson) (2010-05-09)
Re: How to implement lexical closures? (George Neuner) (2010-05-09)
Re: How to implement lexical closures? (BGB / cr88192) (2010-05-09)
Re: How to implement lexical closures? (2010-05-10)
Re: How to implement lexical closures? (grom) (2010-05-11)
Re: How to implement lexical closures? (BGB / cr88192) (2010-05-12)
Re: How to implement lexical closures? (Gene) (2010-05-12)
Re: How to implement lexical closures? (glen herrmannsfeldt) (2010-05-13)
Re: How to implement lexical closures? (George Neuner) (2010-05-15)
[6 later articles]
| List of all articles for this month |

From: "BGB / cr88192" <>
Newsgroups: comp.compilers
Date: Sun, 9 May 2010 17:39:32 -0700
References: 10-05-031
Keywords: storage
Posted-Date: 10 May 2010 01:23:19 EDT

"grom" <> wrote
> I'm working on a toy interpreter (
> and trying to implement lexical closures (
> zemscript/source/browse/#svn/branches/lexical_scope) . Currently when
> the function is created it copies the symbol table. So the following
> code works:


> What I can't work out is how I can get the second example to work
> without breaking the first example.

typically closures are implemented by having each function in the call-stack
create its own local frame, which is usually destroyed when the function
exits, but are not destroyed when captured by a closure.

whenever frames are nested (such as when creating a closure), then the inner
frame contains a reference to the outer frame, and the outer frame is made
to persist.

in many interpreters with nested frames like this, a reference to a variable
will consist of 2 numbers, namely the number of frames to step "outward", as
well as the position within the frame.

function f()
        var x, y;
        function bar(z)

would then see a binding environment like:
  (x y))

so, z=<0,0>, x=<1,0> and y=<1,1>
z is in the local frame, and x&y are in the captured parent frame.

so, an interpreter could be implemented like this:
when compiling a function, if there are any closures (either generally, or
one can be more specific and check if they refer to any bindings in the
parent frame), then have the frame be allocated in a persistent location
(such as on the heap), rather than on the stack (or, having it set to be
destroyed on return or similar).

for creating the closure, a special opcode can be used or similar, which
creates a new closure object, and stuffs the current binding frame into the

when called, then its new local frame is created, and the "outward"
reference in the new frame is set to the reference stored in the closure

side note (unlikely relevant here, but "maybe" interesting):

I had before considered adding support for closures to my C compiler, but it
was inconvinient and I never got around to doing so (sadly, I have more
ended up using my C compiler as a code-processing tool than for actually
compiling all that much C). much like dynamic types (also supported by said
compiler), it is likely to mostly go unused...

however, I also support BGBScript (technically, it is a script language
based on ECMAScript / JavaScript, and I am working on trying to make it
standards conformant).

my C compiler then more serves the role of producing databases so that the
BGBScript implementation can see into the C toplevel, among other uses
(mostly related to reflection, ...).

reasons for its limited use for compiling code:
most C ends up being compiled by native static compilers (which are
generally far more reliable);
C has turned out to not be a very good language for scripting, given it can
neither be effectively entered interactively (mostly syntax and semantics),
and its relatively expensive compile times (due mostly to headers). these
issues limit its use primarily to loading scripts at app startup, but in
this case one can just as easily statically compile the code into a DLL or
similar, also limiting this particular use-case.

BGBScript, OTOH, compiles much faster and is more suitible for interactive
entry, causing some bias in its favor in this case.

there is a "no headers" hack avaiable for the C compiler though, which can
speed up compile times (useful for scripting and "eval" and similar), but
still doesn't address interactive entry, and still requires it to be used to
mine information (in this case, typically done at build time).

and, just recently pulled of a massive speedup, of around 40x for larger
inputs, via strategic use of hash tables (and making one of the tables
bigger), mostly after noting that compile times when including "windows.h"
sucked hard (granted, "windows.h" adds a good 2MB or so to the size of the
preprocessor output, untold how much code the preprocessor grinds through,
at present the preprocessor taking up the majority of the compile time...).
(pp output gets 2x larger, but took easily 20x longer, whereas running times
for smaller modules is much less notably effected by the optimizaton, but
are now typically within the 100-500ms range vs 500-1500ms as before...).

however, oddly enough, they may be both "mutually useful":
C compiler benefits BGBScript by providing direct access to the toplevel;
BGBScript benefits C compiler by providing actually usable scripting
facilities and interactive entry.

so, it is extra useful when I can type crap into BGBScript and have access
to C code and data (almost as-if I had an interactive C entry prompt or
as well, both are "siblings" WRT their internals (both are based on mostly
common, and often shared, facilities and code).

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.