Re: Rich versus Simple ASTs

haberg@math.su.se (Hans Aberg)
29 Jul 2006 14:06:43 -0400

          From comp.compilers

Related articles
Rich versus Simple ASTs johan.tibell@gmail.com (Johan Tibell) (2006-07-28)
Re: Rich versus Simple ASTs haberg@math.su.se (2006-07-29)
Re: Rich versus Simple ASTs jakacki@gmail.com (Grzegorz Jakacki) (2006-07-29)
Re: Rich versus Simple ASTs holgersiegel74@yahoo.de (Holger Siegel) (2006-07-29)
| List of all articles for this month |
From: haberg@math.su.se (Hans Aberg)
Newsgroups: comp.compilers
Date: 29 Jul 2006 14:06:43 -0400
Organization: Mathematics
References: 06-07-091
Keywords: parse, AST
Posted-Date: 29 Jul 2006 14:06:43 EDT

<johan.tibell@gmail.com> wrote:


> data Expr = Lam Id Expr -- lambda abstraction
> | App Expr Expr -- function application
> | ...
>
> or the richer representation:
>
> data Expr = Lam [Id] Expr -- lambda abstraction
> | App Expr Expr -- function application
> | ...
>
> where [Id] denotes a list of identifiers
>
> Would it be useful to have an AST representation like the second?


I have tried it, in a theorem prover, for heads that bind variables, which
I call a "binders". Whatever you choose, it is just a matter of
implementation convenience, at least in the case of a lambda calculus only
(see below, though). It turns out that implementing binders is very
complicated, so it is simplest to have a binder that just binds a single
variable. So I would suggest to start off with the first variation,
binding one variable at a time. (I think that the theorem prover Otter
<http://www-unix.mcs.anl.gov/AR/otter/> perhaps is using a multivariate
representation, but it does not really handle binders. You might want to
check out some other theorem provers, to see what they do:
<http://www-unix.mcs.anl.gov/AR/others.html>
<http://en.wikipedia.org/wiki/Automated_theorem_proving>)


A second requirement that I have in my implementation is that the internal
closures should be able to be written out (for debugging purposes, both
for implementer and user). Then I think (I experimented with several
models) I settled on the single variable binder internally, with a
mechanism that collects the variables in a single binder, if possible.


It might be the case that one has an operator which binds some variables
and not others, and does not have an internal structure, enabling one to
split it up as a sequence of binders that bond a single variable. Then one
would be forced to implement the second variation: all variables at once
in a single head of which some are bound (in different ways), and others
are not.


So in general, I would suggest you to start off with the first
variation, binding one variable at a time, until you see how it works.
Unless implementing the simpler case of lambda calculus makes it possible
to move to the multivariable binder case directly, and there are
some distinct implementation advantages of that approach.


--
    Hans Aberg


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.