Re: Internal code representation. (Simon Tatham)
Mon, 23 Oct 1995 08:22:29 GMT

          From comp.compilers

Related articles
Internal code representation. (1995-10-14)
Re: Internal code representation. (1995-10-23)
Re: Internal code representation. (Jaap van Ganswijk) (1995-10-23)
| List of all articles for this month |

Newsgroups: comp.compilers,comp.programming,comp.lang.misc
From: (Simon Tatham)
Keywords: interpreter
Organization: Trinity College, Cambridge
References: 95-10-082
Date: Mon, 23 Oct 1995 08:22:29 GMT

Mark Grosberg <> wrote:
>I am currently coding yet another (experimental) language interpreter.
>In all of my previous intepreters I used a RPN-based stack representation
>once code was absorbed and a stack to do evaluation.

>Any help or hints towards a good internal code representation would be
>appriciated, preferrably one that is both efficient and supports lazy and
>iterated parameter evaluation.

Have you tried a syntax tree? Each node in the tree may have many
children, and in the simplest case it corresponds exactly to the
grammar given in the parser description. For example

expr: term | expr '+' term ;
term: factor | term '*' factor ;
factor: NUMBER | '(' expr ')' ;

might give some kind of a syntax tree like

/ | \
term '*' factor
/ / | \
factor '(' expr ')'
                      / / | \
                NUMBER=2 expr '+' term
/ \
term factor
/ \
factor NUMBER=4

to denote the expression 2*(3+4). However, this tree can be collapsed
a long way, since "expr", "term", "factor" and "NUMBER" all have the
same basic data type, and so you can collapse the tree for the
productions "expr->term", "term->factor", "factor->'('expr')'" and
"factor->NUMBER", giving the reduced tree

/ | \
NUMBER=2 '+' expr
/ | \

which will allow a simple recursive evaluator to quickly execute the
"code". The same principle can equally well apply to statement-type
code rather than expressions.

Also, you can (presumably) try to spot if large branches of the tree
are equal to other large branches, and fiddle the internal
representation so that that code is not evaluated twice. (As long as
it has no side effects, of course, but that's obvious.)

Anyone catch me in a silly mistake?
Simon Tatham E-mail to:
Trinity College, Cambridge, CB2 1TQ, England
[Recursively running the tree is equivalent to interpreting the equivalent
RPN, except that the data stack is implicit in the call stack. -John]


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.