Re: RPN as intermediate code?

lkaplan@BIX.com (Leonard Kaplan)
Fri, 25 Aug 1995 21:34:57 GMT

          From comp.compilers

Related articles
RPN as intermediate code? gorelick@esther.la.asu.edu (1995-08-22)
Re: RPN as intermediate code? lkaplan@BIX.com (1995-08-25)
Re: RPN as intermediate code? jaidi@technet.sg (1995-08-26)
Re: RPN as intermediate code? odersky@ira.uka.de (1995-08-29)
Re: RPN as intermediate code? ukln@rzstud2.rz.uni-karlsruhe.de (1995-08-31)
Re: RPN as intermediate code? hbaker@netcom.com (1995-09-02)
Re: RPN as intermediate code? mejohnsn@netcom.com (1995-09-11)
Re: RPN as intermediate code? bevan@cs.man.ac.uk (1995-09-06)
[1 later articles]
| List of all articles for this month |

Newsgroups: comp.compilers
From: lkaplan@BIX.com (Leonard Kaplan)
Keywords: interpreter, design
Organization: Galahad
References: 95-08-164
Date: Fri, 25 Aug 1995 21:34:57 GMT

Noel Gorelick wrote:
>I am struggling trying to convert my C interpreter from a tree-based
>design to an intermediate-code design.


(snip)


>I now want to move to storing (and executing) some form of intermediate
>code. I really like the idea of encoding into an 'execution string', like
>bc does, and reverse polish works great for this, as long as I stick to
>encoding expressions. But this model falls apart when I get into things
>like conditionals and loops, and I am completely baffled how to do
>short-circuit logic like this. For instance, I can compile the following
>code into the RPN strings shown (which execute easily):


(snip)


>However, when I get into conditionals and loops, I have absolutely no idea
>how to proceed. Do I really have to get all the way down to the level of
>computing jump offsets and implement these as conditional gotos?


You could either use that approach (effectively both compile _and_ assemble
for your runtime "machine", I suppose), or use conditional branches and
store the target addresses into your dictionary (just compile). I've used
the following approach successfully:


Traditionally, Forth implements conditionals using two branch instructions,
call them BRANCH and ZBRANCH. BRANCH is unconditional, the target address
(or offset) is always "jumped to". ZBRANCH only performs the branch if the
current top-of-stack (TOS) is zero. An "if" statement like this:


        if (a > b){
            stuff;
            stuff;
        }


Would generate code something like:


        a
        b
        >
        ZBRANCH xx
        stuff
        stuff
        (this is location xx)


A more complex case,


        if (a > b){
            stuff1;
            stuff2;
        }else{
            stuff3;
            stuff4;
        }


Would generate


        a
        b
        >
        ZBRANCH xx
        stuff1
        stuff2
        BRANCH yy
        (this is location xx)
        stuff3
        stuff4
        (this is location yy)


A loop like


        while (a > b){
            stuff5;
            stuff6;
        }


looks like


        (this is location xx)
        a
        b
        >
        ZBRANCH yy
        stuff5
        stuff6
        BRANCH xx
        (this is location yy)


And so on. One disadvantage to this approach (or maybe not, you haven't
said whether your interpreter scans and runs line-by-line, or does an
entire tokenizing pass, _then_ runs) - you must always scan far enough
ahead to resolve any necessary addresses. In Forth, this is not a problem,
the system traditionally has two states, compiling and executing. While
compiling, it even uses the expression stack to hold both branch target
locations, and locations where the target addresses are to be stored (the
word following the "BRANCH" instruction, essentially). Also, Forth's
conditional words (IF, THEN, ELSE, and so on) actually _execute_ during
compilation, and generate the above structures into the dictionary. They
are called "compiling words" for that reason.


>And why can't I find any literature about this sort of thing? All the
>compiler books I can find say 'Here is reverse polish. Its a good
>intermediate representation because its stack-like', but then they go off
>and do something completely different.


I've often wondered that myself. Interpreters don't usually get much more
than a brief mention, anyway, in most compiler books. I learned how to set
up the above branch implementation by looking at the source code for
various Forth implementations.


>[I've used conditional jumps with ofsets. They're ugly, but they're not
>hard
>to do. The alternative is to have tokens like WHILE ENDWHILE, IF THEN
>ELSE ENDIF and have the interpreter scan over parts you're skipping and
>keep
>a stack of places to retur tom but that's a lot of scanning and skipping.
>-John]


That would certainly work, though you would end up with more work at
runtime - unless you resolve things during a prepass, at which point you're
pretty-much back to jumps and offsets anyways!.


-Len
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.