Re: backend question
24 Nov 2002 01:25:18 -0500

          From comp.compilers

Related articles
[5 earlier articles]
Re: backend question (2002-11-17)
Re: backend question (Joachim Durchholz) (2002-11-20)
Re: backend question (David Chase) (2002-11-20)
Re: backend question (Fermin Reig) (2002-11-24)
Re: backend question (felix) (2002-11-24)
Re: backend question (Fergus Henderson) (2002-11-24)
Re: backend question (2002-11-24)
Re: backend question (Mark) (2002-11-24)
Re: backend question (Nick Maclaren) (2002-11-24)
Re: backend question (Joachim Durchholz) (2002-11-24)
Re: backend question (Nick Maclaren) (2002-11-26)
Re: backend question (Fergus Henderson) (2002-12-01)
Re: backend question (Fergus Henderson) (2002-12-01)
[4 later articles]
| List of all articles for this month |

Newsgroups: comp.compilers
Date: 24 Nov 2002 01:25:18 -0500
Organization: University of California, Riverside
References: 02-11-063 02-11-078 02-11-099 02-11-104
Keywords: C, tools
Posted-Date: 24 Nov 2002 01:25:18 EST

Joachim Durchholz <> wrote:
+ wrote:
+> Joachim Durchholz <> wrote:
+> +
+> + [C] begins to bite if you're doing
+> + concurrency, exceptions, fancy integer arithmetic, tail-call
+> + elimination, or state machines.
+> Standard C lacks an indirect jump (though one can be kludged a switch
+> statement).
+ Agreed.
+> Anything that can be done in say MIPS assembly language
+> can be done in gcc, which has an indirect goto.
+ Indirect goto is good for state machines (and it's been used for that
+ purpose). It doesn't help with the other issues.
+> Where necessary, one can generates comments that preserve the memory
+> of what got abstracted away.
+ It's not abstracted away in the generated C code, it's abstracted away
+ by the language. For example, tail-call elimination goes like this:
+ For the last subroutine call in a routine, don't push the parameters
+ and do a JSR; instead, pop the return address and the parameters of
+ the caller, push the parameters of the callee, push the popped return
+ address, then do a direct jump to the subroutine. (You can optimize
+ the push-pop sequence if the new stack frame is not larger than the
+ old one.) Impossible in C since C abstracts away the call stack. (A
+ Good Thing for human-written programming languages. Unsuitable for
+ compiling functional languages where loops are represented as
+ recursions.)

The assembly language output of, say, a MIPS-targeted compiler
translates almost one-for-one into the following subset of GCC, which
I call "C-minor":
    * three-address ALU instructions, e.g., "x = y + z"
    * load and store instructions, e.g.:
                    "x = p[i];"
                    "p[i] = x;"
    * unconditional jumps, e.g., "goto L5;"
    * conditional jumps, e.g., "if( x ) goto L5;"
    * indirect jumps, e.g., "goto* p;" which can be kludged via
        standard switch statements.
The widths of three-address and of load/store instructions can be
controlled via casts where necessary.

+ You can work around a lot of limitations of C - it's going to be ugly
+ and inefficient, but it's possible. Tail-call elimination is not
+ feasible (unless gcc offers another extension to handle exactly this,
+ of course *g*).

The assembly language code for managing the run-time stack consists of
instructions that increment and decrement the stack pointer, which
translate to C instructions such as: "SP = SP + size;" If you compile
that statement with a typical C compiler, it might increment something
other than the register you have in mind, e.g., it might increment a
memory location. But that's a matter of implementation.

When you generate MIPS assembly code as your target langage, you must
assemble it with an assembler that interprets SP according to your
intent. The same holds if you translate to the C language,
i.e. compiling the target code with an ordinary conforming C-compiler
will produce correct but inefficient behavior. (The efficiency will
be considerably better than instruction-by-instruction
interpretation.) It's not difficult to write a simple translator that
translates C-minor to almost one-for-one to assembly language that
assembles to machine code of normal efficiency. That's particularly
true for simple RISC architectures -- achieving efficiency with more
complex architectures requires a more complex translator.

Tom Payne
[If you know exactly what compiler and target machine you're compiling
to, you can indeed do lots of gross but effective hacks in C. But if
you want your translated code to be at all portable, you can't try to
go mucking with the stack. -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.