C as a target language

Thomas Charles CONWAY <conway@cs.mu.OZ.AU>
14 Jun 1996 16:08:21 -0400

          From comp.compilers


From: Thomas Charles CONWAY <conway@cs.mu.OZ.AU>
Newsgroups: comp.compilers
Date: 14 Jun 1996 16:08:21 -0400
Organization: Compilers Central
References: 96-05-061 96-05-163 96-06-016
Keywords: UNCOL, C

Dave Lloyd (Dave@occl-cam.demon.co.uk) wrote:

>> I'm surprised no one has mentioned C as a reasonably successful
>> multi-language multi-target UNCOL. f2c, p2c, m2c, Scheme->C, sml2c,
>> Mercury, ghc, .... all

> Foot shot with the first example! C is terribly limited
> in scope and has some really nasty holes in its type
> system. Important information gets lost when squeezing
> down to C. Net result is that if you feed f2c to a good
> C compiler, you will do worse on the sort of numeric
> Fortran code than if you have a natural Fortran compiler
> which understands complex arithmetic, arrays, Fortran
> no-alias parameter passing, equivalence and common blocks,
> loops with a defined counter.

It is true that the translation process throws away useful information
about the source. (We compile Mercury to C. Mercury is a single-assignment
language with nondeterminism, and there is no way to capture these
properties directly in C.) On the other hand, you throw away the same
information, or more, when you generate assembly language or machine
code.

If you compile to "high level" C, using C's for/while loops and so on,
then the loss of this information is bad, since the optimizer in the C
compiler doesn't have access to all the right information. It is quite
possible to generate "low level" C instead, especially if your C
compiler has a few extra tricks up its sleeve. The Melbourne University
implementation of Mercury generates low level C code: we use our own
data representation, and the generated C manipulates "Word"s and
pointers to them, uses gotos and if-then to implement flow control, and
so on. Sometimes, while debugging the code generator, I found it easier
to read the assembly generated by the C compiler than the C code itself,
because of all the '{'s, '}'s and casts needed to keep the C compiler
happy.
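
To give a flavour of what "low level" C looks like, here is a minimal
sketch. It is not actual Mercury compiler output: the names Word, heap,
hp, r1..r3 and sum_list_to are made up for illustration. It builds a list
out of raw words and sums it using only labels, gotos, if-then and casts.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t Word;          /* untyped machine word: an integer or a pointer */

#define HEAP_WORDS 64
static Word heap[HEAP_WORDS];   /* the generated code manages its own heap */
static Word *hp = heap;         /* heap pointer: next free word */

/* Hypothetical generated code: build the list n, n-1, ..., 1 from
   two-word cells, then add up its elements. */
Word sum_list_to(Word n)
{
    Word r1, r2, r3;            /* stand-ins for abstract machine registers */

    assert(n >= 0 && 2 * n <= HEAP_WORDS);
    hp = heap;                  /* reset the heap so the sketch can be re-run */
    r1 = 1;                     /* next element to put on the front */
    r2 = 0;                     /* list built so far; 0 encodes the empty list */

build:
    if (r1 > n) goto sum;
    hp[0] = r1;                 /* head field */
    hp[1] = r2;                 /* tail field */
    r2 = (Word) hp;             /* a list is just a pointer stored in a Word */
    hp += 2;
    r1 = r1 + 1;
    goto build;

sum:
    r3 = 0;
walk:
    if (r2 == 0) goto done;
    r3 = r3 + ((Word *) r2)[0]; /* add the head */
    r2 = ((Word *) r2)[1];      /* follow the tail */
    goto walk;

done:
    return r3;
}
```

Calling sum_list_to(4) walks the cells 4, 3, 2, 1. The C compiler sees
only words, casts and jumps, which is roughly the view it would have of
assembly.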

To get the most out of this approach, some support is needed from the
C compiler. We use a gcc (GNU C) extension that allows us to use some of
the machine's registers explicitly, and another extension that lets us
take the addresses of labels and branch between functions (in this scheme
functions are just wrappers for the generated code). We have made quite
an effort to hide all the uses of such extensions in macros so that we
can provide portable replacements on machines which don't support such
extensions.
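
As a rough illustration of hiding an extension behind macros, the sketch
below uses GNU C "labels as values" (&&label and goto *) when available
and falls back to a switch otherwise. The macro names ADDR and JUMP, the
variable succip and the function run are assumptions made for this
example, not the macros Mercury actually uses. Explicit register
variables are left out because register names are machine-specific, and
the jump here stays within a single function.

```c
#include <assert.h>

#ifdef __GNUC__
/* GNU C: a code address is a void *, and "goto *" jumps to it. */
typedef void *Code;
#define ADDR(label)  (&&label)
#define JUMP(addr)   goto *(addr)
#else
/* Portable replacement: a code address is a small integer, and a jump
   is a switch over the labels that can be used as return points. */
typedef int Code;
enum { ret1_id, ret2_id };
#define ADDR(label)  (label##_id)
#define JUMP(addr)   switch (addr) { case ret1_id: goto ret1; \
                                     default:      goto ret2; }
#endif

/* Hypothetical generated code: "call" add_three twice, passing the
   return point as data in succip instead of using a C function call. */
long run(long x)
{
    Code succip;                /* where to continue after the "procedure" */
    long r1 = x;

    succip = ADDR(ret1);
    goto add_three;

add_three:                      /* r1 := r1 + 3, then return to the caller */
    r1 = r1 + 3;
    JUMP(succip);

ret1:
    succip = ADDR(ret2);
    goto add_three;

ret2:
    return r1;
}
```

The body of run is the same text under both definitions; only the macros
change, which is the point of keeping the extensions out of the generated
code itself.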

Obviously, compiling to "low level" C is more like targeting assembly,
so it loses some of the benefits of using C as a target language (data
structures, function calls and stack management, register allocation,
and so on), but it does allow your language implementation to achieve
higher performance, since your compiler can make more effective use of
the semantics of the source language.

P.S. If you are interested in how we use C as a target language, take a
look at the Mercury Homepage, where you'll find more info.
Thomas Conway conway@cs.mu.oz.au
