From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.compilers
Date: Wed, 11 Feb 2009 16:15:45 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
References: 09-02-021 09-02-025 09-02-031
Keywords: assembler, practice
Posted-Date: 11 Feb 2009 17:26:11 EST

"Bartc" <bartc@freeuk.com> writes:
>I've tried targetting C and it was completely unsatisfactory.
>
>First, there are a number of extra hoops to jump through in order to
>generate C source code (syntax, formatting, creating suitable names
>when your language uses namespaces perhaps).
>
>Then, to implement certain features of your language might involve
>using casting and other tricks in the generated C when it's datatypes,
>and using a lot of gotos and labels when it's syntax.
>
>Then you make the discovery that you can use casting and gotos for
>nearly *all* your language constructs, meaning most features of C are
>not needed
Yes, C is a little too high-level for that job, but it's still the
best portable assembly language we have. One can work around most of
these issues as you point out, though.
If your source language is close enough to C, you can even map some of
the source features directly to the corresponding C features (in
particular if these source features are more restricted).
> Functions are still
>needed, but only just.
Actually functions and other stateful control flow are the worst
problem with C as portable assembly language. C offers no usable
lower-level way to do this kind of stuff. If your source language has
stateful control flow that does not map to functions, function calls,
and setjmp()/longjmp(), you will have a hard time implementing the
programming language, and you may lose a lot of performance. Examples
of such features are guaranteed tail calls, exceptions (may be
mappable to setjmp/longjmp), and backtracking.
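
As an illustration of the setjmp()/longjmp() mapping mentioned above, here is a minimal sketch of what a code generator might emit for a source-level try/catch. The names (current_handler, throw_exception, checked_div, div_or_default) are hypothetical and not taken from any particular compiler; a real runtime would also have to deal with cleanup code and with locals modified inside the try block.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical runtime state: the innermost active handler, if any. */
static jmp_buf *current_handler = NULL;

/* "throw": unwind to the innermost handler; abort if there is none. */
static void throw_exception(int code)
{
    if (current_handler == NULL)
        abort();
    longjmp(*current_handler, code);
}

/* A callee that may throw. */
static int checked_div(int a, int b)
{
    if (b == 0)
        throw_exception(1);
    return a / b;
}

/* Source: try { result = a / b } catch { result = fallback } */
int div_or_default(int a, int b, int fallback)
{
    jmp_buf handler;
    jmp_buf *saved = current_handler;  /* set before setjmp, not changed after */
    int result;

    current_handler = &handler;
    if (setjmp(handler) == 0) {
        result = checked_div(a, b);    /* try body */
    } else {
        result = fallback;             /* catch body, entered via longjmp */
    }
    current_handler = saved;           /* pop the handler on both paths */
    return result;
}
```

Every entry into a try block pays for a setjmp() call here, whether or not anything is thrown.
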
One way to work around that would be to map all control flow to gotos,
and manage the state explicitly, but this would typically mean
translating all the source code into one C function, which will blow
up most C compilers for larger source programs, and will disable any
separate compilation features the source language has. Also,
simulating indirect jumps (as necessary to implement, e.g., returns)
in standard C is slow.
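
A small sketch of the single-function approach, under the assumption that return addresses are encoded as small integers and kept on an explicit stack. The function name, label names, and the fixed stack size are made up for the example. The switch at do_return is the standard C stand-in for an indirect jump.

```c
/* Two source-level routines ("main" and "double_it") compiled into one
   C function.  Calls push a return label number; returns pop it and
   dispatch through a switch. */
int run_program(int x)
{
    int ret_stack[16];  /* explicit return stack (hypothetical fixed size) */
    int sp = 0;
    int acc = x;

    /* main: acc = double_it(acc); acc = double_it(acc); return acc */
    ret_stack[sp++] = 1;
    goto double_it;
after_call_1:
    ret_stack[sp++] = 2;
    goto double_it;
after_call_2:
    return acc;

double_it:
    acc = acc * 2;
    goto do_return;

do_return:
    /* simulated indirect jump: one case per possible return site */
    switch (ret_stack[--sp]) {
    case 1: goto after_call_1;
    case 2: goto after_call_2;
    }
    return -1;  /* not reached for valid return labels */
}
```

Every return goes through the switch, and the switch needs one case for every call site in the whole program, which is why this gets expensive and unwieldy as the program grows.
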
Other ways are to compile the program into several C functions, and
then to arrange some calling and returning between them to satisfy
C's restrictions even if that would not be necessary in a lower-level
language. E.g., one way to implement tail-calls is to compile every
source function into a C function, have each tail-calling function
return the next function to be called, and have a call loop that calls
these functions one after the other.
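
The call-loop technique just described can be sketched as follows. This is only an illustration: the struct wrapper is needed because a C function type cannot directly name itself as its return type, and the names (state, cont, is_even, is_odd, run_is_even) and the convention of passing arguments through a state struct are assumptions made for the example.

```c
#include <stddef.h>

/* Arguments and results live in an explicit state record. */
struct state { unsigned long n; int result; };

/* Each compiled function returns the next function to call (or NULL). */
struct cont;
typedef struct cont (*step_fn)(struct state *);
struct cont { step_fn next; };

static struct cont is_even(struct state *s);
static struct cont is_odd(struct state *s);

/* Source: is_even(n) = n == 0 ? true : is_odd(n - 1)   (tail call) */
static struct cont is_even(struct state *s)
{
    struct cont k;
    if (s->n == 0) { s->result = 1; k.next = NULL; return k; }
    s->n--;
    k.next = is_odd;
    return k;
}

/* Source: is_odd(n) = n == 0 ? false : is_even(n - 1)  (tail call) */
static struct cont is_odd(struct state *s)
{
    struct cont k;
    if (s->n == 0) { s->result = 0; k.next = NULL; return k; }
    s->n--;
    k.next = is_even;
    return k;
}

/* The call loop: the C stack stays flat however long the tail-call chain is. */
int run_is_even(unsigned long n)
{
    struct state s = { n, 0 };
    struct cont k = { is_even };
    while (k.next != NULL)
        k = k.next(&s);
    return s.result;
}
```

Each source-level tail call costs a C return plus an indirect call here, where a lower-level target would need only a jump.
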
>So you end up with a target language which is a travesty of C,
It's certainly not something a human would write as C source code.
But then the assembly output of a compiler is also not something that
a human would write as assembly language source code. Is it a
travesty of assembly language then?
>and
>while it might be portable, won't compile to great code because the
>structure the C compiler depends on is missing.
What kind of "great code" do you have in mind?
The code generation quality and optimizations that I expect from a C
compiler don't depend on a structure; a good C compiler will perform
decent code selection, register allocation, and instruction scheduling
without a particular structure in the source code. That's what I
expect from a portable assembly language.
One other optimization I relied on in my generated C code is copy
propagation/register coalescing, and that also does not need a
particular structure.
An optimizing C compiler will typically also recover, e.g., the loop
structure from code written with gotos, and may then perform loop
optimizations (although that's already not very reliable with
hand-written source code).
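
For illustration, this is the kind of goto-based loop a code generator might emit (the function name sum_goto is made up). The control-flow graph is the same as that of an ordinary for loop. Whether a given compiler at a given optimization level actually recognizes and optimizes it is not guaranteed.

```c
/* Sum of a[0..n-1], written with labels and gotos instead of a for loop. */
long sum_goto(const int *a, int n)
{
    long acc = 0;
    int i = 0;
loop_test:
    if (i >= n)
        goto loop_exit;
    acc += a[i];
    i++;
    goto loop_test;   /* back edge that forms the loop */
loop_exit:
    return acc;
}
```
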
>(And there's the headache of converting errors/line numbers in C
>source back to your original code.)
The #line directive helps here, and you don't have anything else when
you compile to assembly language.
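
A minimal sketch of how generated C can use #line, assuming a hypothetical source file "demo.src" whose line 42 produced the code below. The C compiler then reports diagnostics, __LINE__, and __FILE__ (and typically debug information) in terms of the original source rather than the generated file.

```c
/* Generated from line 42 of the hypothetical source file demo.src. */
#line 42 "demo.src"
int reported_line(void) { return __LINE__; }
const char *reported_file(void) { return __FILE__; }
```

The directive stays in effect for the lines that follow it, so a generator normally emits a new #line before each chunk of code that comes from a different source line.
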
If the errors you have in mind are compile-time errors, they should
not happen with something called a compiler; i.e., you should not
generate code in your target language that the target language
processor (whether it is a C compiler or an assembler) rejects.
- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/