Re: 8-bit processor specific techniques

BGB <>
Tue, 29 Sep 2015 20:06:26 -0500

          From comp.compilers

Related articles
8-bit processor specific techniques ( (2015-09-27)
Re: 8-bit processor specific techniques (BGB) (2015-09-27)
Re: 8-bit processor specific techniques (Walter Banks) (2015-09-28)
Re: 8-bit processor specific techniques ( (2015-09-29)
Re: 8-bit processor specific techniques (BGB) (2015-09-29)

From: BGB <>
Newsgroups: comp.compilers
Date: Tue, 29 Sep 2015 20:06:26 -0500
References: 15-09-019 15-09-023 15-09-027
Keywords: code, architecture
Posted-Date: 29 Sep 2015 21:52:13 EDT

On 9/29/2015 12:55 PM, wrote:
> On Monday, 28 September 2015 20:26:51 UTC+1, Walter Banks wrote:

>> Unfortunately most of this material has never been published and
>> hasn't been the focus of research projects. The techniques used mostly
>> show up in application specific ISA's. This is the type of processor
>> whose applications tend not to be hosted and are small enough that
>> compilers can be exhaustive and often have tight execution
>> requirements.
> All the documented stuff is targeting 32+ bit arches.

if you mean things like LLVM, yes.

in terms of basic techniques, most of these don't really care all that
much what the CPU word size is.

likewise, not all 8/16-bit targets are equivalent.

older architectures (the 6502, Z80, and similar) tend to be rather
idiosyncratic, with a small number of specialized registers.

newer ones, such as MSP430 or AVR8, tend to have a more regular
instruction set and a larger number of registers (mostly GPRs with a few
special purpose registers).

for example, TAC/SSA would likely be more applicable to an MSP430 or AVR
than it would to something like a Z80 or 6502, but I could be wrong on
this front.

generally, a stack machine model can either be mapped fairly directly to
a native instruction set, or internally mapped (partially or fully) onto
registers or temporaries, making use of whatever registers exist; the
value stack need not map to memory and could exist entirely in CPU
registers.
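as a sketch of that idea (the mini stack IR and opcode names here are
made up for illustration, not from any particular compiler): a tiny
translator that keeps the value stack in virtual registers at compile
time, so stack code comes out as register-based three-address code with
no memory traffic for the stack itself.

```python
# Sketch: translate a stack-machine IR into register-based
# three-address code by tracking the value stack as a compile-time
# stack of virtual register names (hypothetical IR, names invented).

def stack_to_regs(ops):
    """ops: list like [("push","x"), ("push","y"), ("add",), ("store","z")]"""
    stack = []      # compile-time stack of register names
    code = []       # emitted three-address instructions
    next_reg = 0

    def fresh():
        nonlocal next_reg
        r = f"r{next_reg}"
        next_reg += 1
        return r

    for op in ops:
        if op[0] == "push":             # push variable onto the stack
            r = fresh()
            code.append(f"{r} = {op[1]}")
            stack.append(r)
        elif op[0] == "add":            # pop two values, push the sum
            b, a = stack.pop(), stack.pop()
            r = fresh()
            code.append(f"{r} = {a} + {b}")
            stack.append(r)
        elif op[0] == "store":          # pop into a named variable
            code.append(f"{op[1]} = {stack.pop()}")
    return code

# "z = x + y" expressed as stack code:
print(stack_to_regs([("push", "x"), ("push", "y"), ("add",), ("store", "z")]))
```

the same walk could just as well allocate real machine registers instead
of fresh virtual ones, spilling only when the compile-time stack gets
deeper than the register file.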

where TAC+SSA has an advantage is when more machine registers are
available than there are intermediate values in an expression, so it may
be worthwhile to keep values around temporarily (detecting and reusing
previously computed values, etc.).

TAC (without SSA) is generally better suited for code-generation though
(handled naively, SSA form could risk increasing register pressure and
MOV operations with little or no gain over plain TAC).

TAC has an advantage over RPN in terms of allowing more efficient code
to be generated on common targets with a simplistic code-generator, due
to mapping a little more directly to CPU registers and making it easier
to share patterns.

though, one possibility is a model where operations are performed
between a stack and variables, e.g., rather than operations like
"$z=$x+$y;" (TAC) or "$x; $y; +; =z;" (RPN), we have, say, "$x; +y; =z".
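to make the hybrid form above concrete (the "$x; +y; =z" syntax is the
post's informal notation; the interpreter itself is a hypothetical
sketch): each operation either pushes a variable, combines the stack top
with a named variable, or pops into a variable.

```python
# Sketch: interpreter for a stack-and-variable hybrid IR, where ops
# act between the stack top and named variables (hypothetical design).

def run_hybrid(prog, vars):
    stack = []
    for op in prog:
        if op.startswith("$"):      # $x : push variable x
            stack.append(vars[op[1:]])
        elif op.startswith("+"):    # +y : top-of-stack += y
            stack[-1] += vars[op[1:]]
        elif op.startswith("="):    # =z : pop top-of-stack into z
            vars[op[1:]] = stack.pop()
    return vars

v = run_hybrid(["$x", "+y", "=z"], {"x": 2, "y": 3})
print(v["z"])   # same effect as z = x + y
```

the appeal is that most operations carry a variable operand directly, so
common expression shapes need fewer stack-shuffling steps than pure RPN
while staying more compact than full three-address form.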

RPN could be better though for irregular CISC style targets, or for
recognizing patterns.

TAC or SSA could potentially risk more "noise" which would hinder
finding repeating patterns, as an otherwise equivalent sequence of
instructions may operate on a different set of temporaries. however,
TAC+SSA could still be useful after these patterns are found.

> I intend for the project to have multiple back-ends, using LLVM as one (to
> start and for access to Apple OSes) and a full Ada one. I'm not sure about
> just producing a final binary for 8-bit targets as there will be a system rts
> library for each target and that needs to be linked. Yes, it could be just
> built straight in memory. I'd have to look at it.

I have little idea about LLVM and 8/16 targets, as personally I still
haven't really made much use of LLVM (it hasn't really tended to align
all that well with what I am doing).

I have some stuff that runs on ARM chips, but thus far it is using a
threaded-code interpreter, rather than generating native code.

> [Re linker, the library is just a library of intermediate code. Pull in
> the parts you need, optimize and generate the code for the whole program.


conventional object-file based linking isn't really recommended for
small targets, since linking object files will try to pull in code or
data that isn't needed in the final image (C runtime libraries have
often tried to minimize this by putting each function in its own object
file), and space is at a premium on an 8/16 target. ideally, the
inclusion should be a bit more fine-grained than this.

better is pulling in only individual functions, and then trying to omit
any code which is unreachable within those functions, or data that isn't
used by the reachable code.
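the function-granularity part can be sketched as a reachability walk
over a call graph (the graph layout and function names here are made
up): starting from the entry point, only functions actually referenced
get pulled into the image.

```python
# Sketch: function-granularity "linking" as call-graph reachability
# from the entry point (hypothetical call graph and names).

def reachable(call_graph, entry):
    """call_graph: {func: [callees]}; returns the set of needed funcs."""
    needed, work = set(), [entry]
    while work:
        f = work.pop()
        if f in needed:
            continue
        needed.add(f)
        work.extend(call_graph.get(f, []))
    return needed

graph = {"main": ["print", "init"], "init": ["memset"],
         "print": [], "memset": [], "unused_helper": ["print"]}
print(sorted(reachable(graph, "main")))
# "unused_helper" never makes it into the image
```

the same walk extends naturally to data: treat each global as a node
and add edges from the functions that reference it.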

as noted before, it probably makes sense to try to eliminate repeating
patterns, in addition to avoiding any code which isn't used (such as
branches into code which won't actually be executed).

this is partly where LZ would come in. to some extent, it could be
possible to first break things into basic-blocks and potentially use
constant-propagation or similar to detect branches which would never
be taken.
LZ compression could help detect patterns between functions for which it
may make sense to try to eliminate them, but this may require care in
the design of the IR (to make patterns easier to detect and utilize).
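one way to picture the LZ-style matching (a simplified sketch; real LZ
matching uses hash chains or suffix structures rather than this brute
force, and the IR opcodes here are invented): find the longest
instruction subsequence shared by two functions, which is then a
candidate for outlining into a common routine.

```python
# Sketch: longest common contiguous instruction run between two
# functions, in the spirit of an LZ match (brute force, hypothetical
# IR; real matchers use hash chains or suffix structures).

def longest_repeat(a, b):
    """Longest shared contiguous run of instructions between a and b."""
    best = []
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > len(best):
                best = a[i:i + k]
    return best

f1 = ["load r0", "add r0 r1", "store r0", "ret"]
f2 = ["load r0", "add r0 r1", "store r0", "jmp loop"]
print(longest_repeat(f1, f2))   # shared run = outlining candidate
```

this also illustrates the IR-design point in the post: the match only
works because both sequences name the same registers, so an IR that
normalizes temporaries makes such patterns far easier to find.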

aggressive inlining and constant propagation could potentially also
increase the number of such patterns enough to outweigh the cost such
inlining would otherwise incur (though it could potentially also
backfire and make the output larger).

granted, I could be wrong on all this, as this is outside of areas for
which my existing experience applies.
