Re: Processor specific optimisations

Dave Hudson <dave@cyclicode.net>
18 Jan 2002 21:04:44 -0500

          From comp.compilers


From: Dave Hudson <dave@cyclicode.net>
Newsgroups: comp.compilers
Date: 18 Jan 2002 21:04:44 -0500
Organization: Compilers Central
References: 02-01-077
Keywords: optimize
Posted-Date: 18 Jan 2002 21:04:44 EST

Hi Mickaël,


Mickaël Pointier wrote:


> Now, I wonder how it's possible to obtain a good result for processors
> like the "good" old 6502 where most optimisation tricks are based on
> dealing with zero page accessing, wrap around tricks with index
> registers, self modifying code (for RAM based code), alignment of data
> on page boundaries, and so on.




Anything's possible, but it probably requires working out what all of
those tricks were and applying them as additional transformations
within the compiler. As these tricks will probably be very processor
specific it's unlikely that anyone's spent a lot of time doing them
unless there's either a major commercial reason to do so or it's a
labour of love :-)
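
As a rough illustration of what one such processor-specific
transformation might look like, here is a minimal sketch of the
zero-page trick written as a pass over a made-up instruction list. The
tuple format, the function names and the sample program are all
assumptions for illustration and are not taken from any existing 6502
compiler. The only 6502 facts relied on are that the listed mnemonics
have both absolute and zero-page forms and that the zero-page form is
one byte shorter.

```python
# Hypothetical instruction format: (mnemonic, addressing_mode, operand).
# Every mnemonic listed here has both an absolute and a zero-page
# encoding on the 6502.
ZP_CAPABLE = {"LDA", "STA", "LDX", "LDY", "STX", "STY", "ADC", "SBC",
              "AND", "ORA", "EOR", "CMP", "CPX", "CPY", "INC", "DEC",
              "ASL", "LSR", "ROL", "ROR", "BIT"}

# Encoded size in bytes for the addressing modes used in this sketch.
SIZE = {"abs": 3, "zp": 2, "imm": 2, "impl": 1}


def shrink_to_zero_page(code):
    """Rewrite absolute accesses to page zero as zero-page accesses."""
    out = []
    for mnemonic, mode, operand in code:
        if mode == "abs" and mnemonic in ZP_CAPABLE and 0 <= operand <= 0xFF:
            mode = "zp"
        out.append((mnemonic, mode, operand))
    return out


def code_size(code):
    """Total encoded size of the instruction list in bytes."""
    return sum(SIZE[mode] for _, mode, _ in code)


# Sample program (assumed): increment a variable at $0010, then store
# the result to $0400 as well.
program = [
    ("LDA", "abs", 0x0010),
    ("CLC", "impl", None),
    ("ADC", "imm", 1),
    ("STA", "abs", 0x0010),
    ("STA", "abs", 0x0400),
]
optimized = shrink_to_zero_page(program)
print(code_size(program), "->", code_size(optimized))
```

The rewrite itself is trivial; the hard part a compiler would have to
solve is deciding which variables deserve the scarce zero-page
locations in the first place.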




> I ask this question because for the moment all the C (cross) compilers
> that have been retargeted for the 6502 (CC65, LCC65, Quetzalcoatl,
> ...) produce code that is at best "bad". When I write "bad", I mean
> that I usually manage to recode the same routine with a speed up
> between x4 and x20 :'(




Sadly, most 8-bit compilers I've encountered have produced at best
average and usually quite bad code. I think that part of the problem
with these, though, is that many of them don't support all of
the higher-level optimizations that you'd find in, say, gcc.




> So, the question finally is: Is it possible to apply the modern
> compiler optimisation strategies to this old processor and have a
> result that an experienced 6502 assembly coder would have a hard time
> to beat ?


I would say that the answer is yes, but that it takes a lot of work.
If you look at some of the 8-bit ports of gcc (it's the only one I can
really comment on) it can actually do a pretty respectable job. The
AVR port produces very good code in most instances (I used to regularly
find that it was almost optimal for some functions) and the 68HC11
port seems to be similarly well regarded.


I've spent much of the last 12 months rewriting a lot of the backend of
the Ubicom IP2022 port (another 8-bit one) and we're now approaching
the stage where it frequently produces code that's as good as we can
write in assembler. In order to really push things though I've had to
add quite a number of new optimization passes to first stitch things
back together that were split because of only having one offsettable
data pointer register and then to progressively chop larger
instruction sequences up in the way that an asm programmer would do.
It means that the machine-dependent-reorg part of our port looks
somewhat large, but it demonstrates that it can be done. FWIW, 12
months ago we had a working port based purely on the way most of the
ports for larger processors have been done, but the code it generated
was around 2.5x to 3x the size we get now and probably 5x to 6x
slower.
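
To give a flavour of the "stitching back together" idea, here is a
minimal sketch of one possible late pass: dropping reloads of a single
data pointer when it is already known to hold the right base. The
pseudo-instructions are invented for this example and are not IP2022
opcodes, and this is not a description of the actual gcc port; it only
assumes a machine with one offsettable data pointer, where a naive
code generator reloads that pointer before every memory access.

```python
# Hypothetical pseudo-instructions (NOT real IP2022 opcodes):
#   ("set_dp", base)   load the single offsettable data pointer with base
#   ("load", offset)   accumulator <- memory[dp + offset]
#   ("store", offset)  memory[dp + offset] <- accumulator
#   ("call", name)     call a routine that may change dp
#   ("label", name)    branch target, so dp is unknown on entry


def drop_redundant_dp_loads(code):
    """Remove set_dp instructions that reload the value dp already holds."""
    out = []
    current_dp = None  # None means the contents of dp are unknown
    for insn in code:
        op = insn[0]
        if op == "set_dp":
            if current_dp is not None and insn[1] == current_dp:
                continue  # dp already points at this base, skip the reload
            current_dp = insn[1]
        elif op in ("call", "label"):
            current_dp = None  # be conservative: forget what dp holds
        out.append(insn)
    return out


# Naive output (assumed): the pointer is set up before every access.
naive = [
    ("set_dp", "a"), ("load", 0),
    ("set_dp", "a"), ("load", 1),
    ("set_dp", "b"), ("store", 0),
    ("call", "f"),
    ("set_dp", "b"), ("store", 1),
]
stitched = drop_redundant_dp_loads(naive)
print(len(naive), "->", len(stitched))
```

The reload after the call is kept because the sketch assumes nothing
about what the callee does to the pointer; a real pass would need
proper dataflow information to do better than this straight-line scan.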


For the record, the IP2022 is an 8-bit accumulator-based processor
(one 8-bit accumulator) and has one offsettable stack pointer, one
offsettable data pointer and one non-offsettable data pointer. All
our "registers" are in fact directly addressed memory locations in the
first 256 bytes of RAM, although RAM access is single-cycle whether it
is direct or through a pointer, and pretty much every opcode takes a
memory argument. This is not your typical 32-bit load/store RISC core
:-)


Regards,
Dave

