|need some insights (dynamic code genration on DSP's) firstname.lastname@example.org (Nils Pipenbrinck) (2007-08-07)|
|Re: need some insights (dynamic code genration on DSP's) email@example.com (Eliot Miranda) (2007-08-09)|
|From:||Nils Pipenbrinck <firstname.lastname@example.org>|
|Date:||Tue, 7 Aug 2007 22:08:37 +0200|
|Keywords:||DSP, code, question|
|Posted-Date:||07 Aug 2007 16:41:56 EDT|
Hi compilers group.
I'm working on a small dynamic code generation project and I need some
help or direction where to look at (essencial books ect).
Some background first:
My goal is to write a dynamic code generator for multi-media
applications running on a DSP (TI C64Plus). Just simple rendering
loops - two times nested will be the most complex thing I'll ever have
to deal with. I did something similar before on a MIPS 64 style CPU
(NEC Vr5500 CPU), and I have decades of experience outperforming
compilers with hand written assebly code.
Just glueing pieces of code together on the MIPS, and running a
register allocator did the job on ordinary CPU's so far.
On the DSP however things are a lot more complicated. I have to
schedule instructions and I have to obey lots of resource
constraints. Just gluing together code sequences does not work
any more. Well - it does work technically, but the performance suffers
so badly that I have to do something else.
I decided to get one abstraction level higher, build a DAG like
datastructure that directly maps onto the DSP instructions and work
downward from that.
At the moment of writing I can compile such a DAG to code as long as
it's entirely sequential code. Performance is much better than I would
have expected for my "first try" scheduling heuristic. I'm very
satisfied with the results so far.
However, I have problems when it comes to loops:
How should I handle things that change in each iteration: Like a
data-pointer that increments in each iteration, or simply the humble
I could add a souce dependency from the loop counter at the start of
the loop to the "updated loop counter", but that in turn would cause my
my "keep it simple and stupid" scheduler from ever scheduling
instructions that depend on the loop counter. The problem boils down
to to the fact that I only schedule something if all its destination
dependencies have been scheduled. My loop counter however is circular,
so it will never meet this requirement.
There must be a simple trick to handle these kinds of circular
dependencies, I could hack my way through that problem but somehow I
feel (coders intuition) that there must be a clean and elegant way to
deal with it.
Btw: I do my scheduling bottom - up, starting with the result (a write
to memory in 99% of all cases) and work myself up through the source
dependency tree. Register allocation runs in a second pass top-down. I
don't expect that I'll ever have to spill registers as I have 64 of
them, and my test-code - which is worst case - rarely has more than 20
It's a simple thing, I just have to get the loops working.
Could you guys shed some light on it or give me some essential reading
Thanks for reading :-)
Nils from Hamburg/Germany
Return to the
Search the comp.compilers archives again.