Related articles |
---|
Limited Register State Restore nicolas_capens@hotmail.com (2003-12-03) |
From: | nicolas_capens@hotmail.com (c0d1f1ed) |
Newsgroups: | comp.compilers |
Date: | 3 Dec 2003 20:41:20 -0500 |
Organization: | http://groups.google.com |
Keywords: | analysis, optimize |
Posted-Date: | 03 Dec 2003 20:41:20 EST |
Hi,
My question is related to the linear copy-propagation algorithm I
posted here a few weeks ago
(http://groups.google.be/groups?dq=&start=25&hl=nl&lr=&ie=UTF-8&group=comp.compilers&selm=03-11-046%40comp.compilers).
What I'm trying to do now is minimize the number of memory accesses
when jumping.
Previoulsy what I did was simply writing all registers to memory at
every jump and label. But this is obviously inefficient because even
tiny jumps would write all registers in use to memory, even if they
are not modified by the conditional code. And since my application
makes heavy use of MMX and SSE, it can be over 200 bytes of redundant
memory accesses per jump!
The first step of the solution was quite easy, but cumbersome: Before
the jump (or before the label), I store the complete register
allocation table. Then at the label (or the jump) I restore that
state, by writing back all registers that store different variables.
It works very well but it involves a lot of 'manual' work because I
have to do it individually per jump.
The trivial solution would be to store the register allocation state
for -every- jump and label, and determine automatically what state has
to be restored. However, because my code is linearly run-time
generated, I have no 'lookahead' and I don't know when certain states
will not be referenced again. And since the allocation state is quite
a large table this keeps eating memory till the end of my code
generation phase.
I hope my title has become a bit clear now... What I'd like to do is
cache only a limited amount of allocation states in a fixed array. It
is acceptable that for deeply nested conditional code the last used
states get lost and no optimization is done. In other words, it is
most important to optimize the small inner loops.
But there's a problem with this, and I think it's best to illustrate
it with an example:
cond_jump label1
; ...
cond_jump label1
; ...
label1:
At the first jump, I would store the register allocation state and
label1 name. Normally, when I reach the second jump, I would recognise
the label1 and restore the register state, and do the same at the
label1:. This results in consistency for whatever path is taken,
because it always restores the original state.
But now imagine I loose the allocation information after the first
jump, for example because there are other conditional blocks in
between which store their state in our limited array of register
allocation states. What would happen is that the second jump is
interpreted as the first that references label1. So now at label1: we
restore the state as it was at the second jump. This is clearly wrong
because if the first jump is taken we expect to go to a situation
where it's state is preserved, but here it's the second jump's state
that is preserved, which most likely is not equal.
What should have happened is that somehow we write all registers to
memory at the first jump and do the same at the label. The former is
not a big problem, as we can give feedback about what instructions get
written to the final buffer, just like my linear-scan copy-propagation
algorithm. But the latter is not possible, because we can't know the
label has been removed from the array...
So, I wondered if anyone has a lot of experience with run-time code
generation and could tell me how to solve this elegantly?
Yours sincerely,
Nicolas Capens
Return to the
comp.compilers page.
Search the
comp.compilers archives again.