Dataflow and function calls

Yuri Gribov <tetra2005@googlemail.com>
Thu, 23 Jul 2009 10:01:43 +0400

From comp.compilers

Related articles
*Dataflow and function calls tetra2005@googlemail.com (Yuri Gribov)* (2009-07-23)**
Re: Dataflow and function calls gopi.onthemove@gmail.com (gopi) (2009-07-28)
Re: Dataflow and function calls tetra2005@googlemail.com (Yuri Gribov) (2009-07-29)
Re: Dataflow and function calls kamalpr@gmail.com (kamal) (2009-08-01)

| List of all articles for this month |

From:	Yuri Gribov <tetra2005@googlemail.com>
Newsgroups:	comp.compilers
Date:	Thu, 23 Jul 2009 10:01:43 +0400
Organization:	Compilers Central
Keywords:	optimize
Posted-Date:	24 Jul 2009 18:29:32 EDT

Dear all,

I am somewhat new to writing compilers so probably this is a very
basic question. I tried to find answers in textbooks but failed.

Some preliminary info: we are developing a home C compiler (not very
complicated but with full set of scalar optimizations). It is already
generating working code so we moved on to optimization phase (global
reg allocation, loop invariants, etc.).

Now comes the question. Imagine that I want to calculate reach-defs
via standard dataflow techniques (we use simple three-address code
without SSA). How should I handle call instructions? Should call kill
all definitions or keep them?

If I choose the first variant I run into (just one of the examples)
problem with loop invariants:
//x, y and z are locals
for(...) {
    ...
    x = 1;
    ...
    f();
    x = y + z; //x will be incorrectly moved because there are no reaching defs
}

In second case I break (again that is just an example) const propagation:
int a = 0, x;
f();
x = a; //a will be (incorrectly) changed to 0 because reached by
single const definition

Currently I use an awkward approach which I do not like (it is very
non-intuitive and looks like a hack). Firstly I calculate liveness
info considering that call does not kill anything (second case). Then
before every call I insert store and after every call - load for every
variable which lives after the call (BTW I also do this for ambiguous
assignments *p = 1) and recalculate liveness/reach-defs. There are
several problems with this approach:
1) it introduces non-trivial dependencies among compiler phases; e.g.
if liveness changes (due to some optimization) I need to find all
loads/stores, remove them and create new); firstly I arranged phases
in specific order but it becomes harder and harder;
2) the generated code may be quite inefficient; e.g.
//x is defined here
for(...) {
    f();
}
//x is used here

will be transformed to

for(...) {
    st x;
    f();
    ld x;
}

but clearly I need

st x;
for(...) {
    f();
}
ld x;

Finally my questions are:
0) How do you generally handle call instructions when doing dataflow
analysis?
1) Is my approach with explicit loads/stores correct? If yes - how do
I place them to avoid inefficient code above?
2) Why is this important topic (calls/ambiguous commands + dataflow,
optimal placing of loads/stores) not covered in compiler textbooks (I
did not find it neither in dragon-book nor in draft version of
"Engineering a compiler")?
3) Could you recommend a textbook or online course which covers such
deep but important details?

Any hints will be appreciated!

--
s UWAVENIEM,
`RIJ

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Dataflow and function calls

Yuri Gribov <tetra2005@googlemail.com>Thu, 23 Jul 2009 10:01:43 +0400

Yuri Gribov <tetra2005@googlemail.com>
Thu, 23 Jul 2009 10:01:43 +0400