18 Aug 1999 03:40:34 -0400

Related articles |
---|

Side effects in expressions mikov@usa.net (Tzvetan Mikov) (1999-08-18) |

Re: Side effects in expressions johnmce@world.std.com (1999-08-18) |

Re: Side effects in expressions johnmce@world.std.com (1999-08-21) |

From: | "Tzvetan Mikov" <mikov@usa.net> |

Newsgroups: | comp.compilers |

Date: | 18 Aug 1999 03:40:34 -0400 |

Organization: | Posted via Supernews, http://www.supernews.com |

Keywords: | optimize, comment |

Hi,

I have been reading Robert Morgan's book "Building an Optimizing

Compiler". I like it a lot - in fact it is one of the best books on

the subject that I have met.

Among other things he describes a method for code generation in which

each expression is always calculated into the same temporary

value. The expressions are regarded as pure functions of their

arguments. This has some obvious advantages, but it is still not clear

are to me how expressions with side effects must be handled. I hope of

some of "comp.compilers" readers can clear the matter for me.

Here is a simple example of code generation, before we delve into the

actual problem. Compiling the following piece of C code (all variables

are locals):

a = (b + c) * (b + c)

gives us:

add T1 = b, c

add T1 = b, c

mul T2 = T1, T1

mov a = T2

(I hope the convention for the generated code is clear). The following

temporaries are always used to calculate the expressions:

T1 = b + c

T2 = T1 * T1 = (b+c)*(b+c)

So, this expression is calculated correctly and we can even easily apply a

very simple common sub-expression elimination. What happens however if we

have sub-expressions with side effects? Examine this C code:

int * p = ...;

x = *(p = p + 1) + *(p = p + 1);

The temporary table has these entries:

T1 = p + sizeof(int)

T2 = value(*T1)

T3 = T2 + T2

and the actual code is:

add T1 = p, sizeof(int)

mov p = T1

load T2 = [T1]

add T1 = p, sizeof(int)

mov p = T1

load T2 = [T1]

add T3 = T2,T2

mov x = T3

Of course, later it will be optimized to:

add p = p, 2*sizeof(int)

load T2=[p]

shl x = T2,1

I realize that with regard to ANSI C dynamic semantics this is a correct

result, but it doesn't seem very logical. A similar problem appears when

calling functions:

a = f() + f();

The temporary table is:

T1 = f()

T2 = T1 + T1

And the code will be:

call T1 = f()

call T1 = f()

add T2 = T1, T1

mov a = T2

This is definitely incorrect. It seems that the convention of using

the same temporaries for the same expressions must be honored only if

the expressions do not include any side effects (assignments or

function calls). This will definitely complicate the simple method of

code generation that is recommended in the book and it wasn't

mentioned at all. I assume that there is an error in my

interpretation, but I can't find it. Any help will be appreciated.

regards,

Tzvetan

[Welcome to the wonderful world of Fortran, which defines its

expressions such that it can combine any two subexpressions that are

lexically the same. Fortran functions are supposed to be true

functions, with the results depending only on the arguments. If you

have Fortran procedures with internal state, code them as subroutines.

In languages other than Fortran, you correctly note that it's wrong to

combine expressions that might have different side effects. -John]

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.