Re: Number of compiler passes

George Neuner <gneuner2@comcast.net>
Mon, 28 Jul 2008 15:12:03 -0400

From comp.compilers


From:	George Neuner <gneuner2@comcast.net>
Newsgroups:	comp.compilers
Date:	Mon, 28 Jul 2008 15:12:03 -0400
Organization:	Compilers Central
References:	08-07-041 08-07-044 08-07-048 08-07-058 08-07-059
Keywords:	symbols, design
Posted-Date:	29 Jul 2008 19:21:05 EDT

On Sat, 26 Jul 2008 13:34:31 +0200, Michiel <m.helvensteijn@gmail.com>
wrote:

>George Neuner wrote:
>
>>>In the source language, a function can use variables from outside its
>>>scope. More importantly, these variables could be declared after the
>>>function is. This is okay as long as this function is not called
>>>before the declaration of these variables.
>>
>> As long as non-local variables are transitively in the lexical scope
>> chain of the function that uses them you can still gather all the
>> declarations in one pass. If they're not, you have a language that
>> will be error prone and confusing to use.
>
>They are.
>
>I think I understand. You are proposing a breadth first traversal? We are
>using the visitor pattern right now, which lends itself best to a depth
>first traversal.

No, depth first is fine (and I think, preferred). As long as the
compiler can tell a declaration from other expressions in the AST, the
actual traversal order doesn't matter.
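To illustrate, here is a minimal Python sketch of gathering declarations in one depth-first walk. The tuple-based AST, the node kinds ("block", "decl", "use") and the helper names are all hypothetical, made up for this example and not taken from anyone's compiler. Uses are only recorded during the walk and checked against the scope chain afterwards, so a declaration that appears after the function that uses it is still found.

```python
# Hypothetical mini-AST: ("block", [children]), ("decl", name), ("use", name).
# Node kinds and helper names are assumptions for illustration only.

class Scope:
    def __init__(self, parent=None):
        self.parent = parent
        self.names = set()

    def lookup(self, name):
        # Walk outward along the lexical scope chain.
        scope = self
        while scope is not None:
            if name in scope.names:
                return True
            scope = scope.parent
        return False


def gather(node, scope, pending):
    """Depth-first walk: record declarations, defer the checking of uses."""
    kind = node[0]
    if kind == "decl":
        scope.names.add(node[1])
    elif kind == "use":
        pending.append((node[1], scope))  # checked after the walk finishes
    elif kind == "block":
        inner = Scope(scope)
        for child in node[1]:
            gather(child, inner, pending)


def unresolved(ast):
    """Return the names that are used but declared nowhere in their scope chain."""
    pending = []
    gather(ast, Scope(), pending)
    return [name for name, scope in pending if not scope.lookup(name)]


program = ("block", [
    ("block", [("use", "x"), ("use", "y")]),  # body uses x before x is declared
    ("decl", "x"),                            # declared later, in the enclosing scope
])
print(unresolved(program))  # ['y']
```

Only "y" is reported, because "x" is found in the enclosing scope even though its declaration comes later in the source. Checking that a function is not called before such a declaration is a separate question that this sketch does not attempt.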

>>>... An expression has a
>>>type (bool, int, etc.). But to me it also has an access type. This is
>>>either readonly, writeonly or readwrite. It basically specifies whether it
>>>is an r-value, an l-value or both.
>>
>> I believe you are overthinking the problem.
>>
>> L-value or R-value is a matter of usage, not type.
>
>I've seen the terms used in both contexts. As in: A variable is an l-value.
>A constant is an r-value. I believe the Dragon book does this.

I've also seen this usage, but IMO it can be confusing and I don't
like it. The problem is the abstraction level (see below) and the
fact that the same variable can be an L-value in one expression and an
R-value in another, or both in the same expression.

The practical usage: L-value = result/target, R-value = operand, has
some meaning.

>> Obviously it
>> matters whether a particular variable is in an operand position or in
>> a result position, but that depends on the class of the expression -
>> unary, binary, ternary, etc. - and not on the operand or result types
>> or the operators involved.
>
>Maybe I should clarify. This information decides whether an expression CAN
>be used as an l/r-value or not.
>
>If it cannot be an l-value, it is a readonly expression. You will see this
>often with a simple arithmetic operation or with a constant or literal.
>
>If it cannot be an r-value, it is a writeonly expression. This is seen
>less often, but can occur for formal parameters of the 'out' direction. (As
>opposed to 'in' or 'inout'.) Or for properties that have a setter method
>but no getter method.
>
>If it can be both, it is a readwrite expression. The access-type of your
>basic variable. But also of some other expressions, like properties and
>array-subscriptings.

Ok, we have different definitions of expression. I consider an
expression to compute something or cause a side effect like altering
control flow. To me, a variable reference like "A[X]" is a
subexpression that can't stand on its own.

If you think about how to express the high level code in 3-address
form you'll see what I'm saying. The high level C expression "i++"
seems to have a read/write variable reference, but once it is expanded
into 3-address code the read/modify/write nature of the operation
becomes clear. Written as "i := i + 1" or "i := inc i", you can see
that each reference to the variable is either a read or a write, but
never both.
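A small Python sketch of that expansion, assuming a made-up instruction record (Instr) and helper names that exist only for this example: the single source-level reference in "i++" becomes two references in the 3-address form, one in an operand position and one in the result position.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    """Hypothetical 3-address instruction: target := operands combined by op."""
    target: str       # result position (written)
    op: str
    operands: tuple   # operand positions (read)


def lower_post_increment(var):
    """Expand `var++` (value unused) into: var := var + 1."""
    return [Instr(var, "+", (var, "1"))]


def references(code):
    """List every variable reference together with the one access it performs."""
    refs = []
    for ins in code:
        for operand in ins.operands:
            if not operand.isdigit():        # skip literal constants
                refs.append((operand, "read"))
        refs.append((ins.target, "write"))
    return refs


print(references(lower_post_increment("i")))  # [('i', 'read'), ('i', 'write')]
```

The sketch ignores the value that "i++" yields as an expression; keeping it would only add a temporary, and every reference would still be a read or a write.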

>This information is stored in the symbol table and is used to check for
>referencing errors. It may also help for some optimizations later.

I see now what you're getting at - it's your terminology that's
throwing me. You want to know whether the variable is ever written to
or only read, but that's different than _being_ an L-value or R-value.

L-value and R-value at the expression level are not synonymous with
"write" and "read" at the symbol level. Take, for example, a
write-only function parameter passed by reference. How do you assign
to that value?

pseudo Ada:

        procedure f( A: out var array of integer ) is
            var i: integer;
        begin
            for i := 0 to 9 do
                A[i] := i;
            end for;
        end f;

At the symbol level, "A" is write-only because it is an "out"
parameter - an attempt to read from it will result in an error. But
it is also a "var" (ie. reference) parameter, so accessing it or
indexing from it requires some computation. Leaving out some
gyrations that Ada for-loops require, this code translates into
(simple) 3-address as something like

                t1 := A
                i := 0
  :loop         t2 := i > 9
                if t2 goto end
                *t1 := i
                t1 := t1 + sizeof(integer)
                i := i + 1
                goto loop
  :end

As you can clearly see, the "write-only" variable "A" is used only in
an operand position, ie. as an R-value. It is never used (directly)
as an L-value.
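The same observation can be checked mechanically. The Python sketch below encodes the 3-address code above by hand as (target, operands) pairs and collects the positions in which each name appears. The encoding is an assumption made for this example; in particular the store "*t1 := i" is treated as reading both t1 and i, since t1 itself is not assigned.

```python
# Each entry: (variable assigned or None, variables read).
# Hand-written encoding of the loop above; labels and constants are omitted.
code = [
    ("t1", ["A"]),         # t1 := A
    ("i",  []),            # i := 0
    ("t2", ["i"]),         # t2 := i > 9
    (None, ["t2"]),        # if t2 goto end
    (None, ["t1", "i"]),   # *t1 := i   (store through t1; t1 is only read)
    ("t1", ["t1"]),        # t1 := t1 + sizeof(integer)
    ("i",  ["i"]),         # i := i + 1
    (None, []),            # goto loop
]


def positions(code):
    """Map each name to the set of positions (L-value, R-value) it occurs in."""
    usage = {}
    for target, operands in code:
        if target is not None:
            usage.setdefault(target, set()).add("L-value")
        for name in operands:
            usage.setdefault(name, set()).add("R-value")
    return usage


usage = positions(code)
print(sorted(usage["A"]))  # ['R-value']
print(sorted(usage["i"]))  # ['L-value', 'R-value']
```

"A" shows up only as an operand, while "i" and "t1" appear in both positions, each in separate references.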

George
