Re: Compiling records by treating them as collections of variables

cwilson@Xenon.Stanford.EDU (Christopher S. Wilson)
23 Feb 1996 00:09:45 -0500

          From comp.compilers

Related articles
Compiling records by treating them as collections of variables pk@cs.tut.fi (1996-02-21)
Re: Compiling records by treating them as collections of variables cwilson@Xenon.Stanford.EDU (1996-02-23)
Compiling records by treating them as collections of variables dave@occl-cam.demon.co.uk (Dave Lloyd) (1996-02-23)
| List of all articles for this month |

From: cwilson@Xenon.Stanford.EDU (Christopher S. Wilson)
Newsgroups: comp.compilers
Date: 23 Feb 1996 00:09:45 -0500
Organization: Computer Science Department, Stanford University.
References: 96-02-249
Keywords: code

Kellom{ki Pertti <pk@cs.tut.fi> wrote:
>[in a course project we're compiling a Pascal-like language to an
>idealized machine with an unlimited number of registers. Record
>structure is touch to handle this way]
>
>Does anyone know of a compiler, which would treat records simply as
>syntactic sugar for a collection of variables?


The SUIF IR (http://suif.stanford.edu) has the ability to represent
something related to what you are suggesting.


In SUIF there is a single representation, called a ``var_sym'' for
both ``virtual registers'' and data objects that would normally reside
in memory and can have their addresses taken.


Each var_sym can have a hierarchy of ``sub-variables'' under it; these
are other var_syms that overlay parts of the parent variable at known
offsets. So for the example you gave, you could have


        var_sym a of type foo
        var_sym a_x of type real at offset 0 in a
        var_sym a_i of type integer at offset 32 in a
        var_sym b of type foo
        var_sym b_x of type real at offset 0 in b
        var_sym b_i of type integer at offset 32 in b


Then your ``a.x := 0'' could be represented by a single load constant
of zero with a_x as the result operand and your ``a := b'' could be
represented as a simple copy instruction from one var_sym to another.


This allows you to see the parts of a composite type as individual
var_syms even when there is the possibility that the address of the
whole may be taken or for some other reason the whole may be needed,
without all the mess of loads, stores and offsets. If you do some
analysis and figure out that the whole really needn't be kept
together, you can just throw away the top-level var_sym and leave the
former child var_syms on their own -- the SUIF system has code to do
that analysis and splitting of composites, in fact. There is also
code to get rid of sub-variables, replacing uses of them with loads
and stores with the proper address and offset, so any compiler code
that doesn't want to see sub-variable form (such as a back-end)
doesn't have to.


The cost of this approach is that compiler code either has to be
written with the awareness that variables might overlap, and
appropriate checks for such cases inserted in that code if needed, or
else the conversion out of sub-variable form must be run before that
code is used. My experience has been that writing compiler code with
the awareness of possible sub-variables is not much of a burden.


The motivation for adding this sub-variable system to the SUIF IR was
to deal with Fortran common blocks and equivalences -- there it really
matters. We haven't found much of a need to use sub-variables with C
code coming into SUIF; in the C code I've seen, structures or unions
are much more likely to be used with pointers or arrays and you don't
get that much out of converting them to sub-variable form. In code
that is a performance bottleneck and where there isn't something funny
going on with pointers, C programmers won't use structures.


The mechanism can be used, though, to easily specify relative memory
layouts of independent variables from within the compiler by making
them all sub-variables of a new parent variable. It can also be a
convenient intermediate step in breaking up an array into separate
scalar variables: first look for loads at constant offsets into arrays
and replace all such with uses of new sub-variables; if you find the
original array is no longer used, get rid of it and the sub-variables
become independent scalars. Both of these are uses of the
sub-variable mechanism that are actually used in the SUIF system and
that were added in response to the needs of real programs.


                --Chris Wilson <cwilson@cs.Stanford.EDU>
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.