Re: Optimizing structure layout (Fergus Henderson)
22 Mar 1997 23:31:23 -0500

          From comp.compilers

Related articles
Optimizing structure layout (1997-03-21)
Re: Optimizing structure layout (Julian Dolby) (1997-03-22)
Re: Optimizing structure layout (1997-03-22)
Re: Optimizing structure layout (Christopher Glaeser) (1997-03-22)
Re: Optimizing structure layout (Michael Meissner) (1997-03-27)
Re: Optimizing structure layout (David L Moore) (1997-03-27)
Re: Optimizing structure layout (1997-03-31)
Re: Optimizing structure layout (John Pieper) (1997-03-31)
Re: Optimizing structure layout (Valentin Bonnard) (1997-03-31)
| List of all articles for this month |

From: (Fergus Henderson)
Newsgroups: comp.compilers
Date: 22 Mar 1997 23:31:23 -0500
Organization: Comp Sci, University of Melbourne
References: 97-03-130
Keywords: optimize, storage writes:

>While it is quite common that compilers optimize the code they
>produce, I haven't heard of a commonly used system that really
>optimizes the layout of the data structures that are generated. Are
>there such systems?

The Mercury compiler optimizes data structures in several ways.

The basic means of defining data type in Mercury is with discriminated
unions. For example, a declaration such as

:- type foo
---> bar
; baz(string)
; quux(foo, int).

defines a type with three alternatives. The "unoptimized"
representation of this data structure would be as a pointer to a
(heap-allocated) cell whose first word is a "tag" that indicates which
of the three alternatives this is. The remaining words (if any) hold
the fields for this alternative.

The data structure optimizations that the Mercury compiler uses
are the following:

- if there is only one alternative, then we don't need to
store a tag

- for enumerations, i.e. discriminated unions with no fields,
we just use the tag directly, rather than using a pointer
to a heap-allocated cell containing the tag.

- We keep all our data structures word-aligned, and use the
bottom two or three bits on pointers to hold tags.
If there are more tags than will fit in two or three
bits, we split the tag into a two or three bit primary
tag and a larger secondary tag. We use one primary tag value
for all the alternatives with no fields; for these alternatives,
the remainder of the words is a secondary tag use to distinguish
between them. We allocate as many as possible of the alternatives
a primary tag of their own (here profiling feedback might be useful,
since ideally the most-used alternatives should be allocated
their own primary tags). The remaining alternatives are allocated
the last primary tag, and are distinguished by an indirect
secondary tag stored as the first word of the heap-allocated

- for types with exactly one alternative which has exactly
one field, we just access the field directly, rather than
using a pointer to a heap-allocated cell containing the

The third optimization listed above (using tagged pointers) improved
performance by about 20%.

For a more detailed explanation, see [1].

[1] The execution algorithm of Mercury: an efficient purely
              declarative logic programming language.
              Zoltan Somogyi, Fergus Henderson and Thomas Conway.
              Journal of Logic Programming, volume 29, number 1-3,
              October-December 1996, pages 17-64. Available via

Fergus Henderson <>
WWW: <>
PGP: finger fjh@

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.