Re: Pros and cons of high-level intermediate languages

boehm@parc.xerox.com (Hans Boehm)
Thu, 30 Jul 1992 01:02:05 GMT

          From comp.compilers

Newsgroups: comp.compilers
From: boehm@parc.xerox.com (Hans Boehm)
Organization: Xerox PARC
Date: Thu, 30 Jul 1992 01:02:05 GMT
References: 92-07-064 92-07-089
Keywords: translator, design

graham@maths.su.oz.au (Graham Matthews) writes:


>(Hans Boehm) writes:
>>But the C standard does not guarantee that C compiler
>>optimizations are safe in the presence of such a collector.


>I am not sure I understand what you are talking about here Hans. As far
>as I can see if the C code you produce is not safe in the presence of
>garbage collection then you are generating incorrect C code. There is
>nothing in C that makes it "garbage collection unsafe" even with a
>non-conservative garbage collection.


Here's a quick example of the problem. Assume a conservative garbage
collector that finds automatic variables by scanning the stack a word at a
time. (Also assume that registers are found somehow, usually either by
using _setjmp, or with assembly code.) Assume further that I write a loop
traversing a list made up of nodes:


struct a {
    char junk[40000];
    struct a * next;
};


Assume further that this loop is the last access to any of the nodes in
the list, and that x is the last pointer to anywhere in the list.
Somewhere in the loop is the statement


x = x -> next;


On machines like the RS/6000 that allow 16-bit signed displacements in a
load instruction, this is likely to be compiled as:


x += 65536; (An "add immediate upper" instruction)
x = x[-25536]; (This is intended as a byte displacement; not correct C)


If a garbage collection occurs between the two instructions, I'm sunk.
(This can happen in the presence of signals or preemptive threads. If I
have neither, pretend there was an allocation call before this statement,
and the compiler moved the first statement to before the function call.
If x is a local register variable, this is otherwise legitimate. There
are other examples where it's easy to insert a function call at the
inopportune point.)
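
To make the arithmetic concrete, here is a small C sketch (not part of the original post) of what the register holds between the two instructions. The helper names points_into, remaining_displacement and visible_between_instructions are made up for illustration, and the containment test only stands in for whatever a real conservative collector does to decide that a word points into an object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct a {
    char junk[40000];
    struct a *next;
};

/* Hypothetical conservative test: does the word `addr` point into the
   object of `size` bytes that starts at `base`? */
static int points_into(uintptr_t addr, uintptr_t base, size_t size)
{
    return addr >= base && addr - base < size;
}

/* Byte displacement left for the load once 65536 has been added to x. */
static long remaining_displacement(void)
{
    return (long)offsetof(struct a, next) - 65536L;
}

/* Returns 1 if a collector running between the two instructions would
   still recognize the register contents as a pointer into the node. */
static int visible_between_instructions(const struct a *node)
{
    assert(node != NULL);
    uintptr_t base = (uintptr_t)node;
    uintptr_t disguised = base + 65536u;  /* value after the "add immediate upper" */
    return points_into(disguised, base, sizeof *node);
}
```

Where offsetof(struct a, next) is 40000, remaining_displacement() is -25536, matching the displacement in the compiled sequence above. visible_between_instructions() returns 0, because the intermediate value lies 65536 bytes past the start of an object that is only slightly larger than 40000 bytes.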


Note that this is easily remedied by using a different register for the
intermediate value.


The problem is that the C compiler is allowed to disguise local pointer
variables any way it wants to, in the interest of optimization. As far as
it's concerned, the garbage collector is not legitimate, since it
examines its callers' stack frames.


There are ways to avoid this by explicitly recording values of local
variables to be traced by the collector someplace else or by forcing the
variables themselves into memory. (Edelson and others have done work
along these lines.) But on modern machines, inhibiting register
allocation of local variables is a major performance problem.
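
As a rough illustration of "explicitly recording values of local variables", the following C sketch registers the address of a local pointer in a side table that a collector could scan. This is only one possible shape and is not claimed to be what Edelson or anyone else actually did; root_slots, push_root, pop_root and list_length are hypothetical names, and no collector is included.

```c
#include <assert.h>
#include <stddef.h>

struct a {
    char junk[40000];
    struct a *next;
};

#define MAX_ROOTS 64

/* Hypothetical root table: addresses of local pointer variables that a
   collector would scan instead of guessing at stack and registers. */
static void *root_slots[MAX_ROOTS];
static size_t root_count = 0;

static void push_root(void *slot)
{
    assert(root_count < MAX_ROOTS);
    root_slots[root_count++] = slot;
}

static void pop_root(void)
{
    assert(root_count > 0);
    root_count--;
}

/* Walk the list with x registered as a root for the duration. */
static size_t list_length(struct a *head)
{
    struct a *x = head;
    size_t n = 0;

    /* Because &x escapes, the compiler has to keep x's value in its
       memory slot across any call that might read it. */
    push_root(&x);
    while (x != NULL) {
        n++;
        x = x->next;
    }
    pop_root();
    return n;
}
```

The extra table traffic, plus the fact that x can no longer live purely in a register across calls, is the kind of cost the paragraph above is concerned with.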


The problem can be fixed in current compilers of which I am aware by
judicious insertion of code to keep variables live sufficiently long. If
LIVE(x) forces x to be live (in some cheap, probably compiler-dependent
way), then


x = x -> next;


should be rewritten as (automatically, one hopes)


x = x -> next; LIVE(x);


The reader can verify that this indeed works for the above example, under
some reasonable assumptions about the compiler. It does not suffice for
an arbitrary ANSI compiler. (Again, more details by email on request.)
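
The post does not say how LIVE is spelled. As one compiler-dependent possibility, the sketch below assumes GCC-style extended asm: an empty asm statement that lists x as a register input, so the compiler must have the real value of x available at that point. The volatile-store fallback for other compilers and the function count_nodes are likewise assumptions made for illustration, and nothing here is guaranteed by the C standard.

```c
#include <stddef.h>

struct a {
    char junk[40000];
    struct a *next;
};

#if defined(__GNUC__)
/* Assumed GCC/Clang spelling: an empty asm that reads x from a register. */
#define LIVE(x) __asm__ volatile("" : : "r"(x))
#else
/* Assumed fallback: a volatile store of x; works but is not cheap. */
static const void *volatile live_sink;
#define LIVE(x) ((void)(live_sink = (const void *)(x)))
#endif

/* Traverse the list, keeping x live after each step as in the post. */
static int count_nodes(struct a *x)
{
    int n = 0;
    while (x != NULL) {
        n++;
        x = x->next; LIVE(x);
    }
    return n;
}
```

LIVE generates no instructions of its own in the asm variant; it only constrains what the compiler may assume about x at that point.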


Hans-J. Boehm
(boehm@parc.xerox.com)
--

