Re: Info needed (Sparc C compiler w/o register window)

torek@horse.ee.lbl.gov (Chris Torek)
Wed, 15 Apr 1992 16:58:55 GMT

          From comp.compilers

Related articles
Info needed (Sparc C compiler w/o register window) clim@ICSI.Berkeley.EDU (1992-04-14)
Re: Info needed (Sparc C compiler w/o register window) torek@horse.ee.lbl.gov (1992-04-15)
Re: Info needed (Sparc C compiler w/o register window) pardo@cs.washington.edu (1992-04-16)
Newsgroups: comp.compilers,comp.lang.c,gnu.gcc.help
From: torek@horse.ee.lbl.gov (Chris Torek)
Keywords: registers, sparc
Organization: Lawrence Berkeley Laboratory, Berkeley
References: 92-04-058
Date: Wed, 15 Apr 1992 16:58:55 GMT

>[I'm no fan of register windows, but is it really that hard to handle the
>over- and underflow? -John]


On the SPARCstation, the answer is `yes'. The reason is a bit
complicated, but it basically boils down to the following:


    - window over/under-flow save/restore must be done with traps
        disabled;
    - when traps are disabled, you must not cause a machine fault;
    - Sun's runtime architecture requires that the O/S allow user code
        to select the address of each register window's memory backing
        store.


The first point is, for `save' at least, due to architecture constraints.
(The argument applies to restore as well, but a bit less obviously. If
you want a handwaving description, think about the fact that the windows
are circular and a `restore' acts like an `add' using the current window
as the source and the restored window as the destination.)


If you enable traps, you can get another trap. A trap automatically
decrements the current window pointer (mod nwindows). If all of the
windows are full, this will (obviously) clobber a full window and things
are hopeless.
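
The clobbering can be seen in a small C model of the window pointer. This is only a sketch under stated assumptions: the window count of 8, the choice of window 1 as the reserved trap window, and all of the names (`winstate`, `do_save`, `take_trap`, `nested_trap_demo`) are made up for illustration and are not taken from any kernel.

```c
#include <assert.h>

#define NWINDOWS 8u  /* assumed; SPARC allows implementations with 2 to 32 windows */

struct winstate {
    unsigned cwp;  /* current window pointer, 0..NWINDOWS-1 */
    unsigned wim;  /* window invalid mask, one bit per window */
};

/* The window a `save' or a trap moves into: CWP - 1, mod NWINDOWS. */
unsigned window_below(unsigned cwp)
{
    assert(cwp < NWINDOWS);
    return (cwp + NWINDOWS - 1u) % NWINDOWS;
}

/* `save': traps (returns 1, CWP unchanged) if the target window is invalid. */
int do_save(struct winstate *w)
{
    unsigned target = window_below(w->cwp);

    if (w->wim & (1u << target))
        return 1;               /* window overflow */
    w->cwp = target;
    return 0;
}

/* Trap entry: the hardware decrements CWP without consulting the WIM. */
void take_trap(struct winstate *w)
{
    w->cwp = window_below(w->cwp);
}

/*
 * Fill every usable window, then take two nested traps.
 * Returns the window in which the second trap lands.
 */
unsigned nested_trap_demo(void)
{
    struct winstate w = { 0u, 1u << 1 };  /* window 1 is the reserved trap window */

    while (do_save(&w) == 0)
        continue;       /* windows 7, 6, ..., 2 fill up; CWP ends at 2 */
    take_trap(&w);      /* the overflow trap runs in reserved window 1 */
    take_trap(&w);      /* a second trap, taken with traps enabled */
    return w.cwp;       /* window 0, which still holds the oldest live frame */
}
```

In this model the second trap lands in window 0, the frame that was running before any `save' was done, so its in and local registers would be overwritten by the trap handler.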


Now, you might try to avoid that by using the window invalid mask register
to guarantee *two* trap windows, rather than one. But one hardware
interrupt, the memory error, is nonmaskable, so even enabling traps while
blocking interrupts does not guarantee that a single `extra' trap window
is enough. You could try to get around this by reserving *three* trap
windows, but things are getting way out of hand now.


Obviously, the thing to do is to save the register contents directly in
the trap handler. But there is a problem here. Each window is to be
stored at that window's %sp (%o6), i.e., the in and local registers must
wind up on the user stack at [%sp+0]..[%sp+63]. But what if the user
stack is paged out, or the user has clobbered the stack pointer? In
either case, you could get a memory fault when saving the window. If you
get a fault while traps are disabled, the machine halts. This is not
good.
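
For reference, the 64-byte save area can be described as a C structure. This is a sketch: the ordering (locals first, then ins) follows the usual SPARC ABI convention, which the text above does not spell out, and the names `window_save_area` and `in_reg_offset` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One window's save area, as it sits at that frame's stack pointer. */
struct window_save_area {
    uint32_t local[8];  /* %l0..%l7 at [%sp+0]..[%sp+31]  */
    uint32_t in[8];     /* %i0..%i7 at [%sp+32]..[%sp+63] */
};

_Static_assert(sizeof(struct window_save_area) == 64,
               "16 registers of 4 bytes each");
_Static_assert(offsetof(struct window_save_area, in) == 32,
               "the ins follow the locals");

/* Byte offset from %sp at which in register %iN (N in 0..7) is saved. */
size_t in_reg_offset(unsigned n)
{
    assert(n < 8);
    return offsetof(struct window_save_area, in) + n * sizeof(uint32_t);
}
```

Under this layout, %i6 (the frame pointer) is saved at [%sp+56] and %i7 (the return address) at [%sp+60]. Every one of these 16 stores is a potential fault if the page behind %sp is missing.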


Another approach would be to save the register windows in a kernel-
allocated arena. Since this would be under control of the kernel, you
could arrange in advance for the memory to be present and paged-in. Thus,
you could handle a window overflow fault as:


    - dump one or all windows to kernel arena;
    - now that all windows are clean, call kernel to copy them to user space.


The second step can fault safely.


Unfortunately, this approach will obviously be too slow. The
SPARCstations have write-through caches, not write-back, and it takes
about forever to copy an extra 64 bytes this way. In addition, the
overhead for getting in and out of the kernel is significant.


The approach we use in SPARC BSD is a compromise:


    if (user stack is valid and paged in) {
        write registers to user stack;
        return from trap;
    }
    write registers to dedicated kernel arena;
    call kernel to copy to user stack or kill process,
        as appropriate.


In my kernel, the `if-true' case writes a single window directly to its
ultimate destination. The `if-false' case, which calls the kernel, writes
all the windows, so that the kernel is never faced with `scattered'
windows. (Suppose, e.g., that the user stack in windows 1, 3, and 4 was
valid, but 0 and 2 were paged out and 5 had a bad user stack pointer. If
we wrote window 0 to the dedicated area, then wrote 1 to the user stack,
then 2 to the dedicated area---remember that the kernel can use up
register windows while getting ready to do the copy---things would get
very confusing. So we just say that, if we have to pay the cost of
getting into the kernel, we might as well let the kernel do valid window
regions too. This turns out to be trivial anyway, due to the design of
the rest of the kernel.)
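
The two paths can be modelled in C roughly as follows. This is a toy sketch, not the actual handler, which is written in assembly: the `resident` flag stands in for a real MMU probe, the validity test is reduced to a nonzero, 8-byte-aligned stack pointer on a resident page (a real test would also have to cover a save area that straddles a page boundary), and all names here (`window`, `stack_ok`, `handle_overflow`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum path { FAST_PATH, SLOW_PATH };

struct window {
    uint32_t sp;        /* this window's %sp, the address of its save area */
    int      resident;  /* model input: is the page at sp paged in and writable? */
    int      dirty;     /* window holds a frame not yet written to memory */
};

/* Simplified `user stack is valid and paged in' test (assumed form). */
int stack_ok(const struct window *w)
{
    return w->sp != 0 && (w->sp & 7u) == 0 && w->resident;
}

/*
 * Handle an overflow that must evict window `victim'.
 * On return, *to_arena is the number of windows dumped to the kernel arena.
 */
enum path handle_overflow(struct window win[], int nwin, int victim,
                          int *to_arena)
{
    int i;

    assert(victim >= 0 && victim < nwin);
    *to_arena = 0;
    if (stack_ok(&win[victim])) {
        win[victim].dirty = 0;  /* one window, straight to the user stack */
        return FAST_PATH;
    }
    for (i = 0; i < nwin; i++) {
        if (win[i].dirty) {     /* every live window goes to the arena, */
            win[i].dirty = 0;   /* even those whose stack is valid      */
            (*to_arena)++;
        }
    }
    return SLOW_PATH;           /* kernel then copies out or kills the process */
}
```

With the six windows from the example above (1, 3, and 4 valid, 0 and 2 paged out, 5 with a bad stack pointer), evicting window 1 takes the fast path and touches nothing else, while evicting window 0 sends every remaining live window to the arena in one pass.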


Now, as it happens, the test `user stack is valid and paged in' is fairly
complicated, and (unfortunately) has to appear multiple times in critical
assembly-language kernel sections. For efficiency, some of those copies
should differ from one another rather than share one form. All told, the
system winds up spending around 100 cycles doing a single save, and that
in the *best* case.


Note that, if the kernel did not use register windows, and if the hardware
did not force the use of them, we could code the kernel without windows
and possibly avoid the mess. It might still be possible to `undo' the
hardware's use of register windows in the trap handlers and thus
`simulate' this; it might even be worthwhile. This is a topic for later
study (right now we need to get the port done!).
--
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 510 486 5427)
Berkeley, CA Domain: torek@ee.lbl.gov

