Re: Cray-2 Fast Memory (James Davies)
Wed, 26 May 1993 20:48:30 GMT

          From comp.compilers

Newsgroups: comp.compilers,comp.sys.super
From: (James Davies)
Keywords: registers, optimize, Cray
Organization: Cray Computer Corporation
References: 93-05-062 93-05-127
Date: Wed, 26 May 1993 20:48:30 GMT

Patrick Delano writes:
> Apparently the Cray-2 had a fast memory that unlike cache memory was
> explicitly managed by the compiler.

This is also true of the Cray-3; each has 16K words of local memory per
processor. This memory is a bit less flexible than common memory, in
that vector loads and stores must be stride-1.

David desJardins writes:
>Basically, no software techniques were used. The compiler does very
>little to take advantage of the local memory. As far as I am aware, the
>only ways in which it is used are the following:
>  o  Temporary storage for register spillage.
>  o  As a means of extracting scalar values from vector registers
>     (which can be done directly on the X-MP and Y-MP).
>  o  When the programmer, by directive, explicitly indicates that a
>     variable is to be placed in local rather than common memory.

All true, but it's also used for subroutine linkage information, such as
return addresses and stack pointers (the stack itself is in common
memory). Each routine is allocated a static area for this purpose, which
must be saved and restored for potentially recursive calls.
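
To make the save/restore point concrete, here is a minimal sketch in Python. It is a toy model, not the actual Cray calling sequence: the `Machine` class, the `linkage` dictionary and the field names are all made up for illustration, and it detects re-entry at run time, whereas a compiler would have to decide statically which calls are potentially recursive.

```python
class Machine:
    """Toy model: one static linkage area per routine, stack in common memory."""

    def __init__(self):
        self.linkage = {}       # routine name -> its static area (local memory)
        self.common_stack = []  # the stack itself lives in common memory
        self.saves = 0          # how many times an area had to be saved

    def call(self, name, return_to, body, *args):
        live = self.linkage.get(name)
        if live is not None:
            # The routine is already active, so its single static area is
            # in use: save the old contents on the common-memory stack.
            self.common_stack.append(live)
            self.saves += 1
        self.linkage[name] = {"return_to": return_to,
                              "stack_ptr": len(self.common_stack)}
        try:
            return body(self, *args)
        finally:
            if live is not None:
                # Restore the caller's copy of the area on the way out.
                self.linkage[name] = self.common_stack.pop()
            else:
                del self.linkage[name]


def factorial(machine, n):
    if n <= 1:
        return 1
    return n * machine.call("factorial", "factorial", factorial, n - 1)


m = Machine()
result = m.call("factorial", "main", factorial, 5)
print(result, m.saves)
```

The outermost call finds the area free and saves nothing; each nested call has to spill the previous contents first, which is the extra cost a recursive routine pays for having a static area.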

Even with this limited usage, local memory tends to be in short supply, as
there are only 16K words available per processor. The linker attempts to
overlay local-memory blocks when possible, but there is still a need for a
compiler option that minimizes local memory usage, e.g. by spilling
registers to common memory.
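
For readers who have not seen this kind of overlaying, the idea can be sketched as follows. This is a generic call-graph scheme, not a description of the actual linker's algorithm; the routine names, block sizes and the `overlay_offsets` helper are invented for the example, and the call graph is assumed to be acyclic. Two routines may share local-memory words as long as they can never be active at the same time, i.e. neither one is on a call chain leading to the other.

```python
def overlay_offsets(sizes, calls, root):
    """Give each routine's static block a local-memory offset.

    sizes: routine name -> block size in words
    calls: routine name -> list of routines it may call (must be acyclic)
    A block is placed just above the blocks of every call chain that can
    reach it, so blocks of simultaneously active routines never overlap.
    """
    offsets = {}

    def place(name, base):
        if offsets.get(name, -1) >= base:
            return  # already placed at least this high
        offsets[name] = base
        for callee in calls.get(name, ()):
            place(callee, base + sizes[name])

    place(root, 0)
    return offsets


# Hypothetical program: sizes are in words.
sizes = {"main": 100, "solve": 300, "report": 200, "dot": 50}
calls = {"main": ["solve", "report"], "solve": ["dot"], "report": ["dot"]}

offsets = overlay_offsets(sizes, calls, "main")
overlaid_words = max(offsets[r] + sizes[r] for r in offsets)
flat_words = sum(sizes.values())
print(offsets, overlaid_words, flat_words)
```

Here `solve` and `report` are never active together, so both start at offset 100, and `dot` sits above the larger of the two. The overlaid layout needs 450 words against 650 for one block after another. A potentially recursive routine breaks the acyclic assumption, which is where the save/restore of its static area comes in.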

>I believe that the primary reason that more sophisticated techniques
>were not used by the compiler is that less than 40 Cray-2 machines were
>manufactured and sold, compared to hundreds of X-MP and Y-MP type

Partly, but the compilers for the X's and Y's don't do any of the loop
transformations necessary to do memory management either. Basically they
do inner-loop vectorization and leave the fancy multi-loop optimizations
to a preprocessor. The incentive to use local memory (or to license some
third-party product like KAP to use it) would be greater if there were
more available, but it's hard to justify when you're already squeezed for
