Scheduling for RISCs: more on List-scheduling  firstname.lastname@example.org.EDU (Mayan Moudgill) (1993-05-19)
Re: List scheduling and register allocation  email@example.com (1993-05-22)

From: firstname.lastname@example.org (Sanjay Krishnamurthy)
Date: Sat, 22 May 1993 20:39:27 GMT
Mayan (email@example.com) writes:
>On most architectures, the compiler has to satisfy 3 constraints when scheduling:
> -- Registers
> -- Functional Units/Issue Latency
> -- Issue Slots
>On a typical RISC architecture with 32 registers, and single cycle (or
>low) latency functional units, the main constraint becomes the issue slot.
Every compiler I have used has had pretty much the same strategy for dealing
with a finite set of registers/functional units/issue slots DURING THE
SCHEDULING STAGE: maximal instruction-level parallelism is exposed, and this
is then mapped onto the finite resource set. In this scenario, dealing with
issue slots is the simplest problem. It is balancing memory and non-memory
operations, together with dealing with registers, that contributes most to
the pain factor.
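For concreteness, the basic list-scheduling loop this thread keeps coming back to can be sketched in a few lines of Python. This is my own toy version, not any particular compiler's: priorities are critical-path heights, and the only resource modeled is the issue width.

```python
import heapq

def list_schedule(ops, deps, latency, issue_width):
    """Greedy cycle-by-cycle list scheduling (toy sketch).
    ops: iterable of op ids; deps: {op: set of predecessor ops};
    latency: {op: cycles until the result is usable}.
    Returns {op: issue cycle}."""
    succs = {o: [] for o in ops}
    preds_left = {o: 0 for o in ops}
    for o in ops:
        for p in deps.get(o, ()):
            succs[p].append(o)
            preds_left[o] += 1

    # Priority: critical-path height, the classic list-scheduling heuristic.
    height = {}
    def h(o):
        if o not in height:
            height[o] = latency[o] + max((h(s) for s in succs[o]), default=0)
        return height[o]

    ready = [(-h(o), o) for o in ops if preds_left[o] == 0]
    heapq.heapify(ready)
    schedule, in_flight, cycle = {}, [], 0
    while ready or in_flight:
        # Ops whose latency has elapsed wake up their successors.
        for fin, o in [x for x in in_flight if x[0] <= cycle]:
            in_flight.remove((fin, o))
            for s in succs[o]:
                preds_left[s] -= 1
                if preds_left[s] == 0:
                    heapq.heappush(ready, (-h(s), s))
        # Fill at most issue_width slots this cycle.
        for _ in range(issue_width):
            if not ready:
                break
            _, o = heapq.heappop(ready)
            schedule[o] = cycle
            in_flight.append((cycle + latency[o], o))
        cycle += 1
    return schedule
```

The point of the sketch is how little of it is about issue slots: the inner `range(issue_width)` loop is the only place they appear, while everything else is dependence and latency bookkeeping.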
>I would be _extremely_ interested to find any pointers to scheduling papers
>in the period 79-92 inclusive that do NOT use list scheduling.
I am aware of some work related to "operation scheduling" that differs
from list scheduling. In this scheme, the highest-priority operation is
selected, a slot is found where all resource constraints are satisfied,
and the operation is scheduled in that slot. For references to operation
scheduling, see past issues of the MICRO conference proceedings; look for
Chris Papachristou's work in this area.
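A toy version of that operation-driven scheme might look like the following (names and interfaces are mine, not Papachristou's). The contrast with list scheduling is that we pick one operation at a time in priority order and scan forward for a feasible slot, instead of filling one cycle at a time:

```python
def operation_schedule(ops_by_priority, deps, latency, issue_width):
    """Operation-driven scheduling (toy sketch): take the highest-priority
    op and scan forward for the first cycle where its operands are ready
    AND an issue slot is free. ops_by_priority must be a topological
    order (predecessors before successors)."""
    schedule = {}      # op -> issue cycle
    slots_used = {}    # cycle -> number of ops already issued that cycle
    for o in ops_by_priority:
        # Operands must have completed before o can issue.
        earliest = max((schedule[p] + latency[p] for p in deps.get(o, ())),
                       default=0)
        c = earliest
        while slots_used.get(c, 0) >= issue_width:   # resource check
            c += 1
        schedule[o] = c
        slots_used[c] = slots_used.get(c, 0) + 1
    return schedule
```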
BTW, by the "Cydra compiler's scheduler", do you mean "modulo
scheduling"? How is it different from list scheduling, where the main goal
is "schedule ASAP"? Modulo scheduling does precisely that; additional
constraints are imposed because modulo resource constraints have to be
satisfied, but the goal is still the same.
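That point can be made concrete with a sketch of my own (not the Cydra scheduler): the only change from the ASAP placement above is that resources are reserved modulo the initiation interval, so resource use repeats every ii cycles across loop iterations. Loop-carried dependences are omitted for brevity.

```python
def modulo_schedule(ops_by_priority, deps, latency, issue_width, ii):
    """Modulo-scheduling flavour of 'schedule ASAP' (toy sketch): each op
    still goes in the earliest feasible cycle, but an op at cycle c
    occupies slot (c % ii) of a modulo reservation table. Returns None if
    no placement fits within one ii of the earliest start, in which case
    the caller would retry with a larger ii. Loop-carried dependences
    are omitted for brevity."""
    schedule, mrt = {}, [0] * ii   # mrt[s] = ops issued in modulo slot s
    for o in ops_by_priority:
        earliest = max((schedule[p] + latency[p] for p in deps.get(o, ())),
                       default=0)
        for c in range(earliest, earliest + ii):  # only ii distinct slots
            if mrt[c % ii] < issue_width:
                schedule[o] = c
                mrt[c % ii] += 1
                break
        else:
            return None   # modulo resource conflict: increase ii and retry
    return schedule
```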
>The best approach to basic-block scheduling is to optimize
>for register pressure, minimizing spills, and then schedule operations so
>that every issue cycle contains at least one operation.
Summarizing the work by Bradlee/Eggers/Henry in ASPLOS-IV, there are two
approaches to integrated scheduling/allocation:
 IPS (Integrated Prepass Scheduling): the scheduler is made aware of
register constraints. When there is no register pressure, it schedules
mainly to remove interlocks; but when the number of available registers
drops below a threshold, it schedules to free as many registers as possible.
 RASE: The graph-coloring allocator is made aware of the impact
allocation has on scheduling. The scheduler passes it information about
the schedule length as a function of the number of registers for each
basic block. The allocator uses this during global allocation (dummy
nodes are inserted into the interference graph and, ummm, and... hell,
read the paper).
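The IPS idea above boils down to swapping the selection heuristic when pressure gets high. A minimal sketch of such a selection policy, assuming hypothetical per-op bookkeeping (this is mine, not Bradlee et al.'s code):

```python
def ips_pick(ready, free_regs, threshold, height, frees_count):
    """IPS-style selection (toy sketch). With registers to spare, pick by
    critical-path height to hide interlocks; below the threshold, prefer
    the ready op that frees the most registers (i.e. whose operands are
    last uses of live values). height and frees_count are hypothetical
    per-op tables maintained by the scheduler."""
    if free_regs >= threshold:
        return max(ready, key=lambda o: height[o])
    return max(ready, key=lambda o: frees_count[o])
```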
Their conclusion was that RASE (I think it stands for Register Allocation
with Schedule Estimates) was NOT worth the expense of gathering this extra
information. Both these approaches lack the power to identify the
important sections of code, so that all available registers could be used
in those sections while less important sections are penalized for it. The
integrated scheduler in the Multiflow compiler (Ruttenberg/Freudenberger,
Workshop on Compilers, Berlin) has this property, but lacks the power to
identify good spill candidates and spill positions when registers are
unavailable. I am a firm believer in register availability driving all
forms of code motion in the backend, especially global scheduling schemes.
RISC Compilers, Cray Research