From: preston@tera.com (Preston Briggs)
Newsgroups: comp.compilers,comp.arch
Date: 16 Apr 1996 22:19:58 -0400
Organization: /etc/organization
References: 96-04-059 96-04-068 96-04-083
Keywords: architecture
>hmmm...could you explain why you think the Culler-7 was a
>special-purpose machine?
Just rumors I've heard about the machine. Didn't mean to spread
misinformation. Thanks for talking a bit about it.
>I'm afraid I'm not grasping the connection between microcode and
>I-caches that you're suggesting (at least in this context); could you
>expand on that?
This is just some of the old-time religion behind RISC. In the old
days, the CISC machines had a microcode interpreter that ran
microinstructions stored in the microstore. A single
microinstruction, fetched from the fast microstore, could be executed
much more quickly than a machine instruction could be simulated (the
microinstructions were also wide, so the many little functional units
could be controlled at once).
Seeing the difference between microcode execution speed and the speed
at which the machine's "native" instruction set was emulated, people
started making use of programmable microcode. They built their own
special-purpose machines with custom instruction sets (e.g., the
Dorado) or compiled to microcode directly (e.g., Patterson's early
work and apparently the Culler).
When the 801 came along, they said: no more microcode! Instead, have
an instruction cache and make the basic machine cycle as fast as the
old microcode engines. If you want to do something complex (like
division), call a small subroutine (or expand the routine inline).
If it's used often enough, the subroutine will probably live in the
cache, and it'll take about the same number of cycles as the old
microcode version of division did.
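To make the "small subroutine" idea concrete, here is a minimal sketch of
division written as an ordinary routine built from simple operations
(shift, compare, subtract). It is a textbook restoring shift-and-subtract
loop, not the actual routine from the 801 or any other machine; the name
udiv32 and the 32-bit width are assumptions for illustration.

```python
def udiv32(dividend, divisor):
    """Restoring shift-and-subtract division on unsigned 32-bit values.

    Hypothetical illustration: each loop iteration uses only the kind of
    simple operations a hardwired RISC cycle would provide.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for bit in range(31, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        # If the divisor fits, subtract it and set this quotient bit.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder
```

A microcoded divide would step through much the same loop out of the
microstore; here the loop is plain instructions that can sit in the
instruction cache.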
As a bonus, you expose the guts of these complex instructions to the
optimizer, making new opportunities for common subexpression
elimination, constant folding, etc.
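As a rough sketch of what "exposing the guts" buys, take a multiply done
as a shift-and-add routine. Once the routine is visible to the optimizer
and one operand is a known constant, the loop can be folded down to a
couple of shifts and an add. Both function names are hypothetical, and
the second function is written by hand to show what an optimizer could
plausibly leave behind, not the output of any particular compiler.

```python
def mul_shift_add(x, m):
    """Generic multiply 'subroutine': one conditional add per bit of m."""
    acc = 0
    shift = 0
    while m:
        if m & 1:
            acc += x << shift  # add x * 2**shift when this bit of m is set
        m >>= 1
        shift += 1
    return acc


def mul_by_10(x):
    """Hand-folded form for the constant m == 10 (binary 1010).

    The loop tests on m disappear because m is known; only the adds for
    bits 1 and 3 remain.
    """
    return (x << 3) + (x << 1)
```

With the multiply buried in a microcoded instruction, the optimizer sees
only an opaque operation and has nothing to fold.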
And that's the whole deal. With instruction caches, you can make the
hardwired instruction cycle as fast as microcode. There's no reason
to have a separate level of instruction interpretation.
Preston Briggs