# On Intel processors (and on some others, I'm sure), why in heaven's
# name would you use the hardware stack for return addresses,
# arguments, and locals? It makes no sense to me. Maybe it is because
# I'm a Forth

As long as you always know the size of any returned value on the
stack, the result stack can be combined with the protocol stack. Forth
allows a function to return a variable number of words on the stack,
which requires keeping the result stack distinct from the protocol
stack. Languages like Ada or Algol68 that allow a variable-sized array
to be returned either use two stacks or return the array on the heap.
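
To make the two-stack point concrete, here is a rough sketch in Python
rather than Forth. The names result_stack, protocol_stack, divisors and
call are made up for illustration (in Forth's own terms the two stacks
are the data stack and the return stack), and the count-on-top
convention is just one assumed way for the caller to learn how many
words came back.

```python
# Hypothetical two-stack machine, for illustration only.
result_stack = []    # arguments and returned words (variable count)
protocol_stack = []  # call linkage only: return addresses

def divisors(n):
    """Push every divisor of n; the caller cannot know the count in advance."""
    count = 0
    for d in range(1, n + 1):
        if n % d == 0:
            result_stack.append(d)
            count += 1
    result_stack.append(count)  # assumed convention: count goes on top

def call(word, return_address, arg):
    """Call `word` with one argument and hand back the saved return address."""
    protocol_stack.append(return_address)  # linkage stays off the result stack
    result_stack.append(arg)
    word(result_stack.pop())
    # The return address is still on top of its own stack, no matter
    # how many words the callee pushed on the result stack.
    return protocol_stack.pop()

resume_at = call(divisors, 42, 12)
count = result_stack.pop()
results = [result_stack.pop() for _ in range(count)][::-1]
```

With a single combined stack, the return address would end up buried
under an unknown number of result words, so the callee would have to
shuffle them or the caller would need to know the size up front.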

# It spends 3 instructions (all of which stall the pipeline) to
# initialize, and 2 (very slow) instructions to de-initialize.
# -fomit-frame-pointer does marginally better:

It's a well-known problem that stacks are more difficult to pipeline
because the stack top becomes a point of resource contention. That's
part of why zero-address machines like the old Burroughs or HP 3000
have given way to two- or three-address machines.
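
A toy model of the contention, again in Python and under loud
assumptions: each instruction is reduced to the resources it reads and
writes, "TOS" lumps the stack pointer and top cells into one resource,
and only read-after-write dependencies are counted. The instruction
sequences for (a+b)*(c+d) and the helper critical_path are invented for
illustration and do not model any real machine.

```python
# (name, resources read, resources written); "TOS" is the stack top.
stack_code = [
    ("push a", {"a", "TOS"}, {"TOS"}),
    ("push b", {"b", "TOS"}, {"TOS"}),
    ("add",    {"TOS"},      {"TOS"}),
    ("push c", {"c", "TOS"}, {"TOS"}),
    ("push d", {"d", "TOS"}, {"TOS"}),
    ("add",    {"TOS"},      {"TOS"}),
    ("mul",    {"TOS"},      {"TOS"}),
]

three_address_code = [
    ("t1 = a + b",  {"a", "b"},   {"t1"}),
    ("t2 = c + d",  {"c", "d"},   {"t2"}),
    ("r = t1 * t2", {"t1", "t2"}, {"r"}),
]

def critical_path(code):
    """Length of the longest chain of read-after-write dependencies."""
    depth_of = {}  # resource -> depth of the instruction that last wrote it
    longest = 0
    for _name, reads, writes in code:
        depth = 1 + max((depth_of.get(r, 0) for r in reads), default=0)
        for w in writes:
            depth_of[w] = depth
        longest = max(longest, depth)
    return longest

stack_depth = critical_path(stack_code)
three_address_depth = critical_path(three_address_code)
```

In this model every stack instruction waits on the one before it, since
they all go through TOS, while the two additions in the three-address
version are independent and could overlap.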

SM Ryan

