Re: thread static

pardo@cs.washington.edu (David Keppel)
Mon, 21 Aug 1995 23:22:18 GMT

          From comp.compilers


Newsgroups: comp.compilers
From: pardo@cs.washington.edu (David Keppel)
Keywords: parallel, C
Organization: Computer Science & Engineering, U. of Washington, Seattle
References: 95-08-078 95-08-143
Date: Mon, 21 Aug 1995 23:22:18 GMT

>Michael McNamara <mac@verilog.com, mac@giraffe.asd.sgi.com> wrote:
>>[Dedicating a register to point to the thread hurts register allocation.]


Stefan Monnier <stefan.monnier@epfl.ch> writes:
>Registers are not *that* scarce !


A huge fraction of the machines people use every day are 80x86-family
machines, which have maybe 8 general-purpose 32-bit registers. With
%ESP used as a stack pointer, you're down to seven; you need about two
registers for short-lived temporaries, leaving you with five; some
operations always place their result in a particular register,
complicating register allocation further. Bottom line: registers can
be scarce.


If you can build your system so that every stack sits on the same
fixed power-of-two alignment (and no stack is larger than that
alignment), then you can generate a mask such that
alignment = 1<<n, mask = ~((1<<n) - 1). Then, address `addr = sp & mask'
(where sp==stack pointer register) is always the zeroth word of the
stack (gyrate appropriately if you're worried about stack overflow and
your stacks grow down) and you can store thread context information
starting at address addr. This technique is used at least in Jeff
Chase's `Amber' system and the Sequent Symmetry implementation of
Scheduler Activations. For details see UW CSE tech reports under


http://www.cs.washington.edu/


and/or my `QuickThreads' threads package toolkit under


http://www.cs.washington.edu/homes/pardo/papers.d/thread.html


Using aligned stacks is, unfortunately, less flexible than using
arbitrary stacks, and also more likely to blow out the cache, since
all stacks that have grown to the same depth have the same address
mod 2^n and so compete for the same cache locations.
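
To make the masking trick concrete, here is a minimal C sketch. It is an
illustration under stated assumptions, not code from Amber, Scheduler
Activations, or QuickThreads: the names (STACK_SHIFT, thread_ctx,
ctx_from_sp, stack_create) are hypothetical, the 16 KB stack size is
arbitrary, and an ordinary address inside the allocated block stands in
for the real stack pointer register.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define STACK_SHIFT 14                              /* n (hypothetical): 16 KB stacks */
#define STACK_SIZE  ((uintptr_t)1 << STACK_SHIFT)   /* alignment = 1<<n */
#define STACK_MASK  (~(STACK_SIZE - 1))             /* mask = ~((1<<n) - 1) */

/* Hypothetical per-thread context kept in the zeroth word(s) of the stack. */
struct thread_ctx {
    int id;
};

/* Recover the context from any address inside the thread's stack.
 * In a real system `sp` would be read from the stack pointer register. */
static struct thread_ctx *ctx_from_sp(uintptr_t sp)
{
    return (struct thread_ctx *)(sp & STACK_MASK);
}

/* Allocate a stack aligned to its own size (C11 aligned_alloc) and
 * store the context at its lowest address. Returns NULL on failure. */
static void *stack_create(int id)
{
    void *base = aligned_alloc(STACK_SIZE, STACK_SIZE);
    if (base != NULL)
        ((struct thread_ctx *)base)->id = id;
    return base;
}
```

Any sp value between base and base + STACK_SIZE - 1 masks back to base,
so no register has to be dedicated to the thread pointer. The cost is
one AND per lookup, plus the alignment constraint discussed above.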


>[You're going through the kernel to set up stacks anyway.]


Depending on your threading strategy, you can create threads without
stacks and give each one a stack lazily, when it first starts to run.
If you've got a lot of allocated threads but you can schedule them to
run so that each thread usually finishes before the next starts, then
you can recycle stacks quite effectively. See Engler's paper
referenced from the QuickThreads paper, above.
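
The lazy-allocation idea can be sketched roughly as follows. This is a
simplified illustration, not the scheme from Engler's paper or from
QuickThreads: every name here (struct thread, stack_get, stack_put,
thread_run, count_up) is hypothetical, the 4096-byte stack size is an
assumption, and a plain function call stands in for the context switch
onto the new stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define STACK_BYTES 4096            /* hypothetical stack size */

struct stack {
    struct stack *next;             /* free-list link, used only while idle */
};

struct thread {
    void (*fn)(void *);             /* what the thread will run */
    void *arg;
    void *stack;                    /* NULL until the thread starts */
};

static struct stack *free_stacks;   /* stacks waiting to be recycled */
static size_t stacks_created;       /* number of stacks ever malloc'ed */

/* Take a recycled stack if one is idle, otherwise allocate a fresh one. */
static void *stack_get(void)
{
    if (free_stacks != NULL) {
        struct stack *s = free_stacks;
        free_stacks = s->next;
        return s;
    }
    stacks_created++;
    return malloc(STACK_BYTES);
}

/* Return a stack to the free list for the next thread to reuse. */
static void stack_put(void *p)
{
    struct stack *s = p;
    s->next = free_stacks;
    free_stacks = s;
}

/* Creating a thread records what to run but allocates no stack. */
static void thread_init(struct thread *t, void (*fn)(void *), void *arg)
{
    t->fn = fn;
    t->arg = arg;
    t->stack = NULL;
}

/* Run a thread to completion. Returns 0 on success, -1 if no stack
 * could be allocated. */
static int thread_run(struct thread *t)
{
    assert(t->stack == NULL);       /* thread must not already be running */
    t->stack = stack_get();
    if (t->stack == NULL)
        return -1;
    t->fn(t->arg);                  /* stand-in for switching onto t->stack */
    stack_put(t->stack);
    t->stack = NULL;
    return 0;
}

/* Example thread body: increment the counter passed as the argument. */
static void count_up(void *arg)
{
    ++*(int *)arg;
}
```

If each thread finishes before the next one starts, any number of
thread_init/thread_run pairs get by with a single stack, which is the
recycling effect described above.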


As an aside, some systems have ~100K *active* threads; if a typical
stack is 4KB and a typical thread is using only 400B of it when it
blocks, then a 10x compaction is possible, at the expense of longer
context switch times. Still, 10x memory compaction can be the
difference between fitting in a system and not fitting. See the
QuickThreads paper for a few more details.
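
As a back-of-the-envelope check of those figures, the short C fragment
below works through the arithmetic. The function names are hypothetical,
and 4KB is taken as 4096 bytes, which is an assumption; with 4000 bytes
the ratio would be exactly 10.

```c
#include <assert.h>
#include <stdio.h>

/* Total stack memory for `threads` threads at `bytes_each` bytes apiece. */
static unsigned long long stack_bytes(unsigned long long threads,
                                      unsigned long long bytes_each)
{
    return threads * bytes_each;
}

/* Print the comparison for the figures quoted above. */
static void print_stack_budget(void)
{
    unsigned long long threads = 100000;                  /* ~100K active threads */
    unsigned long long full = stack_bytes(threads, 4096); /* 4KB per stack */
    unsigned long long live = stack_bytes(threads, 400);  /* 400B in use when blocked */

    assert(live < full);
    printf("full stacks: %llu bytes\n", full);            /* 409600000, about 400 MB */
    printf("compacted:   %llu bytes\n", live);            /* 40000000, about 40 MB */
    printf("ratio:       %.2f\n", (double)full / (double)live);
}
```

On a machine with a few tens of megabytes of memory, that difference
decides whether the thread population fits at all.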


;-D oN ( Threadening ) Pardo
--

