Newsgroups: | comp.compilers |
From: | Stefan Monnier <stefan.monnier@epfl.ch> |
Keywords: | parallel, C, comment |
Organization: | Ecole Polytechnique Federale de Lausanne |
References: | 95-08-078 95-08-128 |
Date: | Mon, 21 Aug 1995 08:31:19 GMT |
In article 95-08-128,
Michael McNamara <mac@verilog.com, mac@giraffe.asd.sgi.com> wrote:
] I recognize that one can dedicate a register to hold one's
] thread number, thus avoid the os call; but then consider the cost of
] removing a register from register allocator's pool.
Registers are not *that* scarce!
] Moreover, one still incurs the cost of the array index to get
] one's own foo, and potentially the false cache line sharing problem
] if one packs the array of thread local data in a data major order,
] rather than a thread major order.
Oh, come on! You wouldn't keep a thread number in your register, but
a thread-object pointer, with all the thread-local objects part of the
thread object. So you don't need your array (I hate arrays, because I
fear you might set an arbitrary limit on the number of threads just
so that you can statically allocate your array). Also don't forget
that the extra register doesn't always have to be used: the pointer
would just be an additional parameter to the main function of the
thread and would only be passed on to the functions that need it.
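As a rough sketch of the thread-object idea above, in C with POSIX threads (the names `thread_obj`, `bump_foo`, `square`, `thread_main`, and `run_threads` are made up for illustration and do not come from the thread): each thread receives a pointer to its own object as the start argument, and only functions that touch thread-local state take that pointer. Whether the pointer ends up in a register is left to the compiler.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread object: every "thread static" lives in here. */
struct thread_obj {
    int  id;   /* illustrative field */
    long foo;  /* stands in for the thread-local variable under discussion */
};

/* Functions that touch thread-local state take the pointer explicitly. */
static void bump_foo(struct thread_obj *self, long amount)
{
    self->foo += amount;
}

/* Functions that need no thread-local state pay nothing extra. */
static long square(long x)
{
    return x * x;
}

/* Thread main: the thread object arrives as the ordinary start argument. */
static void *thread_main(void *arg)
{
    struct thread_obj *self = arg;
    assert(self != NULL);
    for (long i = 1; i <= 3; i++)
        bump_foo(self, square(i));  /* adds 1, then 4, then 9 */
    return NULL;
}

/* Start n threads, join them, and return the sum of their foo fields.
 * Returns -1 on bad input or if allocation or thread creation fails.
 * The arrays are demo bookkeeping only and are sized at run time. */
long run_threads(int n)
{
    if (n <= 0)
        return -1;

    pthread_t *tids = malloc((size_t)n * sizeof *tids);
    struct thread_obj *objs = calloc((size_t)n, sizeof *objs);
    long total = -1;

    if (tids != NULL && objs != NULL) {
        int started = 0;
        for (; started < n; started++) {
            objs[started].id = started;
            if (pthread_create(&tids[started], NULL,
                               thread_main, &objs[started]) != 0)
                break;
        }
        long sum = 0;
        for (int i = 0; i < started; i++) {
            pthread_join(tids[i], NULL);
            sum += objs[i].foo;
        }
        if (started == n)
            total = sum;
    }

    free(objs);
    free(tids);
    return total;
}
```

Calling `run_threads(4)` starts four threads; each leaves 1 + 4 + 9 = 14 in its own `foo`, so the call returns 56. The thread count is a run-time value, so no static limit is baked in.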
And don't forget a few details with your scheme:
- taking the address of a thread-local variable has to be done
carefully, since this address cannot be passed to another thread
(well, it can, but there it points to the other thread's own copy of
the variable. Very subtle bugs expected!)
- thread creation/destruction and context switches have to go through the
kernel: this imposes a minimum weight on your threads. But that's OK, since
most thread packages need to go into the kernel in order to set up the
extensible stack.
- going through the kernel is one thing, but your scheme also requires
changing the page table at every context switch (and thread
creation, etc.). This can be expensive, especially if it requires
some cache flushes. These threads are looking real fat!
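The first caveat can be seen by contrast. With an explicit thread object, the address of a thread-local field is an ordinary address that means the same thing in every thread, so handing it to another thread behaves as expected; under a per-thread page-mapping scheme the same address would silently name the receiver's copy instead. A minimal sketch of the first case, assuming POSIX threads and hypothetical names (`thread_obj`, `request`, `writer_main`, `write_via_other_thread`):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread object holding the thread-local data. */
struct thread_obj {
    long foo;
};

/* Message handed to the helper thread: where to write, and what. */
struct request {
    long *target;  /* address of a field inside the caller's thread object */
    long  value;
};

/* Helper thread: stores through the pointer it was given. */
static void *writer_main(void *arg)
{
    struct request *req = arg;
    assert(req != NULL && req->target != NULL);
    *req->target = req->value;
    return NULL;
}

/* Pass the address of the caller's foo to another thread and report
 * what the caller sees afterwards. Returns -1 if the thread cannot
 * be created. */
long write_via_other_thread(long value)
{
    struct thread_obj self = { .foo = 0 };  /* the caller's thread object */
    struct request req = { .target = &self.foo, .value = value };
    pthread_t tid;

    if (pthread_create(&tid, NULL, writer_main, &req) != 0)
        return -1;
    pthread_join(tid, NULL);  /* join orders the helper's store before the read */
    return self.foo;
}
```

Here `write_via_other_thread(42)` returns 42: the helper thread's store lands in the caller's `foo`, which is exactly what would not happen if the address were remapped per thread.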
I'm not saying fat threads are bad, but your neat trick can make your
system slower than one using an additional register that points to
threadlocals, depending on the grain of the parallelism.
Tradeoffs...tradeoffs...
Stefan
[It's certainly true that if all threads share the same address
space, it's possible to switch threads without a kernel context
switch, which can be a performance boon in some cases. -John]