Newsgroups: comp.compilers
From: chase@centerline.com (David Chase)
Keywords: design, optimize, parallel, standards
Organization: CenterLine Software
References: 95-07-068 95-08-114
Date: Mon, 21 Aug 1995 14:58:23 GMT
graham.matthews@pell.anu.edu.au writes:
> chase@centerline.com (David Chase) writes:
> > Second, (once again) where are your measurements? How much potential
> > parallelism are we giving up here, relative to what we've got or
> > will get? How much current unreliability should we trade off
> > against some mythical future parallel performance?
> You are potentially giving up a huge amount of parallelism. Take the
> call,
>
> x = f(g(...), h(...))
What language do you think you are programming in? You've already lost
that huge amount of parallelism, so there's no point messing up the
language in order to retain it -- it's gone. Consider this program
fragment in C, which has the property that the sum of the elements of
"glob" is always zero:
static struct s {int x, y, z;} glob = {0, 0, 0};

int g(void) {
    struct s loc = glob;            /* whole-structure load: not atomic */
    loc.x += 2; loc.y--; loc.z--;   /* net change to the sum is zero */
    glob = loc;                     /* whole-structure store: not atomic */
    return 0;
}

int h(void) {
    struct s loc = glob;
    loc.x--; loc.y += 2; loc.z--;
    glob = loc;
    return 0;
}

int f(int dummy1, int dummy2) {
    struct s loc = glob;
    loc.x--; loc.y--; loc.z += 2;
    glob = loc;
    return loc.x + loc.y + loc.z;   /* zero whenever the invariant holds */
}
No matter what the SERIAL evaluation order of parameters to "f" is, the
property is maintained. As soon as you start evaluating function calls in
parallel, you have to deal with the non-atomicity of structure loads and
stores. You can lose the property that the sum of the elements is zero.
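To make that concrete, here is a rough sketch (using POSIX threads; the
harness, including the names "call_g" and "call_h", is mine and not part
of the fragment above) of evaluating the two argument expressions in
parallel before calling f(), the way a parallelizing compiler might treat
x = f(g(...), h(...)):

#include <pthread.h>
#include <stdio.h>

static void *call_g(void *result) { *(int *)result = g(); return NULL; }
static void *call_h(void *result) { *(int *)result = h(); return NULL; }

int main(void) {
    for (int i = 0; i < 100000; i++) {
        int rg, rh;
        pthread_t tg, th;
        pthread_create(&tg, NULL, call_g, &rg);   /* evaluate g(...) */
        pthread_create(&th, NULL, call_h, &rh);   /* evaluate h(...) in parallel */
        pthread_join(tg, NULL);
        pthread_join(th, NULL);
        if (f(rg, rh) != 0) {                     /* then call f */
            printf("sum of glob's elements is no longer zero\n");
            return 1;
        }
    }
    return 0;
}

Run that a few times on a multiprocessor and the interleaved structure
copies in g() and h() can lose an update, at which point the check fails.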
You can, of course, fix this by locking all structure accesses, but the
exercise of associating a lock with each arbitrary structure (without
messing up memory layout) is non-trivial, and locking on a multiprocessor
tends to be a wee bit less efficient than simple assignment.
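For illustration, here is roughly what the locking fix might look like
for g() (a sketch; the names "glob_lock" and "g_locked" are made up, and
h() and f() would need the same treatment). Note that the mutex has to
live beside the structure rather than inside it if you don't want to
disturb the layout:

#include <pthread.h>

static pthread_mutex_t glob_lock = PTHREAD_MUTEX_INITIALIZER;

int g_locked(void) {
    pthread_mutex_lock(&glob_lock);   /* every access now pays for lock/unlock */
    struct s loc = glob;
    loc.x += 2; loc.y--; loc.z--;
    glob = loc;
    pthread_mutex_unlock(&glob_lock);
    return 0;
}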
So, basically, such parallelism is already not practical; if I prescribe
an order of evaluation, I am not giving up a huge amount of parallelism.
If I'm programming in C or C++, I never had it.
Or, if you argue that with sufficient analysis structure-operation-free
regions of code can be identified, I can point out that with sufficient
analysis non-interfering functions can be made apparent to the compiler,
in which case you CAN evaluate them in parallel. (Remember that the
optimizer is only held to "as if" compliance -- it doesn't actually have
to do things in the prescribed order if it can prove that you cannot
tell the difference without a debugger.)
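For example (a sketch of my own, not something any particular compiler
is known to do), argument expressions like these are obviously
non-interfering, and a compiler that proves it may evaluate them in
parallel even under a prescribed left-to-right order:

static int square(int a) { return a * a; }       /* pure: touches no globals */
static int cube(int a)   { return a * a * a; }   /* pure: touches no globals */
static int add(int a, int b) { return a + b; }

int combine(int n) {
    /* square(n) and cube(n) share no state, so evaluating them in
     * parallel is indistinguishable from any serial order -- no
     * conforming program can tell the difference without a debugger. */
    return add(square(n), cube(n));
}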
speaking for myself,
David Chase