Related articles:
Parallelizing C/C++ code  rfonteboa@gmail.com (Raphael Fonte Boa) (2010-04-29)
Re: Parallelizing C/C++ code  kym@sdf.lonestar.org (russell kym horsell) (2010-05-01)
Re: Parallelizing C/C++ code  rfonteboa@gmail.com (Raphael Fonte Boa) (2010-05-03)
Re: Parallelizing C/C++ code  kym@sdf.lonestar.org (russell kym horsell) (2010-05-06)
Re: Parallelizing C/C++ code  gneuner2@comcast.net (George Neuner) (2010-05-07)
Re: Parallelizing C/C++ code  joe@burgershack.org (Randy Crawford) (2010-05-14)
Re: Parallelizing C/C++ code  kamalpr@gmail.com (kamal) (2010-05-27)
From: russell kym horsell <kym@sdf.lonestar.org>
Newsgroups: comp.compilers
Date: Thu, 6 May 2010 03:50:44 +0000 (UTC)
Organization: Netfront http://www.netfront.net/
References: 10-04-071 10-05-002 10-05-022
Keywords: parallel, analysis
Posted-Date: 09 May 2010 12:16:58 EDT
Raphael Fonte Boa <rfonteboa@gmail.com> wrote:
> On May 1, 7:00 am, russell kym horsell <k...@sdf.lonestar.org> wrote:
> > According to google there are 4.4 million hits for "parallelizing
> > compiler". :) So you know it's a huge area. Perhaps start with
> > wikipedia for an introduction.
> Hi Russell,
> Thanks for googling it for me :)
> Nevertheless, I think the problem for me lies more in the analysis
> area. Are the analyses for parallelism worth the effort in compiler
> technology? Googling for a tool that accomplishes such
> parallelization gives no results. I therefore imagine that it has no
> simple solution.
There is no simple answer to "is it worth it", either. :)
Even gcc does *some* kind of analysis to improve parallelism. Modern
x86 chips typically have several integer and floating-point pipelines,
and some consideration needs to be given to which ones may be made
"too busy" by a particular instruction sequence. (Even though the
hardware these days also tends to have sophisticated dynamic
instruction scheduling anyway.)
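As a rough illustration (mine, not gcc's actual output), the kind of
schedule-friendly code this analysis aims for can also be mimicked by
hand: splitting a reduction across independent accumulators gives the
scheduler two dependency chains to keep separate pipelines busy.

    /* Sketch only: a naive sum vs. one with two independent
     * accumulators, so additions can overlap on a superscalar core.
     * The reassociation changes floating-point rounding, which is why
     * a compiler needs something like -ffast-math to do it for you. */
    double sum_naive(const double *a, int n)
    {
        double s = 0.0;
        for (int i = 0; i < n; i++)
            s += a[i];            /* each add waits for the previous one */
        return s;
    }

    double sum_two_chains(const double *a, int n)
    {
        double s0 = 0.0, s1 = 0.0;
        int i;
        for (i = 0; i + 1 < n; i += 2) {
            s0 += a[i];           /* chain 0 */
            s1 += a[i + 1];       /* chain 1, independent of chain 0 */
        }
        if (i < n)
            s0 += a[i];           /* leftover element when n is odd */
        return s0 + s1;
    }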
It's not uncommon for simple C programs to perform an order of
magnitude better with high optimisation settings, or even (for the
things I tend to do) a factor of 10 using low optimisation plus
hand-written "asm" here and there.
So is an order of magnitude "worth" a few seconds of compile time and
maybe a 2x-10x increase in compiler complexity?
I think the market consensus is "yes".
Of course, high degrees of parallelism are another thing, and perhaps
the chief reason they are still left to libraries is that, until
recently, there was not much of a "standard multiprocessor" around.
It's quite a trick for a compiler to handle optimisation for shared
memory, message passing, token ring, etc.
But that is about to change, with things like GPU-based computing
(a.k.a. "ubiquitous parallelism") just starting to take off.
Finally, quite a bit of optimisation can be handled outside compilers.
One of my sometime interests is partial evaluation and source-to-source
transformation techniques.
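For example (again my sketch, not from the original post), partial
evaluation means specialising a general routine once some of its
inputs are known before run time:

    /* General routine: raise x to an arbitrary non-negative power n. */
    double power(double x, unsigned n)
    {
        double r = 1.0;
        while (n--)
            r *= x;
        return r;
    }

    /* What a partial evaluator (or a source-to-source tool) could emit
     * when n is known to be 4: the loop and counter are gone, leaving
     * straight-line code that later passes can schedule freely. */
    double power4(double x)
    {
        double x2 = x * x;
        return x2 * x2;
    }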
In high-performance computing it's usually worthwhile to just
suck-it-and-see: a coarse-grained hack of the original software, some
loop unrolling in selected routines, etc., just in case you can save
10% of the runtime.
Try some source-to-source transform; if it's no good, stick with
ScaLAPACK and MPI or whatever.