Re: Parallelizing C/C++ code

russell kym horsell <>
Thu, 6 May 2010 03:50:44 +0000 (UTC)

          From comp.compilers

Related articles
Parallelizing C/C++ code (Raphael Fonte Boa) (2010-04-29)
Re: Parallelizing C/C++ code (russell kym horsell) (2010-05-01)
Re: Parallelizing C/C++ code (Raphael Fonte Boa) (2010-05-03)
Re: Parallelizing C/C++ code (russell kym horsell) (2010-05-06)
Re: Parallelizing C/C++ code (George Neuner) (2010-05-07)
Re: Parallelizing C/C++ code (Randy Crawford) (2010-05-14)
Re: Parallelizing C/C++ code (kamal) (2010-05-27)

From: russell kym horsell <>
Newsgroups: comp.compilers
Date: Thu, 6 May 2010 03:50:44 +0000 (UTC)
Organization: Netfront
References: 10-04-071 10-05-002 10-05-022
Keywords: parallel, analysis
Posted-Date: 09 May 2010 12:16:58 EDT

Raphael Fonte Boa <> wrote:
> On May 1, 7:00 am, russell kym horsell <> wrote:
> > According to google there are 4.4 million hits for "parallelizing
> > compiler". :) So you know it's a huge area. Perhaps start with
> > wikipedia for an introduction.

> Hi Russel,
> Thanks for googling it for me :)

> Nevertheless, I think the problem for me lies more in the analysis
> area. Are the analysis for parallelism worth the effort for compilers
> technology? Googling for a tool that accomplishes such
> parallelization gives no result. I therefore imagine that it has no
> simple solution.

There is no simple answer to "is it worth it", either. :)

Even gcc does *some* kind of analysis to improve parallelism. Modern
x86 chips typically have several integer and floating-point pipelines, and
some consideration needs to be given to which ones may be made "too
busy" by a particular instruction sequence. (Even though the hardware
these days also tends to have sophisticated dynamic instruction scheduling.)
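
As a rough illustration of the pipeline point, here is a minimal sketch in C.
The function names are made up for this example. The first version funnels
every add through one accumulator, so each add waits on the previous one; the
second keeps four independent running sums, which gives a superscalar core a
chance to overlap them. Whether it helps on a given chip, and whether the
compiler already does something like this by itself, is something to measure
rather than assume.

```c
#include <stddef.h>

/* One accumulator: every add depends on the result of the previous one. */
long sum_serial(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four independent accumulators: the adds inside one iteration do not
   depend on each other, so several integer units may work at once.
   (Hypothetical example, not taken from any particular compiler.) */
long sum_split(const int *a, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)      /* leftover elements when n is not a multiple of 4 */
        s0 += a[i];

    return s0 + s1 + s2 + s3;
}
```

Integer addition is used on purpose: the two versions give the same answer.
With floating point the regrouping can change rounding, which is one reason a
compiler will not normally make this change unless told it may.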

It's not uncommon for simple C programs to perform an order of
magnitude better with high optimisation settings, or even (the thing
I tend to do) a factor of 10 better using low optimisation and
hand-written "asm" here and there.

So is an order of magnitude "worth" a few seconds of compile time and
maybe a 2x-10x increase in compiler complexity?

I think the market consensus is "yes".

Of course, high degrees of parallelism are another thing. Perhaps
the chief reason that is left to packages is that hitherto there was not
much of a "standard multiprocessor" around. It's quite a trick for a
compiler to handle optimisation for shared memory, message passing,
token ring, etc.

But that is about to change with things like GPU-based computing
(aka "ubiquitous parallelism") just starting to take off.

Finally, quite a bit of optimisation can be handled outside compilers.
One of my sometime interests is partial evaluation and source-to-source
transformation techniques.
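
For readers who have not met partial evaluation, a small sketch in C may help.
Both functions are invented for illustration. power() is the general routine;
power_5() is the "residual" program you are left with once the exponent is
fixed at 5 and everything that depends only on it (the loop, the bit tests) is
worked out ahead of time. A real partial evaluator would produce something like
this mechanically; here it is done by hand.

```c
/* General routine: the exponent is only known at run time. */
double power(double x, unsigned n)
{
    double r = 1.0;
    while (n > 0) {
        if (n & 1u)     /* low bit set: fold the current square into r */
            r *= x;
        x *= x;
        n >>= 1;
    }
    return r;
}

/* Residual program after specialising power() for n == 5 by hand:
   the loop and all tests on n have been evaluated away. */
double power_5(double x)
{
    double x2 = x * x;
    double x4 = x2 * x2;
    return x * x4;
}
```

A caller that always raises to the fifth power can use power_5(x) in place of
power(x, 5).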

In high-performance computing, it's usually worthwhile to just suck-and-see
a large-grained hack of the original s/w, some loop unrolling in selected
routines, etc, just in case you can save 10% of the runtime.
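
To show the sort of hand unrolling meant here, a minimal sketch in C follows.
The routine and its name are hypothetical, not from any particular package.
The second version is the first with the loop body written out four times per
trip, plus a clean-up loop for the remainder. Any gain depends on the compiler
and the machine, so time both before keeping the change.

```c
#include <stddef.h>

/* Straightforward version: y[i] += a * x[i] for every element. */
void scale_add(double *y, const double *x, double a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Same computation, unrolled by four by hand. */
void scale_add_unrolled(double *y, const double *x, double a, size_t n)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
    for (; i < n; i++)      /* remainder when n is not a multiple of 4 */
        y[i] += a * x[i];
}
```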

Try some source-to-source transform; if that's no good, stick with ScaLAPACK
and MPI or whatever.
