Re: How to auto-parallelize a binary code?

"cr88192" <>
Fri, 12 Jun 2009 11:57:56 -0700

          From comp.compilers

Related articles
How to auto-parallelize a binary code? (yunzhi) (2009-06-05)
Re: How to auto-parallelize a binary code? (Louis Krupp) (2009-06-06)
Re: How to auto-parallelize a binary code? (2009-06-07)
Re: How to auto-parallelize a binary code? (Jeremy Wright) (2009-06-08)
Re: How to auto-parallelize a binary code? (George Neuner) (2009-06-08)
Re: How to auto-parallelize a binary code? (cr88192) (2009-06-12)

From: "cr88192" <>
Newsgroups: comp.compilers
Date: Fri, 12 Jun 2009 11:57:56 -0700
References: 09-06-024 09-06-040
Keywords: parallel
Posted-Date: 14 Jun 2009 19:11:02 EDT

"George Neuner" <> wrote in message
> On Fri, 5 Jun 2009 23:52:35 -0700 (PDT), yunzhi <>
> wrote:
>>We have an important single-threaded application. It was developed
>>several years ago but the source code is gone. ...
>>I do know the single-thread performance will be improved. But how to
>>auto-parallelize this binary code? Are there any tools or related
> As others have said, rewriting the application is your best bet.
> As far as I know there is no current research on auto-parallelizing
> native binaries. Back in the 1980's some groups were looking into it,
> but rapid advances in single core performance killed interest in the
> idea. Virtually all of the tools available now work either at the
> source level or at the AST level in the compiler.
> If you are desperate for better performance, one thing you might look
> into is disassembling the program and re-assembling it to target the
> new processor architecture. Modern processors can do a certain amount
> of hardware instruction reordering to accommodate older code, but if
> you're leaping one or more processor generations, a peephole optimizer
> (sometimes called a "window" optimizer) specifically targeted to the
> new processor can make a big difference for single core performance.
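For anyone unfamiliar with the peephole ("window") optimizer mentioned in the quoted text, here is a minimal sketch of the idea in Python. Everything in it is an assumption made for illustration: the tuple instruction format, the three rewrite rules, and a toy machine with no condition flags (on real x86 some of these rewrites would change flag behaviour). A real optimizer tuned to a specific processor would carry a much larger, target-specific rule set.

```python
# Toy instruction format (hypothetical): (opcode, operand1, operand2)
#   ("store", reg, addr) writes reg to addr; ("load", reg, addr) reads addr into reg.

def peephole(code):
    """Scan the instruction list once, rewriting small local patterns."""
    out = []
    for ins in code:
        op, *args = ins
        # Rule 1: "mov r, r" has no effect, so drop it.
        if op == "mov" and args[0] == args[1]:
            continue
        # Rule 2: reloading a value that was just stored from the same
        # register is redundant, so drop the load.
        if op == "load" and out and out[-1] == ("store", args[0], args[1]):
            continue
        # Rule 3: multiply by 2 becomes a (usually cheaper) shift left by 1.
        if op == "mul" and args[1] == 2:
            out.append(("shl", args[0], 1))
            continue
        out.append(ins)
    return out


code = [
    ("mov", "eax", "eax"),
    ("mul", "ebx", 2),
    ("store", "ecx", "[x]"),
    ("load", "ecx", "[x]"),
    ("add", "eax", "ebx"),
]
optimized = peephole(code)
```

The window here is only one or two instructions wide; wider windows catch more patterns but need more care to stay correct.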

Though unlikely to help much in this case, a while ago I remember
running across a paper which described auto-parallelizing native
code.

Their idea was basically to disassemble the code and rework it into
SSA form, then do some "funky magic" on it to split it up into
parallel operations (fine-grained AFAIK), and then recompile the code
for the new target (in this case, specialized processor hardware; I
think the task was to get generic x86 code running on a multi-core
RISC processor. The paper also compared their approach with that taken
by Transmeta, ...).
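I can't reproduce what the paper actually did, but the general shape of "rework into SSA, then split into parallel operations" can be sketched roughly as below. This is my own toy illustration in Python, not the paper's method: the instruction tuples, the function names to_ssa and parallel_levels, and the "name.version" naming are all made up, and it only handles one straight-line block of register operations (no branches, no memory aliasing, no phi nodes).

```python
from collections import defaultdict

# Toy instruction format (hypothetical): (dest, opcode, src1, src2)

def to_ssa(block):
    """Rename destinations so that every name is assigned exactly once."""
    version = defaultdict(int)
    current = {}  # register -> its most recent SSA name
    out = []
    for dest, op, a, b in block:
        # Sources refer to the latest definition, or stay as-is if they
        # are inputs to the block (never written inside it).
        a = current.get(a, a)
        b = current.get(b, b)
        version[dest] += 1
        name = f"{dest}.{version[dest]}"
        current[dest] = name
        out.append((name, op, a, b))
    return out

def parallel_levels(ssa):
    """Group instructions into levels. An instruction only depends on
    values from earlier levels, so one level could run in parallel."""
    level_of = {}  # SSA name -> level; block inputs count as level 0
    levels = defaultdict(list)
    for dest, op, a, b in ssa:
        lvl = 1 + max(level_of.get(a, 0), level_of.get(b, 0))
        level_of[dest] = lvl
        levels[lvl].append((dest, op, a, b))
    return [levels[k] for k in sorted(levels)]


block = [
    ("eax", "add", "ebx", "ecx"),
    ("edx", "mul", "esi", "edi"),
    ("eax", "add", "eax", "edx"),
    ("ebx", "sub", "esi", "ecx"),  # reuses ebx, which was read above
]
ssa = to_ssa(block)
levels = parallel_levels(ssa)
```

Note that the last instruction overwrites ebx after the first one read it. In the original code that ordering constraint would keep them apart, but after renaming the two are independent and land in the same level. The hard parts (control flow, memory dependencies, indirect jumps) are exactly what this sketch leaves out.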

I don't remember that much more than this...

I guess the great difficulty would be to pull this off tolerably efficiently
on a conventional OS and HW (without the overhead by far exceeding any possible
gain).

Agreed, the OP should do whatever is possible to get a source-code version of
the app before it is too late, even rewriting the app if necessary.

However, I would not rule out decompilers: even if the output generally
sucks, it may be possible to work it into something that can at least be
recompiled (and serve as a starting point for a complete rewrite, done
"one function at a time"). I guess whether or not this would help depends
on how familiar the available staff are with whatever task the app
performs (if no one knows what exactly the app does or how it works, there
may be a problem; but if the task is familiar, a clean rewrite may be in
order).
