Re: sequential binary parallelization at run-time


From: George Neuner <gneuner2@comcast.net>
Newsgroups: comp.compilers
Date: Sat, 27 Oct 2007 02:11:47 -0400
Organization: Compilers Central
References: 07-10-082
Keywords: parallel, optimize
Posted-Date: 27 Oct 2007 02:14:28 EDT

On Thu, 25 Oct 2007 11:38:38 -0000, Lauren <wangli.one@gmail.com>
wrote:


> I find that dynamic binary optimization is very popular in virtual
>machine techniques. Most of the papers I have found apply
>conventional compiler optimizations to a sequential binary to get
>some speedup. But in the multicore or manycore era, how shall we
>exploit parallelism at various granularities at run-time? How can we
>extract parallelism from a serial binary? I can't find any papers
>about parallelizing sequential binaries at run-time.
>
> Thanks very much!
>[If anyone has the answer to this question, I know a whole lot of people
>who'd like to talk to you. -John]


No real answer, but a few comments.


JIT compilers can do simple things to sequential code, like unrolling
loops, but adaptively exploiting parallel hardware at runtime
(striping and migrating computations, etc.) requires restructuring
the code to expose parallelism and to make communication and
synchronization explicit. That isn't currently feasible at runtime
because it requires quite a bit of complex analysis.
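

To make "explicit" concrete, here is what the restructuring looks
like when done by hand for a trivial summation loop - plain POSIX
threads, with made-up names throughout. The striping, the stripe
descriptor that carries each partial result back, and the join are
exactly what a runtime parallelizer would have to synthesize on its
own:

#include <pthread.h>
#include <stdio.h>

#define N 1000000
#define NTHREADS 4

static double a[N];

/* explicit communication: each thread gets a stripe descriptor in
   and passes its partial sum back out through it */
struct stripe { int lo, hi; double partial; };

static void *sum_stripe(void *arg)
{
    struct stripe *s = arg;
    double acc = 0.0;
    for (int i = s->lo; i < s->hi; i++)  /* each thread owns a stripe */
        acc += a[i];
    s->partial = acc;
    return NULL;
}

int main(void)
{
    pthread_t tid[NTHREADS];
    struct stripe s[NTHREADS];

    for (int i = 0; i < N; i++)
        a[i] = 1.0;

    /* stripe the iteration space across the available CPUs */
    for (int t = 0; t < NTHREADS; t++) {
        s[t].lo = t * (N / NTHREADS);
        s[t].hi = (t == NTHREADS - 1) ? N : (t + 1) * (N / NTHREADS);
        pthread_create(&tid[t], NULL, sum_stripe, &s[t]);
    }

    /* explicit synchronization: join, then combine the partials */
    double sum = 0.0;
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        sum += s[t].partial;
    }
    printf("sum = %f\n", sum);
    return 0;
}

Note how much scaffolding even this trivial case needs compared to
the three-line loop it replaces.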


I'm not aware of any good papers on the subject, but a place to start
would be to look at the implementation of languages like Caml and
pHaskell and at adaptive parallel runtimes like the one used by Cilk
(but ignore the Cilk language itself). I don't think any of these are
the answer, but they have some good ideas.


I've been thinking about adaptive parallelization for a while and my
own opinion, FWIW, is that the right way to accomplish it on closely
coupled shared memory systems is to aggressively compile
call-by-need, make each call-by-need function an independently
schedulable entity (i.e., a microthread), and structure the code as a
state machine graph of microthreads ordered by data dependencies.
The parallel runtime would operate much like an NFA, advancing to new
states as dependencies are satisfied and scheduling runnable
microthreads across the available CPUs.
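

Here is a very rough hand sketch of the runtime side of that idea -
toy C, purely illustrative, with a single locked queue standing in
for what should really be per-CPU work-stealing deques. Each
microthread node carries a count of unsatisfied dependencies;
finishing a node decrements its successors' counts, and any node that
hits zero becomes runnable:

#include <pthread.h>
#include <stdio.h>

#define MAX_SUCC 4
#define QSIZE 64

/* one "microthread": a schedulable function plus its place in the
   dependency graph */
typedef struct node {
    void (*fn)(struct node *);
    int pending;                  /* unsatisfied data dependencies */
    pthread_mutex_t lock;
    struct node *succ[MAX_SUCC];  /* consumers of our result */
    int nsucc;
} node_t;

/* toy ready queue; a real runtime would want work stealing, as in
   the Cilk runtime */
static node_t *ready[QSIZE];
static int head, tail;
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;

static void enqueue(node_t *n)
{
    pthread_mutex_lock(&qlock);
    ready[tail++ % QSIZE] = n;
    pthread_mutex_unlock(&qlock);
}

static node_t *dequeue(void)
{
    node_t *n = NULL;
    pthread_mutex_lock(&qlock);
    if (head < tail)
        n = ready[head++ % QSIZE];
    pthread_mutex_unlock(&qlock);
    return n;
}

/* run one microthread, then advance the state machine: satisfy one
   dependency of each successor, scheduling any that become ready */
static void run_node(node_t *n)
{
    n->fn(n);
    for (int i = 0; i < n->nsucc; i++) {
        node_t *s = n->succ[i];
        pthread_mutex_lock(&s->lock);
        int now_ready = (--s->pending == 0);
        pthread_mutex_unlock(&s->lock);
        if (now_ready)
            enqueue(s);
    }
}

static void *worker(void *arg)
{
    (void)arg;
    node_t *n;
    /* toy termination: a worker quits when it sees an empty queue */
    while ((n = dequeue()) != NULL)
        run_node(n);
    return NULL;
}

static void say(node_t *n) { (void)n; puts("microthread ran"); }

int main(void)
{
    /* b depends on a: b starts with one unsatisfied dependency */
    node_t a = { say, 0, PTHREAD_MUTEX_INITIALIZER, {0}, 0 };
    node_t b = { say, 1, PTHREAD_MUTEX_INITIALIZER, {0}, 0 };
    a.succ[a.nsucc++] = &b;

    enqueue(&a);                  /* only a is runnable at the start */

    pthread_t w[2];
    for (int i = 0; i < 2; i++)
        pthread_create(&w[i], NULL, worker, NULL);
    for (int i = 0; i < 2; i++)
        pthread_join(w[i], NULL);
    return 0;
}

The node layout, queue, and termination rule here are all
placeholders; the point is only the shape of the scheduling loop.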


I'm slowly working on a proof of concept (and no, I haven't written
anything about it). It will be a while yet before I know if I'm on to
something or just fooling myself. However, I'm not aware of any other
effort currently heading in the same direction. Few languages
implement call-by-need, and those that do try to avoid it as much as
possible because aggressively creating suspended functions is
inefficient on a single CPU (sketched below). But I don't think that
matters much: in a few years most CPUs will be multicore, and
adaptive scalability will become more important than single-core
speed. I'm betting most people are willing to take a hit on a single
core if their programs will automatically scale fairly well to
multiple cores.
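

For anyone unfamiliar with the cost I mean: under call-by-need, each
delayed expression becomes a heap-allocated suspension (a "thunk")
that is forced at most once. In illustrative C - nothing here is
from any real implementation:

#include <stdlib.h>

/* a suspended computation, forced at most once */
typedef struct thunk {
    long (*compute)(void *env);
    void *env;      /* captured free variables */
    long value;     /* memoized result */
    int forced;
} thunk_t;

/* every call-by-need call site pays for one of these allocations -
   the single-CPU overhead mentioned above */
static thunk_t *suspend(long (*fn)(void *), void *env)
{
    thunk_t *t = malloc(sizeof *t);
    t->compute = fn;
    t->env = env;
    t->forced = 0;
    return t;
}

static long force(thunk_t *t)
{
    if (!t->forced) {             /* first demand does the work */
        t->value = t->compute(t->env);
        t->forced = 1;
    }
    return t->value;              /* later demands just reuse it */
}

static long square(void *env) { return *(long *)env * *(long *)env; }

int main(void)
{
    long x = 7;
    thunk_t *t = suspend(square, &x);  /* nothing evaluated yet */
    long r = force(t) + force(t);      /* computed once, read twice */
    free(t);
    return r == 98 ? 0 : 1;
}

Every call site pays for an allocation and an indirect call that
strict evaluation gets for free - that is the single-core hit I
expect to be worth taking.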


George

