Unpopular parameter passing mechanisms and parallelisation (was Re: sequential binary parallelization at run-time)

Anton Lokhmotov <al407@cam.ac.uk>
Sun, 28 Oct 2007 19:06:41 +0000

          From comp.compilers

Related articles
sequential binary parallelization at run-time wangli.one@gmail.com (Lauren) (2007-10-25)
Re: sequential binary parallelization at run-time gneuner2@comcast.net (George Neuner) (2007-10-27)
Unpopular parameter passing mechanisms and parallelisation (was Re: se al407@cam.ac.uk (Anton Lokhmotov) (2007-10-28)
Re: Unpopular parameter passing mechanisms and parallelisation torbenm@tyr.diku.dk (2007-10-29)
Re: Unpopular parameter passing mechanisms and parallelisation haberg@math.su.se (2007-10-29)
Re: Unpopular parameter passing mechanisms and parallelisation (was Re gneuner2@comcast.net (George Neuner) (2007-10-30)
Re: Unpopular parameter passing mechanisms and parallelisation (was Re gah@ugcs.caltech.edu (glen herrmannsfeldt) (2007-10-30)
Re: Unpopular parameter passing mechanisms and parallelisation al407@cam.ac.uk (Anton Lokhmotov) (2007-10-31)
Re: Unpopular parameter passing mechanisms and parallelisation wangli.one@gmail.com (Lauren) (2007-11-01)
[1 later articles]

From: Anton Lokhmotov <al407@cam.ac.uk>
Newsgroups: comp.compilers
Date: Sun, 28 Oct 2007 19:06:41 +0000
Organization: Compilers Central
References: 07-10-082 07-10-089
Keywords: design
Posted-Date: 28 Oct 2007 17:29:37 EDT

George Neuner wrote:
> I've been thinking about adaptive parallelization for a while and my
> own opinion, FWIW, is that the right way to accomplish it on
> close-coupled shared memory systems is to aggressively compile
> call-by-need, make each call-by-need function an independently
> schedulable entity (ie. a microthread), and structure the code as a
> state machine graph of microthreads ordered by data dependencies...


> Few languages currently implement call-by-need and those that do try
> to avoid it as much as possible because aggressively creating
> functions is inefficient for a single CPU.


It's interesting you mention that call-by-need (CBN) parameter passing
semantics is not particularly popular today. I'm not an expert in CBN,
but my impression is that it only makes sense in functional languages
(or languages that allow one to specify that a function is pure,
i.e. has no side effects), and we know that functional programs are
well suited to parallelisation.
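

To make CBN concrete (this is just my own sketch in plain C, not
anything George proposed): a compiler can wrap each argument
expression in a thunk that is evaluated at most once, on first use.
In a pure language each such thunk depends only on its inputs, which
is what makes it a candidate for an independently schedulable
microthread.

#include <stdio.h>

/* A thunk: an unevaluated argument plus a one-shot cache. */
typedef struct {
    int (*compute)(void);    /* how to produce the value          */
    int value;                /* cached result                     */
    int forced;               /* has compute() been run yet?       */
} thunk;

static int force(thunk *t) {  /* evaluate at most once             */
    if (!t->forced) {
        t->value = t->compute();
        t->forced = 1;
    }
    return t->value;
}

static int expensive(void) {
    printf("evaluating argument\n");  /* shows single evaluation   */
    return 42;
}

/* The callee receives the thunk, not the value; the argument is
   only evaluated if and when it is actually needed. */
static int callee(thunk *a, int need_it) {
    if (!need_it)
        return 0;                     /* argument never evaluated  */
    return force(a) + force(a);       /* evaluated once, reused    */
}

int main(void) {
    thunk t = { expensive, 0, 0 };
    printf("%d\n", callee(&t, 1));    /* "evaluating argument", then 84 */
    return 0;
}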


As for imperative programming, another undeservedly abandoned
parameter passing semantics is call-by-value-result (CBVR). In CBVR,
the callee's writes to the caller's memory do not take effect until
the callee returns, but the callee keeps track of them locally. For
example:


fCBVR(int a) { // copy in
      a = 1; // delay write to the caller's memory but make a local copy
      print a; // prints 1 by reading from the local copy
} // copy out


main() {
      int z;
      z = 0;
      fCBVR(z); // update z
      print z; // prints 1
}
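

Since C itself only has call-by-value, the copy-in/copy-out above has
to be simulated; a minimal sketch of how a compiler might lower fCBVR
to plain C (my own illustration) is:

#include <stdio.h>

/* What a compiler might generate for fCBVR: the caller passes the
   address of z, the callee works on a local copy, and the copy is
   written back on return. */
static void fCBVR_lowered(int *caller_a) {
    int a = *caller_a;          /* copy in                          */
    a = 1;                      /* writes go to the local copy      */
    printf("%d\n", a);          /* prints 1                         */
    *caller_a = a;              /* copy out                         */
}

int main(void) {
    int z = 0;
    fCBVR_lowered(&z);          /* z is updated on return           */
    printf("%d\n", z);          /* prints 1                         */
    return 0;
}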


The designers of the recent Sequoia programming language
(http://sequoia.stanford.edu) define a task (the principal construct
in the language) as a side-effect-free function with CBVR parameter
passing semantics. They admit that CBVR is not common in modern
languages, but argue that it is natural on machines with
software-controlled memory transfers between distinct physical
memories (e.g. the Sony/Toshiba/IBM Cell). I agree with this, because
CBVR allows one to read input data in, compute, and then write results
out, with a potential for optimising transfers between the distinct
memories.
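

To make the read-in/compute/write-out pattern concrete, here is a
rough sketch in plain C (my own illustration; local_buf stands in for
a fast local store such as an SPE's, and memcpy stands in for a
software-controlled DMA transfer):

#include <string.h>

#define N 256

/* A CBVR-style task on a machine with a separate local memory:
   copy the working set in, compute locally, copy the results out.
   Assumes n <= N. */
void scale_task(float *main_mem, int n, float factor) {
    float local_buf[N];                              /* "local store" */
    memcpy(local_buf, main_mem, n * sizeof(float));  /* copy in       */
    for (int i = 0; i < n; i++)
        local_buf[i] *= factor;                      /* compute       */
    memcpy(main_mem, local_buf, n * sizeof(float));  /* copy out      */
}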


In our recent research on Codeplay's sieve construct
(http://www.cl.cam.ac.uk/~al407/research/papers/eupar07.pdf) we coined
the term call-by-value-delay-result (CBVDR) to explain the semantics
of sieve blocks (cf. Sequoia's tasks). CBVDR is a variation of CBVR in
which the delayed writes to the caller's memory are not visible even
to the callee itself. For example:


fCBVDR(int a) { // copy in
      a = 1; // delay write to the caller's memory
      print a; // prints 0: the write is delayed and, unlike CBVR, not applied locally
} // copy out


main() {
      int z;
      z = 0;
      fCBVDR(z); // update z
      print z; // prints 1
}
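

One way to picture the difference from CBVR is that under CBVDR the
callee's writes are buffered and only committed at the end of the
block. A rough sketch of such a lowering in plain C (my own
illustration, not how Codeplay's compiler actually implements sieve
blocks):

#include <stdio.h>

#define MAX_DELAYED 16

/* A queue of delayed writes: where to write and what. */
static struct { int *addr; int val; } delayed[MAX_DELAYED];
static int n_delayed = 0;

static void delayed_write(int *addr, int val) {  /* record, don't perform */
    delayed[n_delayed].addr = addr;
    delayed[n_delayed].val  = val;
    n_delayed++;
}

static void commit_writes(void) {                /* "copy out" at block exit */
    for (int i = 0; i < n_delayed; i++)
        *delayed[i].addr = delayed[i].val;
    n_delayed = 0;
}

/* CBVDR-style callee: its own writes are not yet visible to it. */
static void fCBVDR_lowered(int *a) {
    delayed_write(a, 1);          /* delay write to the caller's memory   */
    printf("%d\n", *a);           /* prints 0: the write is not visible   */
}

int main(void) {
    int z = 0;
    fCBVDR_lowered(&z);
    commit_writes();              /* copy out                             */
    printf("%d\n", z);            /* prints 1                             */
    return 0;
}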


This semantics is similar to that of vector assignment statements in
Fortran 90 (which we discussed here around August), and it allows the
callee's code to be parallelised automatically even if it contains
references or pointers that can potentially alias (aliasing is what
complicates dependence analysis in mainstream programming languages).
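

For instance, a loop that writes through one pointer and reads
through another cannot normally be parallelised without proving that
the pointers do not alias; under delayed-write semantics every
iteration reads the old values, so the iterations are independent by
construction. A sketch of the effect in plain C (my own illustration):

#include <string.h>

/* With ordinary semantics this loop cannot safely be parallelised
   (or vectorised) when dst and src may overlap, because iteration i
   may read a value written by an earlier iteration. */
void shift_unsafe(int *dst, int *src, int n) {
    for (int i = 0; i < n; i++)
        dst[i] = src[i] + 1;
}

/* Under delayed-write (CBVDR-like) semantics every iteration reads
   only the original values; the effect is as if the writes were
   buffered and committed afterwards, so the iterations may run in
   parallel regardless of aliasing. Assumes n <= 1024. */
void shift_delayed(int *dst, int *src, int n) {
    int buf[1024];
    for (int i = 0; i < n; i++)          /* reads see only old data  */
        buf[i] = src[i] + 1;
    memcpy(dst, buf, n * sizeof(int));   /* commit ("copy out")      */
}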


In Sequoia, a task cannot contain references or pointers, the
intention being that a task's definition should make its working set
manifest, for which the language provides the in/out/inout parameter
specifiers. Moreover, aliasing of task arguments results in undefined
behaviour. Thus, the language restricts task parameters in much the
same way as the 'restrict' keyword in C99, which should likewise
enable automatic parallelisation of task code.
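

For comparison, here is what the analogous promise looks like with
C99's 'restrict' (my own example, not taken from the Sequoia papers):
the programmer asserts non-aliasing and the compiler is then free to
vectorise or parallelise.

/* 'restrict' asserts that out and in do not alias, so the compiler
   may vectorise or parallelise the loop without a run-time
   dependence check; passing overlapping pointers would be undefined
   behaviour, much like aliased Sequoia task arguments. */
void saxpy(int n, float a,
           float * restrict out, const float * restrict in) {
    for (int i = 0; i < n; i++)
        out[i] = a * in[i] + out[i];
}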


I think that, while the sieve construct elegantly addresses both
programming for machines with hierarchical memories and the problem
of aliasing in a single construct, Sequoia's tasks keep these concerns
separate. Overall, both approaches are aimed at the same problem, and
we are likely to see more variations on this theme in future
programming languages.


Anton.
--
http://www.cl.cam.ac.uk/~al407


