SUMMARY: Loop transformations with unimodular matrices

assmann@karlsruhe.gmd.de (Uwe Assmann)
Tue, 6 Apr 1993 10:48:59 GMT

From comp.compilers

Related articles
Graphs generated by predicates assmann@karlsruhe.gmd.de (1993-04-05)
*SUMMARY: Loop transformations with unimodular matrices assmann@karlsruhe.gmd.de* (1993-04-06)**
Re: SUMMARY: Loop transformations with unimodul darte@ens.ens-lyon.fr (1993-04-07)

| List of all articles for this month |

Newsgroups:	comp.compilers
From:	assmann@karlsruhe.gmd.de (Uwe Assmann)
Keywords:	optimize, summary, theory
Organization:	GMD Forschungsstelle an der Universitaet Karlsruhe
References:	93-04-020
Date:	Tue, 6 Apr 1993 10:48:59 GMT

Here comes a summary of answers concerning the description of loop
transformations with unimodular matrices. Thanks to all who have
responded.

------------------------------------------------------------------------------
From: pugh@cs.umd.edu (Bill Pugh)

Actually, it was M. E. Wolf, not M. Wolfe (confusing, isn't it?). Here
are a couple of references on the subject.

U. Banerjee.
  Unimodular transformations of double loops.
  In {\em Proc. of the 3rd Workshop on Programming Languages and
    Compilers for Parallel Computing}, Irvine, CA, August 1990.

Paul Feautrier.
  Some efficient solutions to the affine scheduling problem, part i,
    one-dimensional time.
  Technical Report 92.28, IBP/MASI, April 1992.

Paul Feautrier.
  Some efficient solutions to the affine scheduling problem, part ii,
    multi-dimensional time.
  Technical Report 92.78, IBP/MASI, Oct 1992.

Wayne Kelly and William Pugh,
  Generating Schedules and Code within a
Unified Reordering Transformation Framework,
    Technical Report CS-TR-2995,
    Dept. of Computer Science, University of Maryland, College Park,
    November, 1992

K. G. Kumar, D. Kulkarni, and A. Basu.
  Deriving good transformations for mapping nested loops on hieracical
    parallel machines in polynomial time.
  In {\em Proc. of the 1992 International Conference on
    Supercomputing}, July 1992.

Wei Li and Keshav Pingali.
  A singular loop transformation framework based on non-singular
    matrices.
  In {\em 5th Workshop on Languages and Compilers for Parallel
    Computing}, Yale University, August 1992.

Lee-Chung Lu.
  A unified framework for systematic loop transformations.
  In {\em Proceedings of Third ACM SIGPLAN Symp. on the Principles \&
    Practice of Parallel Programming}, April 1991.

William Pugh.
  Uniform techniques for loop optimization.
  In {\em 1991 International Conference on Supercomputing}, pages
    341--352, Cologne, Germany, June 1991.

J. Ramanujam.
  Non-unimodular transformations of nested loops.
  In {\em Supercomputing `92}, November 1992.

Vivek Sarkar and Radhika Thekkath.
  A general framework for iteration-reordering loop transformations.
  In {\em ACM SIGPLAN'92 Conference on Programming Language Design and
    Implementation}, San Francisco, California, Jun 1992.

Michael E. Wolf and Monica S. Lam.
  A data locality optimizing algorithm.
  In {\em ACM SIGPLAN'91 Conference on Programming Language Design and
    Implementation}, 1991.

Michael E. Wolf and Monica S. Lam.
  A loop transformation theory and an algorithm to maximize
    parallelism.
  In {\em IEEE Transactions on Parallel and Distributed Systems}, July
    1991.

------------------------------------------------------------------------------

From: wak@cs.UMD.EDU (Wayne Kelly)

@inproceedings{Ban90,
                author = "U. Banerjee",
                title = "Unimodular Transformations of Double Loops",
                booktitle = "Proc. of the 3rd Workshop on Programming Languages and
                                Compilers for Parallel Computing",
                month = aug,
                year = 1990,
                address = "Irvine, CA" }

@INPROCEEDINGS{WL91,
author = {Michael E. Wolf and Monica S. Lam},
title = {A Data Locality Optimizing Algorithm},
booktitle = {ACM SIGPLAN'91 Conference on Programming Language
Design and Implementation},
year = 1991
}

@INPROCEEDINGS{WL91b,
author = {Michael E. Wolf and Monica S. Lam},
title = {A loop transformation theory and an algorithm to maximize
parallelism},
booktitle = {IEEE Transactions on Parallel and Distributed Systems},
month = {July},
year = 1991
}

Unimodular transformations is a unified framework that is able to describe
any transformation that can be obtained by compositions of loop
interchange, loop skewing, and loop reversal. Unfortunately, unimodular
transformations are limited by two facts: they can only be applied to
perfectly nested loops, and all statements in the loop nest are
transformed in the same way. They can therefore not represent some
important transformations such as loop fusion, loop distribution and
statement reordering.

We have developed a framework that generalizes unimodular transformations.
Our framework can represent a much broader set of reordering
transformations, including any transformation that can be obtained from
some combination of: loop interchange, loop reversal, loop skewing,
statement reordering, loop distribution, loop fusion, loop scaling, loop
alignment, index set splitting, loop blocking, loop interleaving, loop
coalescing. I would be happy to send you a copy of our paper if you are
interested.

---------------------------------------------------------------------------
From: wei@cs.cornell.EDU (Wei Li)

We have a matrix-oriented approach to loop transformations that uses
non-singular matrices to represent loop transformations. Non-singular
matrices generalize the unimodular approach (unimodular matrices are a
special case of non-singular matrices in which the determinant is 1 or
-1). Some important transformations such as loop tiling can only be
modeled by non-singular matrices. Furthermore, we provide a completion
algorithm that makes the theory easier to use in practice. In
transformations for parallelism and data locality, it is very useful to
have such completion algorithm. Our work was presented at the 5th
Compiler Workshop at Yale last year. A journal version is to appear soon
in IJPP.

We have used the non-singular matrix framework to develop optimizations
for data locality in our compiler for NUMA parallel machines. You can
find how the transformation matrix is constructed automatically. The
algorithms are in the paper that appeared in ASPLOS V (ACM SIGPLAN
Notices, Vol 27, Number 9, Sep. 1992).

The papers can also be found via ftp from Cornell (ftp.cs.cornell.edu,
pub/TyphoonCompiler/papers-ps/).

file: framework.ps

                "A Singular Loop Transformation Framework
                  Based on Non-singular Matrices"
                      by Wei Li and Keshav Pingali

file: asplos92.ps

                "Access Normalization: Loop Restructuring for NUMA Compilers"
                      by Wei Li and Keshav Pingali

file: pnuma.ps

                "Loop Transformations for NUMA Machines"
                      by Wei Li and Keshav Pingali
                      SIGPLAN Notices, January 1993

-- Wei Li
Department of Computer Science
Cornell University
Ithaca, NY 14853
---------------------------------------------------------------------------
From: mehrotra@csrd.uiuc.edu (Sharad Mehrotra)

Those are excellent pointers, but unimodular transformations are only one
aspect of the larger problem of automatic program parallelization. If you
are just getting started in the area, you might find the following CSRD
report (also to appear soon in the Proceedings of the IEEE) useful:

Banerjee, U., Eigenman, R., Nicolau, A., and Padua, D.,
"Automatic Program Parallelization", CSRD TR 1250, November 1992.

The report contains a timely survey of the field and a bibliography with
161 citations.

Many CSRD Tech Reports are available for anonymous ftp from host
sp2.csrd.uiuc.edu (128.174.153.4) in directory CSRD_Info/reports. We'll
try and arrange to put the PostScript for this report there soon. If it's
not available in a few days, email reinhart@csrd.uiuc.edu, and ask for a
paper copy by snail mail.

---------------------------------------------------------------------------
From: dsehr@gomez.intel.com (David Sehr)

A colleague here at Intel has been working on exactly that problem for
several years. His name is Utpal Banerjee, and he has recently published
a book describing the approach you mention (using unimodular matrices for
dependence testing and transformation). The full info:
Loop Transformations for Restructuring Compilers: the Foundations
Utpal Banerjee
Kluwer Academic Publishers
ISBN: 0-7923-9318-X

Other possibilities to pursue are the recent papers of William Pugh of the
University of Maryland (pugh@cs.umd.edu), and Paul Feautrier of the
University of P. et M. Curie (feautrier@masi.ibp.fr).

David C. Sehr, Intel Corporation
2200 Mission College Blvd., M/S RN6-18
Santa Clara, CA 95052-8119
---------------------------------------------------------------------------
From: paik@mlo.dec.com (Samuel S. Paik)

Access Normalization: Loop Restructuring for NUMA Compilers. Wei Li,
Keshav Pingali. Proceedings of the Fifth International Conference on
Architectural Support for Programming Languages and Operating Systems,
SIGPLAN Notices, Vol. 27, No. 9, Sept 1992, pp. 285-295. ACM.

    Generalizes Banerjee's work on unimodular matrices for modeling
    loop transformations to invertable matrices, and applies this to
    restructuring loops for NUMA multiprocessors. Also available as a
    Cornell University CS technical report.

---------------------------------------------------------------------------
From: Francois IRIGOIN <irigoin@cri.ensmp.fr>

Some loop transformations are equivalent to a change of basis. To preserve
the number of iterations, the change of basis matrix has to be unimodular.
This idea has been around implictly for at least 6 years. Utpal Banerjee
has a paper on the subject. He's also written a book. Perhaps, you should
get in touch with him: <banerjee@csrd.uiuc.edu>.

Non-unimodular matrices are useful for tiling transformations.

Some loop transformations cannot be put in this framework: loop
distribution and loop alignment are good examples.

I'd like to know why you are interested in this subject. We've been
active in this field for many years but did not really manage to get in
touch with people in Germany, except the SUPERB team, Hans Zima/Michael
Gerndt.

Also, there is a Workshop in June in Germany (Dagsthul) about scheduling.
Simple schedules lead to matrix transformations, complex ones cover the
other loop transformation.

Francois Irigoin tel. +33 1 64 69 48 48
Centre de Recherche en Informatique fax. +33 1 64 69 47 09
Ecole des Mines de Paris e-mail: irigoin@cri.ensmp.fr
35 rue Saint Honore irigoin@fremp11.bitnet
77300 FONTAINEBLEAU
FRANCE
------------------------------------------------------------------------------
From: jrbd@craycos.com (James Davies)

I have a paper here for ASPLOS V (1992), published by ACM, which seems
related; it's called "Access Normalization: Loop restructuring for NUMA
Compilers", by Wei Li and Keshav Pingali of Cornell University, and
discusses using invertible matrices to model loop transformations. They
refer to a paper by Utpal Banerjee in Proceedings of the Workshop on
Advances in Languages and Compilers for Parallel Processing, August 1990,
called "Unimodular Transformations of Double Loops", which is supposed to
have also used matrices to model loop transforms. (I don't know who
published this workshop, all I have is the reference in the Li-Pingali
paper.)

------------------------------------------------------------------------------
From: vadik@cs.UMD.EDU (Vadim Maslov)

Ask Dr. Pugh (pugh@cs.umd.edu) and/or Mr. Kelly (wak@cs.umd.edu) from
University of Maryland. They have a theory that goes beyond unimodular
transformations. There are a couple of papers on it which are
electronically available.

------------------------------------------------------------------------------
From: Joe Hummel <jhummel@esp.ICS.UCI.EDU>

Actually, I think it was Utpal Banerjee with his work on unimodular
transformations. His new book, which is just being published, should have
lots of info on this. Utpal also has a paper, I think it appeared in the
last year or two in the IEEE Trans on Parallel and Distributed Computing.

------------------------------------------------------------------------------
From: Lode Nachtergaele <nachterg@imec.be>

M.E. Wolfe, M. Lam, "A loop transformation theory and an algorithm to
maximize parallelism", IEEE transactions on parallel and distributed
systems, Vol.2, October 1991

Look also to the work going on at the university of Maryland. The papers
and reports can be ftp'ed from : ftp.cs.umd.edu

Lode Nachtergaele
IMEC V.Z.W.
Kapeldreef 75
3001 Heverlee
Belgium
Phone : +32 (0)16 28.15.12
E-mail: nachterg@imec.be

------------------------------------------------------------------------------
Yep. I found it in ACM SIGPLAN'91 Conference: p.30-45,
authors Michael E. Wolf (not Wolfe) and Monica S. Lam (Stanford Uni.)
--
Jan Jongejan email: jjan@cs.rug.nl
Dept. Comp.Sci.,
Univ. of Groningen,
Netherlands.
--
Uwe Assmann
GMD Forschungsstelle an der Universitaet Karlsruhe
Vincenz-Priessnitz-Str. 1
7500 Karlsruhe GERMANY
Email: assmann@karlsruhe.gmd.de Tel: 0721/662255 Fax: 0721/6622968
--

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

SUMMARY: Loop transformations with unimodular matrices

assmann@karlsruhe.gmd.de (Uwe Assmann)Tue, 6 Apr 1993 10:48:59 GMT

assmann@karlsruhe.gmd.de (Uwe Assmann)
Tue, 6 Apr 1993 10:48:59 GMT