papers about loop transformations (Lode Nachtergaele)
Wed, 7 Apr 1993 14:31:42 GMT

          From comp.compilers

Related articles
Graphs generated by predicates (1993-04-05)
papers about loop transformations (1993-04-07)
| List of all articles for this month |

Newsgroups: comp.compilers
From: (Lode Nachtergaele)
Keywords: optimize, bibliography, theory
Organization: IMEC vzw Leuven
References: 93-04-020
Date: Wed, 7 Apr 1993 14:31:42 GMT

Hello world,

Uwe Assman asked for the precise pointer to the paper of M.E. Wolf, but
people mailed their reference list about this subject. So here is our list
of references.

1. Loop data flow analysis.

a. group P.Featrier (Univ. M.Curie, Paris): paper in International
Journal of Parallel Programming Vol. 20 No. 1, Feb. 1991 : "Dataflow
Analysis of Array and Scalar References" This handles formal data flow
analysis for sets of nested loops with manifest affine index expressions
including manifest conditions. Equations are set up to calculate
dependencies which are solved with their PIP (Parametric Integer
Programming) package. The result could be written out in an applicative
form (but it is currently not).

b. group Pugh (Univ. of Maryland): papers in Supercomputing'91 and
Sigplan'92. This also handles formal data flow analysis for sets of
nested loops with manifest affine index expressions but it can include
also user specified assertions on non-manifest conditions which make it
more general (though not general enough for all our applications).
Equations are set up to check dependencies which are passed to their
"omega"-test based on Fourier-Motzkin elimination. The result can be
written out in an (more or less) applicative form.

c. group Monica Lam (Stanford University): thesis of Dror Eliezer Maydan,
"Accurate analysis of array references", CSL-TR-92-547 (STAN-CS-92-1449),
september 1992

In addition there are a few older refs worth mentioning:

d. J.Bu (Delft): papers ISCAS/ICASSP'88, thesis 90 Provides data flow
analysis on restricted set of nested loops.

e. S. Rajopadhye (University of Oregon): paper in IMEC Workschop on Formal
Methods, November 89 "Algebraic transformations in Systolic Arrays
synthesis: A case study" Provides some algebraic transformations on Affine
Recurrence Equations to change dependencies. The transformations rely on
algebraic characteristics of the operators.

2. Analysis techniques to check whether (a) particular class(es) of loop
trafos can be applied. No full data flow analysis happens in this case.

a. U. Banerjee (?): paper '89 in Proc. 2nd Workshop Languages Compilers
Parallel Computing : "A theory of loop permutation" Here, sets of nested
URE loops are checked for a potential permutation. This can also be
solved if data flow analysis is performed first to derive a single
assignment form.

b. D.Padua, M. J. Wolfe (University of Illinois): papers Proc ACM'86 -
Supercomputing'90. This provides a survey of loop trafo analysis
techniques (under restrictions) + a way to check loop interchanging for
manifest affine index expressions. Also the decomposition of uni-modular
loop trafos into skew/reversal/inter- change is proposed here (though not

c. Randy Allen, Ken Kennedy (Rice University, Houston) : paper in IEEE
Transactions on computers, Vol. 41, No. 10, october 1992, "Vector Register
Allocation" Interesting survey paper on register allocation techniques
which includes analysis on when loop trafo can be applied.

3. Extraction of parallelism in presence of nested loops:

a. Polychronopoulos (University of Illinois): paper in IEEE Transactions
on Computers, Vol. 37, No. 8, August 1988 "Compiler Optimizations for
Enhancing Parallelism and their impact on architecture design" Survey of
loop trafo classes with some analysis on how to extract parallelism but
not really automated.

b. M.J. Wolfe (Oregon Graduate Institute of Science and Technology): paper
The Journal of Super Computing 4,321-344, 1990 : "Data dependence and
Program Restructuring" Maximal parallelism is found for sets of UREs with
lexicographically ordered index expressions (quite restricted).

c. Pugh-Wonnacott (Univ. of Maryland): report UMIACS-TR-92-126
(CS-TR-2994) They find maximal parallelism for sets of nested loops with
manifest affine index expressions but including the user specified
assertions on non-manifest conditions.

d. Shang-Fortes (?): paper in Algorithms and Parallel VLSI Architectures
II, P.Quiton and Y.Robert (eds.), 1992 Also here parallelism is detected
for sets of nested loops with manifest affine index expressions.

4. Methods to perform piece-wise linear scheduling.

a. M.E. Wolf, M. Lam (Stanford University) : paper IEEE Transactions on
parallel and distributed systems, Vol.2, No.4, October 1991 "A loop
transformation theory and an algorithm to maximize parallelism" Very good
paper on unimodular transformation of a loop structure. An general
algorithm to generate the loop structure after transformation is proposed.

b. L.Thiele (Univ. Saarland), "On the design of piecewise regular
processor arrays", ISCAS'89, pp. 2239-2242. Original work on this topic,
but not really automated.

c. Alain Darte, Tanguy Risset, Yves Robert, "Loop nest scheduling and
transformations", to appear in Environments and tools for Parallel
Scientific Computing, J.J. Dongarra and B.Tourancheau eds, North Holland,

d. Leslie Lamport : "The parallel execution of do loops" Communication of
the ACM, 17(2):83-93, February 1974

e. W.Kelly, W.Pugh (Univ. of Maryland) : report in
UMIACS-TR-92-126 (CS-TR-2995) November 1992 "Generating Schedules and Code
within a Unified Reordering Transformation Framework" Describes an
algorithm to compute transformations to obtain maximum parallellism. Gives
a method to genererate code after transformation.

f. C.-H. Huang, P. Sadayappan, "Communication-Free Hyperplane Partitioning
of Nested Loops", Languages and Compilers for parallel Computing, Fourth
International Workshop, Santa Clara, California, August 1991

g. M. Neeracher, R.Ruhl, "Automatic Parallelization of LINPACK Routines on
distributed Memory of Parallel processors", Proceedings 7th International
Parallel Programming Symposium, April 1993

5. Singular Loop transformations papers ftp'ed from Department of Computer
Science, University of Cornell a. Wei Li, Keshav Pingali, "A singular loop
transformation framework based on non-singular matrices", Proceedings of
the Fifth Annual Workshop on Language and Compilers for Parallelism, New
Haven, August, 1992

b. Wei Li, Keshav Pingali, "Access Normalization : Loop restructuring for
NUMA Compilers", ACM SIGPLAN Notices, Vol 27, Number 9, Sep. 1992

c. Wei Li, Keshav Pingali, "Loop transformation for NUMA Machines", to
appear in SIGPLAN Notices, 1993

d. R. Johnson, Wei Li, Keshav Pingali, "An executable representation of
distance and direction", Languages and Compilers for parallel Computing,
Fourth International Workshop, Santa Clara, California, August 1991

6. The work in the following three references is related to automated
control flow optimization for DSP memory management. In the papers a
method is presented to automatically generate a sequence of unimodular
transformations in order to optimize memory needs.

Group of F.V.M. Catthoor (IMEC, Belgium) :

                M.F.X.B. van Swaaij, F.H.M. Franssen, F.V.M. Catthoor, H.J. De Man.
                "Modeling data flow and control flow for high level memory
                  management", European Design Automation Conference, pp. ,1992.

                M.F.X.B. van Swaaij, F.H.M. Franssen, F.V.M. Catthoor, H.J. De Man.
                "Modeling data flow and control flow for DSP system synthesis",
VLSI Design Methodologies for DSP Systems, M. Bayoumi editor,
Kluwer, 1993.

                M.F.X.B. van Swaaij, F.H.M. Franssen, F.V.M. Catthoor, H.J. De Man.
                "Automating high level control flow transformations for
                  DSP memory management",
                Proceedings of the IEEE Workshop on VLSI signal processing, 1992.

Papers we are still looking for :

a. C.Ancourt, F.Irigoin, "Scanning polyhedra with DO loops", Third ACM
symposium on Principles and Practice of Parallel Programming, p.39-50,
April 1991

b. L.Lu, "A Unified framework for systematic loop transformations", Third
ACM Symposium on Principles and Practice of parallel Programming, p.
28-38, April 1991

Frank Franssen
IMEC Laboratory,
Kapeldreef 75,
B-3001 Leuven,

Email: franssen@imec
tel: ++32-16-281512
fax: ++32-16-281515


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.