Wed, 7 Apr 1993 14:31:42 GMT

Related articles |
---|

Graphs generated by predicates assmann@karlsruhe.gmd.de (1993-04-05) |

papers about loop transformations nachterg@imec.be (1993-04-07) |

Newsgroups: | comp.compilers |

From: | nachterg@imec.be (Lode Nachtergaele) |

Keywords: | optimize, bibliography, theory |

Organization: | IMEC vzw Leuven |

References: | 93-04-020 |

Date: | Wed, 7 Apr 1993 14:31:42 GMT |

Hello world,

Uwe Assman asked for the precise pointer to the paper of M.E. Wolf, but

people mailed their reference list about this subject. So here is our list

of references.

1. Loop data flow analysis.

a. group P.Featrier (Univ. P.et M.Curie, Paris): paper in International

Journal of Parallel Programming Vol. 20 No. 1, Feb. 1991 : "Dataflow

Analysis of Array and Scalar References" This handles formal data flow

analysis for sets of nested loops with manifest affine index expressions

including manifest conditions. Equations are set up to calculate

dependencies which are solved with their PIP (Parametric Integer

Programming) package. The result could be written out in an applicative

form (but it is currently not).

b. group Pugh (Univ. of Maryland): papers in Supercomputing'91 and

Sigplan'92. This also handles formal data flow analysis for sets of

nested loops with manifest affine index expressions but it can include

also user specified assertions on non-manifest conditions which make it

more general (though not general enough for all our applications).

Equations are set up to check dependencies which are passed to their

"omega"-test based on Fourier-Motzkin elimination. The result can be

written out in an (more or less) applicative form.

c. group Monica Lam (Stanford University): thesis of Dror Eliezer Maydan,

"Accurate analysis of array references", CSL-TR-92-547 (STAN-CS-92-1449),

september 1992

In addition there are a few older refs worth mentioning:

d. J.Bu (Delft): papers ISCAS/ICASSP'88, thesis 90 Provides data flow

analysis on restricted set of nested loops.

e. S. Rajopadhye (University of Oregon): paper in IMEC Workschop on Formal

Methods, November 89 "Algebraic transformations in Systolic Arrays

synthesis: A case study" Provides some algebraic transformations on Affine

Recurrence Equations to change dependencies. The transformations rely on

algebraic characteristics of the operators.

2. Analysis techniques to check whether (a) particular class(es) of loop

trafos can be applied. No full data flow analysis happens in this case.

a. U. Banerjee (?): paper '89 in Proc. 2nd Workshop Languages Compilers

Parallel Computing : "A theory of loop permutation" Here, sets of nested

URE loops are checked for a potential permutation. This can also be

solved if data flow analysis is performed first to derive a single

assignment form.

b. D.Padua, M. J. Wolfe (University of Illinois): papers Proc ACM'86 -

Supercomputing'90. This provides a survey of loop trafo analysis

techniques (under restrictions) + a way to check loop interchanging for

manifest affine index expressions. Also the decomposition of uni-modular

loop trafos into skew/reversal/inter- change is proposed here (though not

proven).

c. Randy Allen, Ken Kennedy (Rice University, Houston) : paper in IEEE

Transactions on computers, Vol. 41, No. 10, october 1992, "Vector Register

Allocation" Interesting survey paper on register allocation techniques

which includes analysis on when loop trafo can be applied.

3. Extraction of parallelism in presence of nested loops:

a. Polychronopoulos (University of Illinois): paper in IEEE Transactions

on Computers, Vol. 37, No. 8, August 1988 "Compiler Optimizations for

Enhancing Parallelism and their impact on architecture design" Survey of

loop trafo classes with some analysis on how to extract parallelism but

not really automated.

b. M.J. Wolfe (Oregon Graduate Institute of Science and Technology): paper

The Journal of Super Computing 4,321-344, 1990 : "Data dependence and

Program Restructuring" Maximal parallelism is found for sets of UREs with

lexicographically ordered index expressions (quite restricted).

c. Pugh-Wonnacott (Univ. of Maryland): report UMIACS-TR-92-126

(CS-TR-2994) They find maximal parallelism for sets of nested loops with

manifest affine index expressions but including the user specified

assertions on non-manifest conditions.

d. Shang-Fortes (?): paper in Algorithms and Parallel VLSI Architectures

II, P.Quiton and Y.Robert (eds.), 1992 Also here parallelism is detected

for sets of nested loops with manifest affine index expressions.

4. Methods to perform piece-wise linear scheduling.

a. M.E. Wolf, M. Lam (Stanford University) : paper IEEE Transactions on

parallel and distributed systems, Vol.2, No.4, October 1991 "A loop

transformation theory and an algorithm to maximize parallelism" Very good

paper on unimodular transformation of a loop structure. An general

algorithm to generate the loop structure after transformation is proposed.

b. L.Thiele (Univ. Saarland), "On the design of piecewise regular

processor arrays", ISCAS'89, pp. 2239-2242. Original work on this topic,

but not really automated.

c. Alain Darte, Tanguy Risset, Yves Robert, "Loop nest scheduling and

transformations", to appear in Environments and tools for Parallel

Scientific Computing, J.J. Dongarra and B.Tourancheau eds, North Holland,

1993

d. Leslie Lamport : "The parallel execution of do loops" Communication of

the ACM, 17(2):83-93, February 1974

e. W.Kelly, W.Pugh (Univ. of Maryland) : report in ftp.cs.umd.edu

UMIACS-TR-92-126 (CS-TR-2995) November 1992 "Generating Schedules and Code

within a Unified Reordering Transformation Framework" Describes an

algorithm to compute transformations to obtain maximum parallellism. Gives

a method to genererate code after transformation.

f. C.-H. Huang, P. Sadayappan, "Communication-Free Hyperplane Partitioning

of Nested Loops", Languages and Compilers for parallel Computing, Fourth

International Workshop, Santa Clara, California, August 1991

g. M. Neeracher, R.Ruhl, "Automatic Parallelization of LINPACK Routines on

distributed Memory of Parallel processors", Proceedings 7th International

Parallel Programming Symposium, April 1993

5. Singular Loop transformations papers ftp'ed from

ftp.cs.cornell.edu:pub/TyphoonCompiler/papers-ps/ Department of Computer

Science, University of Cornell a. Wei Li, Keshav Pingali, "A singular loop

transformation framework based on non-singular matrices", Proceedings of

the Fifth Annual Workshop on Language and Compilers for Parallelism, New

Haven, August, 1992

b. Wei Li, Keshav Pingali, "Access Normalization : Loop restructuring for

NUMA Compilers", ACM SIGPLAN Notices, Vol 27, Number 9, Sep. 1992

c. Wei Li, Keshav Pingali, "Loop transformation for NUMA Machines", to

appear in SIGPLAN Notices, 1993

d. R. Johnson, Wei Li, Keshav Pingali, "An executable representation of

distance and direction", Languages and Compilers for parallel Computing,

Fourth International Workshop, Santa Clara, California, August 1991

6. The work in the following three references is related to automated

control flow optimization for DSP memory management. In the papers a

method is presented to automatically generate a sequence of unimodular

transformations in order to optimize memory needs.

Group of F.V.M. Catthoor (IMEC, Belgium) :

M.F.X.B. van Swaaij, F.H.M. Franssen, F.V.M. Catthoor, H.J. De Man.

"Modeling data flow and control flow for high level memory

management", European Design Automation Conference, pp. ,1992.

M.F.X.B. van Swaaij, F.H.M. Franssen, F.V.M. Catthoor, H.J. De Man.

"Modeling data flow and control flow for DSP system synthesis",

VLSI Design Methodologies for DSP Systems, M. Bayoumi editor,

Kluwer, 1993.

M.F.X.B. van Swaaij, F.H.M. Franssen, F.V.M. Catthoor, H.J. De Man.

"Automating high level control flow transformations for

DSP memory management",

Proceedings of the IEEE Workshop on VLSI signal processing, 1992.

=============================================================================

Papers we are still looking for :

a. C.Ancourt, F.Irigoin, "Scanning polyhedra with DO loops", Third ACM

symposium on Principles and Practice of Parallel Programming, p.39-50,

April 1991

b. L.Lu, "A Unified framework for systematic loop transformations", Third

ACM Symposium on Principles and Practice of parallel Programming, p.

28-38, April 1991

Frank Franssen

IMEC Laboratory,

Kapeldreef 75,

B-3001 Leuven,

Belgium

Email: franssen@imec

tel: ++32-16-281512

fax: ++32-16-281515

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.