High Performance Compilers Summer Courses

mwolfe@dat.cse.ogi.edu (Michael Wolfe)
Thu, 1 Apr 1993 17:32:08 GMT

          From comp.compilers

Related articles
High Performance Compilers Summer Courses mwolfe@dat.cse.ogi.edu (1993-04-01)
| List of all articles for this month |

Newsgroups: comp.compilers
From: mwolfe@dat.cse.ogi.edu (Michael Wolfe)
Keywords: courses
Organization: Oregon Graduate Institute - Computer Science & Engineering
Date: Thu, 1 Apr 1993 17:32:08 GMT

Once again, the Oregon Graduate Institute is offering


Summer Intensive Workshops on High Performance Compilers


Analysis and Optimization for Modern Architectures
Monday-Friday, August 16-20, 1993


Advanced Analysis and Optimizing for Parallelism
Monday-Friday, August 23-27, 1993


with Michael Wolfe
Hotel Vintage Plaza
Portland, Oregon


The past two years' courses have been well attended and received. This
year's offerings build on previous courses, but are updated and expanded.


The full details of these courses are available electronically
via anonymous ftp at:


% ftp cse.ogi.edu [or ftp 129.95.40.2]
Name: anonymous
Password: myname@where.am.i
ftp> cd pub
ftp> cd HPC
ftp> get brochure
ftp> quit
%


Or send email to
mwolfe@cse.ogi.edu (for technical inquiries)
lpease@admin.ogi.edu (for registration of a printed brochure)


What follows is the (lengthy) course outline for the two courses.


==================================================================


ANALYSIS AND OPTIMIZATION FOR MODERN ARCHITECTURES
Monday-Friday, August 16-20, 1993


Course Outline:


MONDAY:


1. Pipelined Processor Architecture
. Architectural Options
    - pipelined functional units
    - large register file (usually)
    - vector instruction set
    - cache memory
. Control Unit Options
    - register windows
    - VLIW/super-scalar control unit
    - register renaming
    - multithreading
. Representative Architectures
    Multiflow Trace/300, Cydrome Cydra 5, IBM RS/6000, Intel i860,
    Cray C90, Digital Alpha, Hewlett Packard PA-RISC


2. Compiler Framework
    - front end
    - high-level optimizations
    - back-end optimizations
    - code generation
    - peephole optimizations
    - basic blocks, control flow graph
. Basic Data Structures
    - tuple, list, tree, graph


3. Data Flow Analysis
. Dataflow Problems
    - live variables
    - reaching definitions
    - dominators
. Applications
    - dead code elimination
    - use-def chains, constant propagation
    - loop discovery
. Dataflow Framework
    - lattice framework
    - iterative algorithm
. Complications
    - aliasing (pointers, reference formals)
    - volatile variables
. Other Dataflow Solution Methods
    - syntax-based elimination methods
    - interval analysis
    - slotwise analysis


4. Dominator Analysis
    - dominator trees
    - more details on loop discovery
    - unreachable code elimination


TUESDAY:


5. Machine Independent Optimizations
    - constant propagation
    - constant folding
    - copy propagation
    - constant conditional elimination
    - common subexpression elimination


6. Loop Optimizations
    - code floating
    - strength reduction (induction variables)
    - linear test replacement
    - partial redundancy elimination


7. Procedure Optimizations
    - leaf procedures
    - tail recursion, tail calls


8. Improving Optimizations
    - loop rotation
    - code replication (tracing)
    - procedure integration


WEDNESDAY:


9. Local Register Allocation
    - minimizing register utilization


10. Register Allocation via Coloring
    - spill heuristics
    - coloring algorithms
    - allocating register pairs
    - register coalescing
    - splitting live ranges


11. Other Register Allocation Heuristics
    - hierarchical allocation
    - vector register allocation
    - register allocation in loops


THURSDAY:


12. Instruction Scheduling
    - basic block scheduling
    - extended basic block formation
    - filling delay slots


13. Scheduling in Loops
    - software pipelining
    - polycyclic scheduling
    - loop unrolling, peeling
    - register assignment in loops
    - vector instruction scheduling, chaining


FRIDAY:


14. Peephole Optimizations
    - tail merging
    - jump optimizations
    - optimizing for branch prediction
    - instruction placement


15. Interacting with Debuggers
    - currency of values in registers
    - currency of values in memory
    - reporting values of variables
    - reporting position


16. Instruction Selection
    - hand-written code generators
    - table-driven code generators


17. Engineering a Real Compiler
    - software engineering
    - time to market
    - managing data structures
    - interactive data structure browser
    - graphical display of data structures
    - 'source-level' debugging


==================================================================


ADVANCED ANALYSIS AND OPTIMIZING FOR PARALLELISM
Monday-Friday, August 23-27, 1993


Course Outline:


MONDAY:


1. Parallel Computer Architecture
. Architectural Options
    - vector instruction sets
    - multiple CPUs sharing memory
    - multiple operations per instruction
    - super-scalar control unit
    - super-pipelined data unit
    - cache memory organization
    - cache coherence mechanisms
    - massively parallel SIMD and MIMD
    - message passing systems
    - latency tolerating systems
    - locality-based systems
. Current Examples
    Intel i860, IBM RS/6000, Multiflow Trace/300, Cray C-90, Alliant FX/80,
    Intel iPSC/860, Thinking Machines CM-2 and CM-5, MasPar MP-2


2. Compiler Framework
. Compiler Structure
    - front end
    - high-level optimizations
    - back-end optimizations
    - code generation
    - peephole optimizations
. Internal Data Structures
    - basic blocks, control flow graph


3. Basic Data Structures and Concepts
    - lists, trees, graphs
    - formal introduction into graphs
    - graph algorithms (traversal, spanning trees, cycles)


4. Dataflow Problems in a Lattice Context
    - monotone dataflow framework
    - reaching definitions, live variables
    - iterative solution


5. Control Flow Analysis
    - dominators
    - control dependence
    - identifying loops


6. Static Single Assignment (Factored Use-Def Chains)
    - conversion to static single assignment
    - conditional constant propagation
    - induction variable identification
    - comparison to classical methods
    - handling parallel language extensions


7. Sparse Dataflow Evaluation Graphs
    - constructing the sparse graph


TUESDAY:


8. Data Dependence Analysis Techniques
. Introduction
    - types: flow, anti, output dependence
    - abstractions: distance, direction vectors
. Subscript Analysis
    - formulating a dependence equation
    - handling symbolic variables
. Solving the Dependence Equation
    - single variable exact test
    - Banerjee's inequalities,
    - Stanford sieve
    - Lambda, Delta, Omega, Power tests
. Complications
    - I/O dependence
    - aliasing, EQUIVALENCE, COMMON
    - structure aliasing
. Beyond Classical Methods
    - array kill analysis
    - pointers, dynamic aliasing
    - argument aliasing
. Program Dependence Graph


WEDNESDAY:


9. Non-Loop Parallelism
. Hierarchical Task Graph


10. Program Restructuring for Shared Memory
    - parallelization
    - scheduling parallel code
    - induction variables, temporaries, etc.
    - distribution
    - alignment
    - index set splitting
    - synchronization
    - interchanging
    - optimizing for memory locality
    - linear index set transformations
    - privatization
    - performance modeling


11. Restructuring for Vector Computers
    - vectorization
    - scalar expansion
    - reductions
    - strip mining


12. Restructuring for SIMD Computers
    - strip mining, combing
    - scalarization, fusion


13. Restructuring for Scalar Computers
    - tiling for cache locality
    - stride sensitivity
    - handling nontightly nested loops


14. Other Restructuring Transformation
    - loop fusion
    - loop rotation
    - handling nontightly nested loops


THURSDAY:


15. Interprocedural Analysis
. Summarizing the Effects of Procedures
    - flow insensitive USE and MOD information
    - summarizing USE and MOD of arrays
. Interprocedural Constant Propagation
. Interprocedural Alias Analysis


16. High Performance Fortran Language
. Parallelism
    - array assignment, FORALL
. Data Distribution
    - Align, Distribute, Template
. Expected Behavior
. Complications
    - pointer assignments
    - dynamic realignment
    - inherited alignments


17. Local HPF Analysis
. Conformance
. Alignment Analysis


18. HPF Code Generation
. Generate a Node Program
    - local vs. global index sets and indices
    - local data allocation
. Communication
    - where to place communication
    - communication vectorization


FRIDAY:


19. Optimizing HPF Node Programs
. Index Set Calculation
    - recovering induction variables
    - finite state machine analysis
    - guard regions
    - overlapping communication


20. Interprocedural Analysis for HPF
. Reaching Distributions
    - procedure cloning
. Guard Region Analysis


21. Other Parallel Languages
. Dataparallel C
    - front-end/back-end model
    - communication classification
. SISAL
    - sequentialize to reduce data copying
    - dynamic task scheduling
. Crystal
    - automatic data alignment
    - parallel code generation
    - elimination of temporal domain storage


22. Current Research
. Extensions to SSA model
    - better handling of aliases
. Data Restructuring
    - sparse array implementation
    - influencing sparse array representation
    - compression of sparse index sets
. Low Level Extensions to HPF
    - MetaMP extensions
    - interface to HPF
    - optimize MetaMP portion of program
. C++ Optimizations
    - achieve performance of Fortran from C++
    - expose C++ methods to optimizations


23. Engineering a Real Compiler
    - software engineering
    - time to market
    - managing data structures
    - interactive data structure browser
    - graphical display of data structures
    - 'source-level' debugging
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.