Related articles |
---|
High Performance Compilers Summer Courses mwolfe@dat.cse.ogi.edu (1993-04-01) |
Newsgroups: | comp.compilers |
From: | mwolfe@dat.cse.ogi.edu (Michael Wolfe) |
Keywords: | courses |
Organization: | Oregon Graduate Institute - Computer Science & Engineering |
Date: | Thu, 1 Apr 1993 17:32:08 GMT |
Once again, the Oregon Graduate Institute is offering
Summer Intensive Workshops on High Performance Compilers
Analysis and Optimization for Modern Architectures
Monday-Friday, August 16-20, 1993
Advanced Analysis and Optimizing for Parallelism
Monday-Friday, August 23-27, 1993
with Michael Wolfe
Hotel Vintage Plaza
Portland, Oregon
The past two years' courses have been well attended and received. This
year's offerings build on previous courses, but are updated and expanded.
The full details of these courses are available electronically
via anonymous ftp at:
% ftp cse.ogi.edu [or ftp 129.95.40.2]
Name: anonymous
Password: myname@where.am.i
ftp> cd pub
ftp> cd HPC
ftp> get brochure
ftp> quit
%
Or send email to
mwolfe@cse.ogi.edu (for technical inquiries)
lpease@admin.ogi.edu (for registration of a printed brochure)
What follows is the (lengthy) course outline for the two courses.
==================================================================
ANALYSIS AND OPTIMIZATION FOR MODERN ARCHITECTURES
Monday-Friday, August 16-20, 1993
Course Outline:
MONDAY:
1. Pipelined Processor Architecture
. Architectural Options
- pipelined functional units
- large register file (usually)
- vector instruction set
- cache memory
. Control Unit Options
- register windows
- VLIW/super-scalar control unit
- register renaming
- multithreading
. Representative Architectures
Multiflow Trace/300, Cydrome Cydra 5, IBM RS/6000, Intel i860,
Cray C90, Digital Alpha, Hewlett Packard PA-RISC
2. Compiler Framework
- front end
- high-level optimizations
- back-end optimizations
- code generation
- peephole optimizations
- basic blocks, control flow graph
. Basic Data Structures
- tuple, list, tree, graph
3. Data Flow Analysis
. Dataflow Problems
- live variables
- reaching definitions
- dominators
. Applications
- dead code elimination
- use-def chains, constant propagation
- loop discovery
. Dataflow Framework
- lattice framework
- iterative algorithm
. Complications
- aliasing (pointers, reference formals)
- volatile variables
. Other Dataflow Solution Methods
- syntax-based elimination methods
- interval analysis
- slotwise analysis
4. Dominator Analysis
- dominator trees
- more details on loop discovery
- unreachable code elimination
TUESDAY:
5. Machine Independent Optimizations
- constant propagation
- constant folding
- copy propagation
- constant conditional elimination
- common subexpression elimination
6. Loop Optimizations
- code floating
- strength reduction (induction variables)
- linear test replacement
- partial redundancy elimination
7. Procedure Optimizations
- leaf procedures
- tail recursion, tail calls
8. Improving Optimizations
- loop rotation
- code replication (tracing)
- procedure integration
WEDNESDAY:
9. Local Register Allocation
- minimizing register utilization
10. Register Allocation via Coloring
- spill heuristics
- coloring algorithms
- allocating register pairs
- register coalescing
- splitting live ranges
11. Other Register Allocation Heuristics
- hierarchical allocation
- vector register allocation
- register allocation in loops
THURSDAY:
12. Instruction Scheduling
- basic block scheduling
- extended basic block formation
- filling delay slots
13. Scheduling in Loops
- software pipelining
- polycyclic scheduling
- loop unrolling, peeling
- register assignment in loops
- vector instruction scheduling, chaining
FRIDAY:
14. Peephole Optimizations
- tail merging
- jump optimizations
- optimizing for branch prediction
- instruction placement
15. Interacting with Debuggers
- currency of values in registers
- currency of values in memory
- reporting values of variables
- reporting position
16. Instruction Selection
- hand-written code generators
- table-driven code generators
17. Engineering a Real Compiler
- software engineering
- time to market
- managing data structures
- interactive data structure browser
- graphical display of data structures
- 'source-level' debugging
==================================================================
ADVANCED ANALYSIS AND OPTIMIZING FOR PARALLELISM
Monday-Friday, August 23-27, 1993
Course Outline:
MONDAY:
1. Parallel Computer Architecture
. Architectural Options
- vector instruction sets
- multiple CPUs sharing memory
- multiple operations per instruction
- super-scalar control unit
- super-pipelined data unit
- cache memory organization
- cache coherence mechanisms
- massively parallel SIMD and MIMD
- message passing systems
- latency tolerating systems
- locality-based systems
. Current Examples
Intel i860, IBM RS/6000, Multiflow Trace/300, Cray C-90, Alliant FX/80,
Intel iPSC/860, Thinking Machines CM-2 and CM-5, MasPar MP-2
2. Compiler Framework
. Compiler Structure
- front end
- high-level optimizations
- back-end optimizations
- code generation
- peephole optimizations
. Internal Data Structures
- basic blocks, control flow graph
3. Basic Data Structures and Concepts
- lists, trees, graphs
- formal introduction into graphs
- graph algorithms (traversal, spanning trees, cycles)
4. Dataflow Problems in a Lattice Context
- monotone dataflow framework
- reaching definitions, live variables
- iterative solution
5. Control Flow Analysis
- dominators
- control dependence
- identifying loops
6. Static Single Assignment (Factored Use-Def Chains)
- conversion to static single assignment
- conditional constant propagation
- induction variable identification
- comparison to classical methods
- handling parallel language extensions
7. Sparse Dataflow Evaluation Graphs
- constructing the sparse graph
TUESDAY:
8. Data Dependence Analysis Techniques
. Introduction
- types: flow, anti, output dependence
- abstractions: distance, direction vectors
. Subscript Analysis
- formulating a dependence equation
- handling symbolic variables
. Solving the Dependence Equation
- single variable exact test
- Banerjee's inequalities,
- Stanford sieve
- Lambda, Delta, Omega, Power tests
. Complications
- I/O dependence
- aliasing, EQUIVALENCE, COMMON
- structure aliasing
. Beyond Classical Methods
- array kill analysis
- pointers, dynamic aliasing
- argument aliasing
. Program Dependence Graph
WEDNESDAY:
9. Non-Loop Parallelism
. Hierarchical Task Graph
10. Program Restructuring for Shared Memory
- parallelization
- scheduling parallel code
- induction variables, temporaries, etc.
- distribution
- alignment
- index set splitting
- synchronization
- interchanging
- optimizing for memory locality
- linear index set transformations
- privatization
- performance modeling
11. Restructuring for Vector Computers
- vectorization
- scalar expansion
- reductions
- strip mining
12. Restructuring for SIMD Computers
- strip mining, combing
- scalarization, fusion
13. Restructuring for Scalar Computers
- tiling for cache locality
- stride sensitivity
- handling nontightly nested loops
14. Other Restructuring Transformation
- loop fusion
- loop rotation
- handling nontightly nested loops
THURSDAY:
15. Interprocedural Analysis
. Summarizing the Effects of Procedures
- flow insensitive USE and MOD information
- summarizing USE and MOD of arrays
. Interprocedural Constant Propagation
. Interprocedural Alias Analysis
16. High Performance Fortran Language
. Parallelism
- array assignment, FORALL
. Data Distribution
- Align, Distribute, Template
. Expected Behavior
. Complications
- pointer assignments
- dynamic realignment
- inherited alignments
17. Local HPF Analysis
. Conformance
. Alignment Analysis
18. HPF Code Generation
. Generate a Node Program
- local vs. global index sets and indices
- local data allocation
. Communication
- where to place communication
- communication vectorization
FRIDAY:
19. Optimizing HPF Node Programs
. Index Set Calculation
- recovering induction variables
- finite state machine analysis
- guard regions
- overlapping communication
20. Interprocedural Analysis for HPF
. Reaching Distributions
- procedure cloning
. Guard Region Analysis
21. Other Parallel Languages
. Dataparallel C
- front-end/back-end model
- communication classification
. SISAL
- sequentialize to reduce data copying
- dynamic task scheduling
. Crystal
- automatic data alignment
- parallel code generation
- elimination of temporal domain storage
22. Current Research
. Extensions to SSA model
- better handling of aliases
. Data Restructuring
- sparse array implementation
- influencing sparse array representation
- compression of sparse index sets
. Low Level Extensions to HPF
- MetaMP extensions
- interface to HPF
- optimize MetaMP portion of program
. C++ Optimizations
- achieve performance of Fortran from C++
- expose C++ methods to optimizations
23. Engineering a Real Compiler
- software engineering
- time to market
- managing data structures
- interactive data structure browser
- graphical display of data structures
- 'source-level' debugging
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.