Related articles |
---|
Software Pipelining plfriko@yahoo.de (Tim Frink) (2008-08-26) |
Re:Software Pipelining Jatin_Bhateja@mentor.com (Jatin Bhateja) (2008-08-28) |
Re:Software Pipelining plfriko@yahoo.de (Tim Frink) (2008-08-28) |
Re: Software Pipelining pertti.kellomaki@tut.fi (Pertti Kellomaki) (2008-08-29) |
Re: Software Pipelining mr.neeraj@gmail.com (Neeraj Goel) (2008-09-02) |
Re: Software Pipelining sidtouati@inria.fr (Touati Sid) (2008-09-08) |
Re: Software Pipelining kamalpr@hp.com (kamal) (2008-09-10) |
Re: Software Pipelining johnhull2008@gmail.com (johnhull2008) (2008-09-11) |
[7 later articles] |
From: | Jatin Bhateja <Jatin_Bhateja@mentor.com> |
Newsgroups: | comp.compilers |
Date: | Thu, 28 Aug 2008 11:49:49 +0530 |
Organization: | Compilers Central |
References: | 08-08-072 |
Keywords: | optimize |
Posted-Date: | 28 Aug 2008 10:11:15 EDT |
It's a combination of loop unrolling and instruction scheduling. e.g
forloop (...) {
a[i] = b[i]; // stmt 1
c[i] = a[i]; // stmt 2
d[i] = c[i]; // stmt 3
}
there is a clear true dependence b/w stmt 1 , 2 and 3 which constrains
their execution order. If each statement takes one unit time then
each iteration will take 3 unit time, so if there are 2 iterations
then loop execution will take 6 unit time.
But if we unroll the loop and the reorder the instruction execution
order in such a way that all the non-dependent statements can execute
simultaneously then we can save on execution time. i.e.
1. Unrolling step .
a[0] = b[0]
c[0] = a[0]
d[0] = c[0]
a[1] = b[1]
c[1] = a[1]
d[1] = c[1]
2. After Instruction scheduling step.
a[0] = b[0]
a[1] = b[1]
c[0] = a[0]
c[1] = a[1]
d[0] = c[0]
d[1] = c[1]
Thus instructions in each of above pair of instruction can run
parallely. Thus if the architecture has multiple functional units then
we can run the above loop in 3 units times. Also we can club the
instruction present in each group in a long VLIW format if the
processor is VLIW, this will reduce the initial loop into few VLIW
instructions.
Optimization is called pipelining because we are able to overlap the
execution of multiple instruction after optimizations.
Thanks
- Jatin Bhateja
Return to the
comp.compilers page.
Search the
comp.compilers archives again.