Re: Operation Latencies


Related articles
Operation Latencies kernel_learner@yahoo.com (2004-02-08)
Re: Operation Latencies touati-nospamplease@nospam-prism.uvsq.fr (TOUATI Sid) (2004-02-12)
Re: Operation Latencies tmk@netvision.net.il (2004-02-12)
Re: Operation Latencies tommy.hoffner@ericsson.com (Tommy Hoffner) (2004-02-13)
From: Tommy Hoffner <tommy.hoffner@ericsson.com>
Newsgroups: comp.compilers
Date: 13 Feb 2004 23:54:46 -0500
Organization: Ericsson
References: 04-02-090
Keywords: architecture, optimize
Posted-Date: 13 Feb 2004 23:54:46 EST

Learner wrote:
> Many operations have variable latencies...so I was wondering how
> compilers do scheduling for VLIW and how they calculate the critical
> path of a data dependence graph when the latencies of operations are
> not known at compile time.. Some scheduling like s/w pipelining
> requires calculation of RecMII which needs to know the critical path
> in the ddg... How can you be sure you have the critical path when the
> latencies can vary?


As mentioned by the other answers, you can't.


> What are the usual approaches ?


I've basically seen two:
1. State the minimum latency for each instruction in the compiler. This
will keep complexity down. In a modern RISC/VLIW machine there are too
many dynamic properties anyway. If the machine has out-of-order
execution you can't really define the actual latency penalty anyway,
since parts of it might be covered by the out-of-order mechanism, or you
might be bound by TLB misses or full store queues, etc.


2. Assign a latency to each occurrence of each instruction before feeding
the code to the scheduler. This would make it possible to state longer
latencies for the first access to a cache line and shorter latencies for
repeated accesses, etc. I believe there would be quite a lot of tweaking
before you get something useful. You could even assign different
latencies to the same instruction in a loop (possibly different for
different iterations :-). If you are specifically concerned about
pipelining loops you might set latencies specifically for the involved
instructions and use default (e.g. lower bound) values for the rest
(there is a small sketch of both approaches below).
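
A minimal sketch (in Python) of what the two approaches could look like;
the opcodes, cycle counts and the cold/warm cache-line heuristic are all
made up for illustration, not taken from any real machine description.

# Approach 1: one static table of *minimum* latencies, consulted for
# every instance of an instruction.
MIN_LATENCY = {
    "add":   1,   # simple ALU op
    "mul":   3,   # pipelined multiplier
    "load":  2,   # L1-hit latency; cache misses are simply ignored
    "store": 1,
}

# Approach 2: assign a latency to each *occurrence* before feeding the
# code to the scheduler, e.g. long for the first touch of a cache line
# and short for repeated touches.
FIRST_TOUCH_LATENCY = 10      # assumed cold-line latency (invented)
LINE_SIZE = 64

def annotate_latencies(instructions):
    """instructions: list of (opcode, address-or-None) pairs.
    Returns (opcode, address, assumed_latency) triples."""
    seen_lines = set()
    annotated = []
    for op, addr in instructions:
        lat = MIN_LATENCY.get(op, 1)            # approach-1 default
        if op == "load" and addr is not None:
            line = addr // LINE_SIZE
            if line not in seen_lines:          # first access to this line
                lat = FIRST_TOUCH_LATENCY
            seen_lines.add(line)
        annotated.append((op, addr, lat))
    return annotated

if __name__ == "__main__":
    body = [("load", 0x1000), ("add", None), ("load", 0x1008), ("load", 0x2000)]
    for op, addr, lat in annotate_latencies(body):
        print(op, hex(addr) if addr is not None else "-", lat)

The scheduler can then use the third field as the edge weight in the
DDG; whether that buys you anything over approach 1 depends entirely on
how good the locality guesses are.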


Assuming upper bounds, as suggested in another response, might be a really,
really bad idea if you have a compiler that schedules before register
allocation, since this would increase your register pressure, increase
spill, and so on. (Increasing the latency values in gcc would give this
effect, and even introduce spill in loops :-( )
And covering 100+ cycles for every memory access is no fun anyway :-)
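
To make the register pressure point concrete, here is a toy calculation
(all numbers invented): if a pre-allocation scheduler separates each of
N independent loads from its use by the assumed latency, up to
min(N, latency) load results are live at the same time, so the pressure
tracks whatever bound you assumed.

def peak_live_values(num_loads, assumed_latency):
    # Load i issues at cycle i (one per cycle); its single use is
    # scheduled assumed_latency cycles later, so its result is live
    # for all the cycles in between.
    live_ranges = [(i, i + assumed_latency) for i in range(num_loads)]
    last_cycle = max(end for _, end in live_ranges)
    return max(sum(1 for start, end in live_ranges if start <= cycle < end)
               for cycle in range(last_cycle + 1))

if __name__ == "__main__":
    for latency in (2, 10, 100):
        print("assumed latency", latency, "-> peak live load results",
              peak_live_values(num_loads=8, assumed_latency=latency))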


If you schedule your instructions after register allocation you could, at
least in theory, assume that memory latencies are unbounded, and let the
scheduler move things apart as far as it can. At least you shouldn't be
worse off than ordinary post-register-allocation scheduling would leave you.
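
A rough sketch of that idea (the dependence representation and the
hoist_loads helper are my own invention, not any particular compiler's
pass, and a real pass would also have to respect memory ordering):
treat loads as if their latency were unbounded and simply push each one
as early as its dependences allow.

def hoist_loads(instructions, deps):
    """instructions: list of opcode strings.
    deps: set of (i, j) pairs, meaning instruction j depends on
    instruction i (indices into the original list).
    Returns a new opcode order with loads moved as early as possible."""
    order = list(range(len(instructions)))
    changed = True
    while changed:
        changed = False
        for pos in range(1, len(order)):
            cur, prev = order[pos], order[pos - 1]
            if (instructions[cur] == "load"
                    and instructions[prev] != "load"   # keep loads in their original order
                    and (prev, cur) not in deps):      # never hoist past a producer
                order[pos - 1], order[pos] = cur, prev
                changed = True
    return [instructions[i] for i in order]

if __name__ == "__main__":
    insts = ["add", "add", "load", "use"]
    deps = {(2, 3)}                     # "use" consumes the load's result
    print(hoist_loads(insts, deps))     # -> ['load', 'add', 'add', 'use']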


For calculating critical paths I would say that assuming lower bounds is
good enough, at least if you're working on a highly dynamic machine.
If you increase the precision of your model of the machine in one
respect (in this case memory latency) you would have to do it for other
details to the same level. If you don't, you can be fairly certain that
those details you don't model will come back and haunt you (by making a
fairly large number of the possible cases actually run slower :-)
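
For what it's worth, here is a small sketch of what "critical path with
lower-bound latencies" and RecMII amount to; the DDG, opcodes and cycle
counts are invented, and RecMII is computed as ceil(sum of latencies /
sum of loop-carried distances) along one given recurrence cycle.

import math
from functools import lru_cache

# DDG: node -> list of (successor, lower-bound latency) edges
DDG = {
    "load_a": [("mul", 2)],
    "load_b": [("mul", 2)],
    "mul":    [("add", 3)],
    "add":    [("store", 1)],
    "store":  [],
}

def critical_path(ddg):
    @lru_cache(maxsize=None)
    def longest_from(node):
        return max((lat + longest_from(succ) for succ, lat in ddg[node]),
                   default=0)
    return max(longest_from(n) for n in ddg)

def rec_mii(cycle_edges):
    """cycle_edges: list of (latency, distance) pairs along one recurrence."""
    total_latency  = sum(lat for lat, _ in cycle_edges)
    total_distance = sum(dist for _, dist in cycle_edges)
    return math.ceil(total_latency / total_distance)

if __name__ == "__main__":
    print("critical path (lower bounds):", critical_path(DDG), "cycles")
    # e.g. an accumulator recurrence: add -> add, latency 3, distance 1
    print("RecMII for that recurrence:", rec_mii([(3, 1)]))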


/Tommy

