|Specifying time limits in source code ? firstname.lastname@example.org (1999-07-14)|
|Re: Specifying time limits in source code ? email@example.com (David Chase) (1999-07-19)|
|Re: Specifying time limits in source code ? firstname.lastname@example.org (Charles E. Bortle, Jr.) (1999-07-19)|
|Re: Specifying time limits in source code ? email@example.com (1999-07-19)|
|Re: Specifying time limits in source code ? firstname.lastname@example.org (KSG) (1999-07-19)|
|Re: Specifying time limits in source code ? email@example.com (Dan Truong) (1999-07-20)|
|Re: Specifying time limits in source code ? firstname.lastname@example.org (Ehud Lamm) (1999-07-20)|
|Re: Specifying time limits in source code ? email@example.com (Ehud Lamm) (1999-07-21)|
|Re: Specifying time limits in source code ? firstname.lastname@example.org (1999-07-23)|
|From:||Dan Truong <email@example.com>|
|Date:||20 Jul 1999 01:11:04 -0400|
Caches and prefetching are speed optimizations: in the best case you
go faster, but otherwise you must assume you are as slow as DRAM (or
even hard disk), because the cache can be flushed.
-> You cannot rely on estimates for mission-critical timing (the peak
time is the worst case, where you have to fetch the page from hard
disk, handle a few interrupts, miss the cache, have the cache
flushed...), but estimates are OK for average performance if what
matters is decent throughput.
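To make the gap between the two kinds of estimate concrete, here is a
minimal sketch in C. Everything in it is an assumption for
illustration: the function names, the idea of folding cache-hit time
into a fixed compute cost, and the numbers in the usage note. It only
models cache misses; page faults and interrupts would push the real
worst case higher still.

```c
#include <assert.h>

/* Upper bound: assume every memory reference misses the cache.
   compute_cycles is the cost with all references hitting (assumed known). */
unsigned long worst_case_cycles(unsigned long compute_cycles,
                                unsigned long mem_refs,
                                unsigned long miss_penalty)
{
    return compute_cycles + mem_refs * miss_penalty;
}

/* Average estimate: only (100 - hit_percent)% of the references miss.
   hit_percent is a measured or guessed hit rate, not a guarantee. */
unsigned long average_cycles(unsigned long compute_cycles,
                             unsigned long mem_refs,
                             unsigned long miss_penalty,
                             unsigned hit_percent)
{
    assert(hit_percent <= 100);
    return compute_cycles
         + mem_refs * miss_penalty * (100 - hit_percent) / 100;
}
```

With illustrative figures of 1000 compute cycles, 200 memory
references and a 100-cycle miss penalty, the bound is 21000 cycles,
while a 95% hit rate gives an average of 2000 cycles. The average is
fine for throughput; only the bound is safe for a deadline.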
- DRAM references take 50 to 100 cycles on CURRENT machines, more in the
- You can do software optimizations
- for Icache : block layout, static branch optimization, hot path
Check Pettis&Hansen90 Kaeli97, DEC WRL... and Torrellas for OS code.
- for Dcache : data layout and code transformations (actually: arrays,
structures/objects, scalars have been studied, but implemented
for arrays only in compilers)
- for TLB : I know of little work on TLBs though they have a very
on large data sets (works are old too)
- Swapping : ask the database people... or add memory and limit number
What to do: colocate data used together so that it fits in a small
number of contiguous blocks. Software prefetching works OK for
handcrafted code, but I don't think compilers use it well yet.
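As a rough illustration of both ideas, here is a minimal handcrafted
sketch in C. The record layout, the function name and the prefetch
distance are all hypothetical; `__builtin_prefetch` is the GCC/Clang
builtin, so the prefetch is compiled in only for those compilers. On a
simple linear scan like this one a hardware prefetcher may already do
the job, so treat it as a pattern, not a measured win.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record split: the fields used together in the hot loop
   are kept contiguous; rarely used data lives in a separate array. */
struct hot  { int key; int value; };
struct cold { char description[120]; };

/* How many elements ahead to prefetch: a tuning guess, not a measurement. */
enum { PREFETCH_DISTANCE = 8 };

/* Sum the values of all records whose key matches. */
long sum_matching(const struct hot *h, size_t n, int key)
{
    assert(h != NULL || n == 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__)
        /* Request a future element for reading (0), low temporal locality (1). */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&h[i + PREFETCH_DISTANCE], 0, 1);
#endif
        if (h[i].key == key)
            sum += h[i].value;
    }
    return sum;
}
```

Because `struct cold` is stored elsewhere, each cache block fetched by
the loop holds many more useful `key`/`value` pairs than it would if
the description were embedded in the same record.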
Check computer architecture research for studies on software cache
optimizations (PLDI, ASPLOS, PACT, HPCA, MICRO, ISCA... conferences,
and associated workshops and journals) from 1990 to now.
If you work on realtime systems, you can assume you know your
environment, and I guess you won't even be in a multitasking
environment. If it's really critical, you'd use a 10-year-old proven
CPU technology w/o caching, and adequate memory (ROM, SRAM or FLASH),
and maybe even hand code the works. Anyhow, pay the price...
At our lab people work on case-based compilation, investigate
feedback-driven code optimization and the like... I also believe that
profiling will even break through as a standard compilation
optimization procedure for developers. There's so much potential...
Note: for superscalar CPUs, you've got the same problem computing
execution cycles: even 4-way superscalar CPUs have CPIs ranging from
0.5 to 1, and rarely below... Limitations come from many problems: a
limited resource (Mem, FPU), complex ops, branches,
Dan N. Truong, IRISA, Campus de Beaulieu, 35042 Rennes Cedex, FRANCE
tel:(+33)2 99 84 73 36 fax:(+33)2 99 84 25 28
gsm:06 14 78 06 95 -> http://www.sfr.fr/html/annexes/sms/sendsms.html