Inline block moves

John Carr <jfc@ATHENA.MIT.EDU>
Tue, 12 Nov 91 01:14:39 EST

          From comp.compilers


Newsgroups: comp.compilers
From: John Carr <jfc@ATHENA.MIT.EDU>
Keywords: assembler, optimize
Organization: Compilers Central
References: 91-11-035 91-11-037
Date: Tue, 12 Nov 91 01:14:39 EST

(I said that for block move on RT, load/store multiple is not the best way.)
>and the moderator noted:
>[On AIX 1.0, we went to some effort to do block moves with load and store
>multiple. I'm surprised to hear that regular loads and stores are faster.]


There are two major versions of the IBM RT. On the older version, loads
and stores cannot be overlapped unless memory translation is off (the
processor doesn't save enough state to restart a memory access after a page
fault if it continues execution past the faulting instruction). The load
multiple instruction takes 3 cycles + 2 cycles per register loaded; store
multiple takes 3 cycles + 3 cycles per register stored. To load and store
2 words with conventional instructions takes 10+10 cycles, but only 7+9
with load/store multiple. On the newer model, loading 2 registers takes
2-6 cycles (depending on following instructions) and storing them takes
4 cycles.
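
The arithmetic for the older model can be checked with a small sketch.
The function names are made up for illustration, and the 5-cycle cost per
conventional load or store is an assumption inferred from the "10+10"
figure for 2 words, not a number stated per instruction. The model also
ignores how many registers are actually free for a load/store multiple.

```c
#include <stdio.h>

/* Older RT, using the figures quoted in the post. */
int load_multiple_cycles(int nregs)  { return 3 + 2 * nregs; }
int store_multiple_cycles(int nregs) { return 3 + 3 * nregs; }

/* Total cycles to move nwords with one load multiple and one store multiple. */
int multiple_move_cycles(int nwords)
{
    return load_multiple_cycles(nwords) + store_multiple_cycles(nwords);
}

/* ASSUMPTION: 5 cycles per conventional load and 5 per conventional store,
   inferred from the quoted "10+10 cycles" for 2 words. */
int conventional_move_cycles(int nwords)
{
    return 5 * nwords + 5 * nwords;
}

/* Print both totals for block sizes 1..max_words. */
void print_cycle_table(int max_words)
{
    for (int n = 1; n <= max_words; n++)
        printf("%d words: conventional %d, multiple %d\n",
               n, conventional_move_cycles(n), multiple_move_cycles(n));
}
```

For 2 words this gives 20 cycles against 16, matching the 10+10 and 7+9
above. The newer model's 2-6 plus 4 cycles for the same 2 words comes to
6-10 cycles in all.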


The C library didn't get updated to use the faster method, but the
compiler (Metaware High C) does generate a loop with regular loads and
stores.
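
For illustration only, a loop of regular loads and stores might look
like the following in C. This is a sketch, not Metaware's actual output:
the name copy_words, the two-words-per-iteration unrolling, and the
requirement that the two regions not overlap are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy nwords 32-bit words from src to dst.
   ASSUMPTION: the regions do not overlap. */
void copy_words(uint32_t *dst, const uint32_t *src, size_t nwords)
{
    size_t i = 0;

    /* Two words per iteration: both loads are issued before either
       store, so a machine that can overlap loads has the chance to. */
    for (; i + 2 <= nwords; i += 2) {
        uint32_t a = src[i];
        uint32_t b = src[i + 1];
        dst[i] = a;
        dst[i + 1] = b;
    }

    /* Copy the leftover word when nwords is odd. */
    if (i < nwords)
        dst[i] = src[i];
}
```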


Interesting footnote to this: my code is better than the compiler's,
apparently because Metaware based their code on the published instruction
timings instead of reality. Such surprises are rare on the RT, but on
IBM's newer processor, the RS/6000, the only reliable way to determine the
execution time of some instruction sequences is to write the code and time
it.
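
"Write the code and time it" can be as simple as the sketch below. The
names copy_fn, copy_with_memcpy, copy_bytewise and time_copy are
hypothetical, and clock() measures processor time at coarse resolution,
so the repetition count has to be large enough to give a measurable
interval. An optimizing compiler may also rewrite or drop the copy loop,
so it is worth checking the generated code before trusting the numbers.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Common signature so different copy routines can be timed the same way. */
typedef void (*copy_fn)(void *dst, const void *src, size_t nbytes);

/* Candidate 1: the C library's memcpy. */
void copy_with_memcpy(void *dst, const void *src, size_t nbytes)
{
    memcpy(dst, src, nbytes);
}

/* Candidate 2: a plain byte-at-a-time loop. */
void copy_bytewise(void *dst, const void *src, size_t nbytes)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < nbytes; i++)
        d[i] = s[i];
}

/* Run fn reps times and return the elapsed processor time in seconds. */
double time_copy(copy_fn fn, void *dst, const void *src, size_t nbytes, int reps)
{
    clock_t start = clock();
    for (int i = 0; i < reps; i++)
        fn(dst, src, nbytes);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_copy once per candidate with the same buffers and the same
reps gives two figures that can be compared directly.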
--

