Re: EQNTOTT Vectors of 16 bit Numbers

bernecky@eecg.toronto.edu (Robert Bernecky)
Tue, 21 Nov 1995 15:25:55 GMT

          From comp.compilers

Related articles
EQNTOTT Vectors of 16 bit Numbers [Was: Re: Yikes!!! New 200Mhz Intel glew@ichips.intel.com (1995-11-09)
Re: EQNTOTT Vectors of 16 bit Numbers [Was: Re: Yikes!!! New 200Mhz In pardo@cs.washington.edu (1995-11-14)
Re: EQNTOTT Vectors of 16 bit Numbers cdg@nullstone.com (1995-11-17)
Re: EQNTOTT Vectors of 16 bit Numbers hbaker@netcom.com (1995-11-19)
Re: EQNTOTT Vectors of 16 bit Numbers cliffc@ami.sps.mot.com (1995-11-20)
Re: EQNTOTT Vectors of 16 bit Numbers bernecky@eecg.toronto.edu (1995-11-20)
Re: EQNTOTT Vectors of 16 bit Numbers bernecky@eecg.toronto.edu (1995-11-21)
Newsgroups: comp.compilers
From: bernecky@eecg.toronto.edu (Robert Bernecky)
Keywords: benchmarks, optimize, APL
Organization: University of Toronto, Computer Engineering
References: 95-11-079 95-11-132 95-11-164
Date: Tue, 21 Nov 1995 15:25:55 GMT

hbaker@netcom.com (Henry Baker) writes:
> pardo@cs.washington.edu (David Keppel) wrote:
>
>> >["Intel's special SPEC optimization."]
>>
>Actually, IBM's APL implementation was extremely well done, because
>it optimized the code that got executed the most often. The thing


Well, actually, no APL implementation did a very good job of "optimizing
the code that got executed the most often", namely storage management,
syntax analysis, conformability and type checking, etc. We tried,
but there is still a lot to be said for compiled code.
What almost EVERY APL implementation did, and did well, was to
optimize array operations to a degree far beyond the ability of
most programmers. This is why the paragraph quoted below holds.
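
Before getting to that quoted paragraph, here is a rough sketch of the
distinction, in C. It is an illustration only: the array descriptor,
the type tags, and the error codes are all hypothetical and are not
taken from any real APL system. The checks at the top stand in for the
per-primitive interpretive overhead (type and conformability checking),
which is paid once per array operation; the loop at the bottom is the
part an implementer can tune by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element-type tags and array descriptor (illustration only). */
typedef enum { T_BOOL, T_INT, T_FLOAT } elem_type;

typedef struct {
    elem_type type;   /* element type tag */
    size_t    len;    /* number of elements (vectors only in this sketch) */
    void     *data;   /* element storage */
} array;

/* Elementwise add of two float vectors.
 * Returns 0 on success, -1 on a length (conformability) error,
 * -2 on a type error. */
int add_float_arrays(const array *a, const array *b, array *out)
{
    assert(a != NULL && b != NULL && out != NULL);

    /* Interpretive overhead: done once, however long the arrays are. */
    if (a->type != T_FLOAT || b->type != T_FLOAT || out->type != T_FLOAT)
        return -2;
    if (a->len != b->len || a->len != out->len)
        return -1;

    /* The tuned inner loop: no checks, no dispatch, stride-1 access. */
    const double *x = a->data;
    const double *y = b->data;
    double *z = out->data;
    for (size_t i = 0; i < a->len; i++)
        z[i] = x[i] + y[i];
    return 0;
}
```

With arrays of a few elements the checks are a noticeable share of the
work; as the arrays grow, the loop comes to dominate.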


>that embarrassed the Fortran'ers of the world was the fact that a
>hand-optimized loop that dominates a computation can beat the pants
>off an optimizing compiler every day of the week. For
>not-terribly-large arrays, the APL interpreter itself takes very
>little of the overall speed. The only time APL bogs down is when you
>don't take advantage of the built-in array operations.
>
>There was a problem with memory usage on non-compiled APL implementations,
>but that is a different story entirely. Also, the APL approach wouldn't
>work so well on modern machines, because modern machines are much more
>sensitive to memory usage & locality.


I am not familiar with ANY successful COMMERCIAL compiled APL,
[and only familiar with a few non-commercial compiled APL systems,
including my own]. However, APL systems tend to perform
excellently on cache machines, getting hit rates consistently above
the "standard mix" for a given machine, purely because of array
operations that tend to be implemented as stride-1 operations.
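
A rough illustration of the stride-1 point, in C. This is a sketch only:
the function names and the row-major layout are assumptions made for the
example, not code from any APL implementation. Both routines compute the
same sum over a rows-by-cols matrix. The first walks memory in storage
order, so consecutive accesses mostly fall in the same cache line; the
second jumps cols elements between accesses.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a rows x cols row-major matrix in storage order (stride 1). */
double sum_stride1(const double *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    double total = 0.0;
    for (size_t i = 0; i < rows * cols; i++)
        total += m[i];
    return total;
}

/* Same sum, taken column by column: consecutive accesses are
 * cols elements apart in memory. */
double sum_strided(const double *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    double total = 0.0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += m[r * cols + c];
    return total;
}
```

An array primitive that processes its whole argument in one go can
usually be written in the first style, whatever the programmer's own
loops would have looked like.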


The place where APL interpreters have traditionally fallen down is
in their naivete -- they fail [for good reasons] to perform
loop fusion, common subexpression elimination (CSE), etc. The ONLY
APL interpreter I know of that does this at all is IBM's APL2 for the
3090 Vector Facility, where the VF does a good job of hosting array
operations that would otherwise leave you register-starved.
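
To make the loop fusion point concrete, here is a minimal sketch in C
of evaluating the array expression A + B*C two ways. The function names
are hypothetical and the code is not modelled on any particular
interpreter. The first version works the way a naive interpreter does,
one primitive at a time with a full-size temporary for B*C; the second
is what a compiler performing loop fusion could produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Primitive-at-a-time evaluation of r = a + (b * c).
 * Two passes over memory plus an n-element temporary.
 * Returns 0 on success, -1 if the temporary cannot be allocated. */
int eval_unfused(const double *a, const double *b, const double *c,
                 double *r, size_t n)
{
    double *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL && n > 0)
        return -1;
    for (size_t i = 0; i < n; i++)      /* first primitive: tmp = b * c */
        tmp[i] = b[i] * c[i];
    for (size_t i = 0; i < n; i++)      /* second primitive: r = a + tmp */
        r[i] = a[i] + tmp[i];
    free(tmp);
    return 0;
}

/* Fused evaluation: one pass, no temporary array; the product
 * can stay in a register between the multiply and the add. */
void eval_fused(const double *a, const double *b, const double *c,
                double *r, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL && r != NULL);
    for (size_t i = 0; i < n; i++)
        r[i] = a[i] + b[i] * c[i];
}
```

The unfused form also shows where the memory-usage complaint about
interpreted APL comes from: every intermediate result is a whole array.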


Bob


--

