Related articles
EQNTOTT Vectors of 16 bit Numbers [Was: Re: Yikes!!! New 200Mhz Intel glew@ichips.intel.com (1995-11-09)
Re: EQNTOTT Vectors of 16 bit Numbers [Was: Re: Yikes!!! New 200Mhz In pardo@cs.washington.edu (1995-11-14)
Re: EQNTOTT Vectors of 16 bit Numbers cdg@nullstone.com (1995-11-17)
Re: EQNTOTT Vectors of 16 bit Numbers hbaker@netcom.com (1995-11-19)
Re: EQNTOTT Vectors of 16 bit Numbers cliffc@ami.sps.mot.com (1995-11-20)
Re: EQNTOTT Vectors of 16 bit Numbers bernecky@eecg.toronto.edu (1995-11-20)
Re: EQNTOTT Vectors of 16 bit Numbers bernecky@eecg.toronto.edu (1995-11-21)
Newsgroups: comp.compilers
From: bernecky@eecg.toronto.edu (Robert Bernecky)
Keywords: benchmarks, optimize, APL
Organization: University of Toronto, Computer Engineering
References: 95-11-079 95-11-132 95-11-164
Date: Tue, 21 Nov 1995 15:25:55 GMT
hbaker@netcom.com (Henry Baker) writes:
> pardo@cs.washington.edu (David Keppel) wrote:
>
>> >["Intel's special SPEC optimization."]
>>
>Actually, IBM's APL implementation was extremely well done, because
>it optimized the code that got executed the most often. The thing
Well, actually, no: APL did a very good job of "optimizing the
code that got executed the most often", namely storage management,
syntax analysis, conformability and type checking, etc. We tried,
but there is still a lot to be said for compiled code.
What almost EVERY APL implementation did, and did well, was to
optimize array operations to a degree far beyond the ability of
most programmers. This is why the next paragraph holds.
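To make the contrast concrete, here is a minimal sketch (in Python, as illustration only; no APL interpreter actually works this way line-for-line) of the difference between element-at-a-time code and a whole-array formulation such as APL's `+/V*2`. The function names and data are hypothetical:

```python
# Two formulations of the same computation (sum of squares).

def sum_squares_loop(v):
    # Element-at-a-time loop: every iteration pays interpreter
    # overhead (dispatch, type checks) per element.
    total = 0
    for x in v:
        total += x * x
    return total

def sum_squares_array(v):
    # Whole-array formulation, analogous to APL's  +/V*2 :
    # one dispatch and conformability check, then a tight inner
    # loop inside the interpreter's array primitive.
    return sum(x * x for x in v)

v = [1, 2, 3, 4]
assert sum_squares_loop(v) == sum_squares_array(v) == 30
```

The point is that in the array formulation, the per-element work runs inside a primitive the implementor has hand-tuned once, which is how an interpreter can beat most programmers' loops.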
>that embarrassed the Fortran'ers of the world was the fact that a
>hand-optimized loop that dominates a computation can beat the pants
>off an optimizing compiler every day of the week. For
>not-terribly-large arrays, the APL interpreter itself takes very
>little of the overall speed. The only time APL bogs down is when you
>don't take advantage of the built-in array operations.
>
>There was a problem with memory usage on non-compiled APL implementations,
>but that is a different story entirely. Also, the APL approach wouldn't
>work so well on modern machines, because modern machines are much more
>sensitive to memory usage & locality.
I am not familiar with ANY successful COMMERCIAL compiled APL
[and am only familiar with a few non-commercial compiled APL systems,
including my own]. However, APL systems tend to perform
excellently on cache machines, getting hit rates consistently above
the "standard mix" for a given machine, purely because their array
operations tend to be implemented as stride-1 operations.
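A small sketch of why whole-array primitives are stride-1 (illustrative shapes and layout; real interpreters differ in detail): a rank-2 array stored in row-major (ravel) order is swept at consecutive addresses when a primitive walks its ravel, whereas a column-at-a-time sweep jumps by the row length each step:

```python
rows, cols = 3, 4
# Linear address of element (i, j) in a row-major (ravel-ordered) array:
def addr(i, j):
    return i * cols + j

# Stride-1 sweep: how an array primitive walks the ravel.
row_major_walk = [addr(i, j) for i in range(rows) for j in range(cols)]
assert row_major_walk == list(range(rows * cols))  # consecutive addresses

# Column-at-a-time sweep: stride of `cols` (4 here), touching a new
# cache line far more often on a real machine.
col_major_walk = [addr(i, j) for j in range(cols) for i in range(rows)]
assert col_major_walk[:3] == [0, 4, 8]
```

Stride-1 access is exactly the pattern hardware prefetchers and cache lines reward, which is consistent with the above-average hit rates.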
The place where APL interpreters have traditionally fallen down is
in their naivety -- they fail [for good reasons] to perform
loop fusion, CSE, etc. The ONLY APL interpreter I know of that
does this at all is IBM's APL2 for the 3090 Vector Facility,
where the VF does a good job of hosting array operations that would
otherwise leave you register-starved.
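As a sketch of the loop fusion and CSE a naive interpreter misses (illustrative Python, not any real interpreter's internals): for an expression like (A+B)x(A+B), a naive evaluator runs one loop and allocates one temporary per primitive, and even computes A+B twice; a fusing evaluator makes a single pass with the shared subexpression computed once:

```python
A = [1.0, 2.0, 3.0]
B = [4.0, 5.0, 6.0]

# Naive interpretation: one loop and one temporary array per primitive,
# with the common subexpression A+B evaluated twice (no CSE).
t1 = [a + b for a, b in zip(A, B)]       # first A+B
t2 = [a + b for a, b in zip(A, B)]       # second A+B (redundant)
naive = [x * y for x, y in zip(t1, t2)]  # the multiply

# Fused evaluation: a single loop, no temporaries, A+B computed once.
fused = [(a + b) * (a + b) for a, b in zip(A, B)]

assert naive == fused == [25.0, 49.0, 81.0]
```

The naive form also keeps whole temporary arrays live between primitives, which is where a vector facility with enough vector registers can help host the intermediate results.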
Bob
--