Can compilers make a difference ?

Ariel Faigon <taux01.UUCP!arielf@BBN.COM>
11 Apr 88 06:49:23 GMT

          From comp.compilers

Posted-Date: 11 Apr 88 06:49:23 GMT
Newsgroups: comp.sys.nsc.32k,comp.sys.sequent,comp.compilers
Keywords: Optimizing compilers, NS32000, benchmarks
Date: 11 Apr 88 06:49:23 GMT
From: Ariel Faigon <taux01.UUCP!arielf@BBN.COM>
Organization: National Semiconductor (Israel) Ltd.

Hi guys, have you ever noticed how much of a difference compilers can make?
True, contemporary optimizing compilers for imperative languages like 'C',
Fortran or Pascal cannot deliver an order-of-magnitude improvement over
simple older compilers, because the imperative model of computation is
inherently serial, but there seems to be an encouraging trend.


Optimizing compilers improve in two main directions:


        1. Better machine-independent optimization techniques are used:
              extensive data-flow analysis plus global optimizations. Lately,
              interprocedural analysis has started to show up as well.


        2. Better tuning for specific new hardware in order to make the
              most of it, e.g. register allocation to avoid memory references,
              code reordering to avoid contention or pipeline breakage in
              pipelined architectures, and even techniques to ensure maximum
              utilization of on-chip caches of a given size.
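
As a rough illustration of the first direction, here is a hand-written C
sketch of what one global, machine-independent optimization (loop-invariant
code motion) achieves. The function names and the loop are made up for
illustration; they are not taken from the benchmarks or from any compiler's
actual output.

```c
#include <stddef.h>

/* As a programmer might write it: scale * offset is recomputed on every
   iteration although neither operand changes inside the loop. */
long sum_naive(const long *a, size_t n, long scale, long offset)
{
    long sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += a[i] + scale * offset;
    return sum;
}

/* The equivalent form an optimizer can reach once data-flow analysis
   shows the product is loop-invariant: compute it once before the loop. */
long sum_hoisted(const long *a, size_t n, long scale, long offset)
{
    long sum = 0;
    long k = scale * offset;    /* hoisted out of the loop */
    size_t i;

    for (i = 0; i < n; i++)
        sum += a[i] + k;
    return sum;
}
```

Both functions return the same value for the same arguments; only the amount
of work done inside the loop differs.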


---- The Present state:


Because of significant progress in both directions, today's compilers can
claim substantial improvement over traditional older compilers. Sometimes
you cannot find a better algorithm that gives a really impressive speed-up,
nor can you change the architecture because of some big past investment. In
that case I suggest you try switching compilers; you may be surprised by the
results.


Well, at least I was surprised when I benchmarked the National Semiconductor
GNX/CTP compiler against other compilers that run on OPUS and Sequent Balance
machines (see below). All benchmarks are very well known programs which were
picked because of their availability and with no bias whatsoever.


Note: All figures refer to optimized programs (-O option in effect). All
sources are 'C' programs. (I assume that Pascal results will usually be
even more impressive, because address-taken variables in 'C' inhibit certain
optimizations.) The CTP compiler supports Pascal, Modula-2 and Fortran77 in
addition to C.
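
The remark about address-taken variables can be illustrated with a small C
sketch. All names below are hypothetical. Once the address of a local has
been handed to another routine, a compiler without interprocedural
information has to assume that any call may modify the variable through that
pointer, so it tends to keep the variable in memory instead of in a register.

```c
/* Hypothetical helper: from the caller's point of view it may store
   through p. Here it simply adds one. */
void bump(int *p)
{
    *p += 1;
}

/* n never has its address taken, so the compiler is free to keep it in
   a register for the whole loop. */
int count_in_register(int limit)
{
    int n = 0;
    int i;

    for (i = 0; i < limit; i++)
        n += 1;
    return n;
}

/* &n escapes to bump(), so a conservative compiler must treat n as
   living in memory and assume every call may change it. */
int count_address_taken(int limit)
{
    int n = 0;
    int i;

    for (i = 0; i < limit; i++)
        bump(&n);
    return n;
}
```

The two counting functions compute the same result; the difference is only
in how much freedom the optimizer has with the variable n.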


Times for the standard Berkeley 'C' compiler (pcc) on a VAX 785 running
BSD4.3 are given here just for reference, since this is a very well known
machine/OS/compiler combination. Remember, only the compilers are being
compared.




VAX 785 BSD4.3 reference times

----------------+-----------------+-----------------------------------------+
Benchmark name  | Time (millisec) | Exact benchmark parameters              |
----------------+-----------------+-----------------------------------------+
Ackermann       |           33116 | Ackermann(3,6) = 509                    |
Puzzle          |           94233 | 10 loop iterations, size=511            |
Quicksort       |            9133 | 10 iterations, 1000 long integers array |
Sieve           |           16916 | 100 iterations, SIZE=8192               |
C Whetstone     |            1783 | 10 iterations (1 Million Whetstones)    |
----------------+-----------------+-----------------------------------------+
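
For readers who want to check the Ackermann parameter line, a minimal C
sketch of the usual two-argument recursive definition follows. This is an
assumption about the general shape of the benchmark, not the actual
benchmark source, which is not reproduced in this posting.

```c
/* Two-argument Ackermann function in its textbook recursive form.
   Almost all of the run time goes into procedure calls and returns. */
int ackermann(int m, int n)
{
    if (m == 0)
        return n + 1;
    if (n == 0)
        return ackermann(m - 1, 1);
    return ackermann(m - 1, ackermann(m, n - 1));
}
```

Calling ackermann(3, 6) returns 509, which matches the value quoted in the
parameter column.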


-------------------- Compiler Comparison results ----------------------


All times are in milliseconds.


Set 1:
Machine - OPUS 32332 (15 MHz)
O.S. - UNIX SYS-5.3


The OPUS machine is an add-on board, manufactured by OPUS Systems, for
IBM XT/AT compatibles.


The optimizing compiler from Green Hills, denoted 'GH' below, is:
C-32000 1.8.1(C) Copyright (c)1985,1986 Green Hills Software, Inc.


pcc1 is the Standard compiler supplied by OPUS with the machine.


----------------+-----------------------------+-----------------------+
                |          Compiler           |    Runtime ratios     |
Benchmark name  |   GH    |  pcc1   | GNX/CTP |  GH/CTP  |  pcc1/CTP  |
----------------+---------+---------+---------+----------+------------+
Ackermann       |   16800 |   16166 |   14700 |   1.14   | 1.10       |
Puzzle          |   38000 |   85233 |   35000 |   1.09   | 2.44 (!)   |
Quicksort       |   15766 |   10866 |    7966 |   1.98   | 1.36       |
Sieve           |   10400 |   20483 |    8400 |   1.24   | 2.45 (!)   |
C Whetstone     |    1950 |    1966 |    1750 |   1.11   | 1.12       |
----------------+---------+---------+---------+----------+------------+
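
The ratio columns are the other compiler's run time divided by the CTP run
time, so any value above 1.00 means the CTP-compiled program was faster. A
minimal C sketch of that arithmetic, using the Set 1 times from the table
(the struct and function names are illustrative only):

```c
/* One row of the Set 1 table; times in milliseconds. */
struct row {
    long gh;
    long pcc1;
    long ctp;
};

/* Rows in the same order as the table, from the first benchmark to
   C Whetstone. */
static const struct row set1[5] = {
    { 16800, 16166, 14700 },
    { 38000, 85233, 35000 },
    { 15766, 10866,  7966 },
    { 10400, 20483,  8400 },
    {  1950,  1966,  1750 },
};

/* Run-time ratio: how many times longer the other compiler's code ran
   than the CTP-compiled code. */
double runtime_ratio(long other_ms, long ctp_ms)
{
    return (double)other_ms / (double)ctp_ms;
}
```

For example, runtime_ratio(set1[1].pcc1, set1[1].ctp) is 85233 / 35000, or
about 2.44, the pcc1/CTP figure for Puzzle.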


--- Notes:


In the Whetstone benchmark the CTP gain is only from the user code: the
mathematical library used was the same for both compilers. I would expect a
greater improvement if the two compilers were used on the math library sources.


With the Green Hills compiler the '-O2' optimization (full optimization)
was used instead of '-O'.


The Green Hills compiler is a good optimizing compiler. Inspecting the
generated assembly shows that CTP performs better machine-specific
optimizations, e.g. multiplying by a constant using 'addr' instructions.
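
The 'addr' remark describes a form of strength reduction: a multiplication
by a small constant is rewritten as a shift and an add, which an
address-computation instruction with a scaled index can presumably perform
in one step. The C sketch below only models the arithmetic. It is not CTP
output, and the function names are hypothetical; the exact instruction
sequences CTP emits are not shown in this posting.

```c
/* x * 5 computed as x + (x << 2): a base plus an index scaled by 4,
   the shape an effective-address calculation can produce. */
unsigned long times5(unsigned long x)
{
    return x + (x << 2);
}

/* x * 9 computed as x + (x << 3): a base plus an index scaled by 8. */
unsigned long times9(unsigned long x)
{
    return x + (x << 3);
}
```

Unsigned operands are used here so that the shifts are well defined in C for
every input.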


Set 2:
Machine - Sequent Balance 32032
O.S. - Dynix (Sequent's BSD4.2 variant)


pcc2 is the standard compiler supplied by Sequent with the machine.
All times are in milliseconds.


----------------+---------------------+-----------------+
                |      Compiler       |                 |
Benchmark name  |  pcc2   |  GNX/CTP  | pcc2/CTP ratio  |
----------------+---------+-----------+-----------------+
Ackermann       |   39000 |     30850 |    1.26         |
Puzzle          |  177333 |     64883 |    2.73 (!)     |
Quicksort       |   17783 |     12316 |    1.44         |
Sieve           |   44233 |     16633 |    2.65 (!)     |
C Whetstone     |    3783 |      3183 |    1.19         |
----------------+---------+-----------+-----------------+


--- Notes:


Again, in the Whetstone program a common math library was used, so only the
user code was actually benchmarked.


As a product, CTP supports only COFF (Common Object File Format), which
is the AT&T UNIX standard, and not the Berkeley a.out format. The version
tested here is an in-house version with no debugging support, which is not
an official product.


---- The future is still more promising...


Comparing CTP times on both processors, results for the new NS32532 running
at 30 MHz are typically 5-6 times faster than for the NS32332 (e.g.
Sieve 1316 millisec. [6.38x], Ackermann 2854 millisec. [5.15x]). Traditional
compilers produce the same code for the 32532 and for other members of the
NS32000 family. CTP can generate code that is specifically tuned for each
CPU/FPU/BUS-WIDTH combination. Thus I expect the performance ratios between
older compilers and CTP to increase as newer NS32000 hardware appears on
the horizon.


VAX, VAX-VMS are trademarks of Digital Equipment Corp.
Sequent, Balance, Dynix are trademarks of Sequent Computer Systems
OPUS is a trademark of OPUS Systems
IBM XT and AT are trademarks of IBM
UNIX is a trademark of AT&T
--
Ariel Faigon, CTP group
National Semiconductor (Israel)
6 Maskit st. P.O.B. 3007, Hertzlia 46104, Israel. Tel. (972)52-522312
arielf%taux01@nsc.com @{hplabs,pyramid,sun,decwrl} 34 48 E / 32 10 N
[I would be more impressed with results on larger programs, since results on
these toy test programs can be heavily biased by compiler differences that
make little difference on larger, more realistic programs. For example,
compile troff with different compilers and compare performance on a 30-page
-mm document. -John]