Re: Effectiveness of compilers today

burley@apple-gunkies.gnu.ai.mit.edu (Craig Burley)
Wed, 17 Feb 1993 18:37:59 GMT

          From comp.compilers

Related articles
Effectiveness of compilers today andreasa@dhhalden.no (1993-02-14)
Re: Effectiveness of compilers today pardo@cs.washington.edu (1993-02-16)
Effectiveness of compilers today kanze@us-es.sel.de (1993-02-17)
Re: Effectiveness of compilers today jpab+@andrew.cmu.edu (Josh N. Pritikin) (1993-02-17)
Re: Effectiveness of compilers today burley@apple-gunkies.gnu.ai.mit.edu (1993-02-17)
Re: Effectiveness of compilers today jbuck@forney.berkeley.edu (1993-02-17)
Re: Effectiveness of compilers today napi@cs.indiana.edu (mohd hanafiah abdullah) (1993-02-17)
Re: Effectiveness of compilers today moss@cs.cmu.edu (1993-02-18)
Re: Effectiveness of compilers today preston@dawn.cs.rice.edu (1993-02-18)
Re: Effectiveness of compilers today roth@helena.cs.rice.edu (1993-02-18)
Re: Effectiveness of compilers today pardo@cs.washington.edu (1993-02-19)
[7 later articles]

Newsgroups: comp.compilers
From: burley@apple-gunkies.gnu.ai.mit.edu (Craig Burley)
Keywords: optimize, design
Organization: Free Software Foundation 545 Tech Square Cambridge, MA 02139
References: 93-02-082
Date: Wed, 17 Feb 1993 18:37:59 GMT

andreasa@dhhalden.no (ANDREAS ARFF) writes:


      Could someone tell me more about this? Are there any tests between common
      compilers and assembler programmers? Who is the best?


I have a general observation about the issue of the quality of code
produced by compilers versus humans, which is this:


Compilers are at a disadvantage in most cases because they are not given
all the information we humans have at our disposal, and they are bound by
restrictions that we humans can selectively ignore in cases where the
compilers can't.


Note that, in one sense, this statement could even extend from "compilers"
to "assemblers", but in another sense, let's assume that the human has, via
the assembler, complete control of and access to the underlying target (a
UNIX executable, an MS-DOS .EXE file, whatever). This does not always seem
to be the case with some assembler/linker toolsets (one can't always
specify exactly what alignment is needed, as in "make sure this code starts
at an address such that dividing its byte offset by 16 yields a remainder
of 3"), but one can presumably always get around that restriction by
writing the entire application as a single monolithic assembler source
chunk, even using explicit octal/hex coding in place of mnemonic
instructions where necessary. I.e., for the purposes of this article,
assembler is to machine-code target as a UNIX text editor is to a UNIX
executable -- anything you need in an executable you should be able to
produce via the text editor (if it's good), and, similarly, any
machine-code target (executable or .EXE file) you should be able to
produce (with varying degrees of effort and elegance) via the assembler.


So, back to compilers.


The languages we use are a major part of the problem, but not all of it --
compiler technology accounts for much of the rest, and library and O/S
interfaces and implementations account for a small portion.


A simple example of the language problem is C's enum:


enum { RED, GREEN, BLUE } color;


Some programmers typing this mean "I need three constants with distinct
values named RED, GREEN, and BLUE"; others mean "I need three constants
named RED, GREEN, and BLUE with the values 0, 1, and 2, respectively".


C compilers must (generally speaking) assume the latter is meant, when in
fact in many cases only the former (the weaker statement) is.


The difference from an optimization point of view is that the way the
constants might be used in a tight loop might well affect the values a
compiler assigns to them. A human might decide to give them the values 0,
4, and 8, or even 0, 256, and 512, by observing that a few cycles are
saved by not having to multiply a variable of "enum color" type by some
factor to get a jump-table offset to some code or data. A C compiler is
not likely to have this ability (safely) unless it is globally optimizing
across all program modules.
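
For instance, here is a minimal sketch, with a purely hypothetical
256-entry-per-color table, of the kind of hand scaling a human might do:


enum color { RED = 0, GREEN = 256, BLUE = 512 };

static unsigned char palette[3 * 256];  /* one 256-entry row per color */

unsigned char
color_entry (enum color c, int i)
{
    /* The scale factor is baked into the constants, so finding
       the row needs no multiply: just an add. */
    return palette[c + i];
}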


A sample solution would be a language construct that does literally mean
"I don't care about the values as long as they're distinct", but in the
context of C, it would not represent enough of an improvement to counter
the "noise factor" it would introduce to the language. That is, few
programs would benefit from it, few programmers would use it, but many
people would be put off by having to learn yet another construct in what
is supposed to be a small language (see C++).


The "global optimization" approach -- optimizing the entire program rather
than just portions of it -- provides a machine-assisted way to get around
many such language barriers.


However, that approach has two faults I can think of when it comes to
trying to match or exceed human potential:


        1. In trying to get around missing modes of programmer expression,
                i.e. in trying to reconstruct knowledge already in the
                programmer's head (such as "I don't really care what the
                actual values are for RED, GREEN, and BLUE") by doing a
                full analysis of the program, the global optimizer spends
                what often is too much time doing basic analysis and not
                really optimizing, and it can _still_ fail to reconstruct
                that knowledge.


        2. All useful programs do some kind of I/O, but even global
                optimization does not provide for complete information on
                the formats of files accessed and their constraints.


An example of both problems is, what if the code opens a file and writes
and reads variables of type "enum color" to it? Does that mean the
programmer really wants them to be 0, 1, or 2, or can the compiler
substitute 0, 256, and 512 there as well as for memory-resident copies?
The answer is: let's ask the human compiler. If that human (as the
programmer) knows the file is only a temporary scratchpad, then the
arbitrary values are fine. If it is a permanent data base, then the 0, 1,
and 2 values must be used and, if appropriate, translated accordingly (if
it is still optimal to do so).
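
As a minimal sketch of the problem (the routine and file layout are
hypothetical), consider a program that saves a color to a file:


#include <stdio.h>

enum color { RED, GREEN, BLUE };  /* 0, 1, 2 by default */

int
save_color (FILE *f, enum color c)
{
    int v = c;

    /* Once values like these land in a file that outlives the
       program, a compiler is no longer free to renumber the enum
       without silently corrupting old data. */
    return fwrite (&v, sizeof v, 1, f) == 1 ? 0 : -1;
}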


However, a human also can be aware that the file is a permanent data base
but one that is created and maintained by the program under construction,
so the values are in flux but about to become permanent. And, the human
also can realize that whatever values are best for the running time of the
program are what should be in the file. And that the format of the file
isn't _yet_ fixed but might be down the road. And when and how to trash
(or convert) the file when the optimal values change due to changes the
programmer makes in the code. And when the file has become fixed (the
program has been put into production use, so file conversion is no longer
a valid way to deal with "more optimal" values), the human knows to no
longer change the file-resident values for "enum color" even if the
program gets to where it wants more-optimal values in memory.


How are these things communicated to a compiler? I don't know of any
language that permits it in a way that approaches the convenience needed
to make such optimizations more feasible via an HLL than coding most or
all of the program in assembler in the first place. (I do have lots of
ideas _how_ to make this sufficiently convenient, but they're way beyond
the scope of this post.)


Of course, the C programmer can play with all these things while mucking
around with "enum { RED = 0, GREEN = 256, BLUE = 512 } color;", and that's
what a lot of people use C for anyway, but remember, we're talking about
the difference between a compiler (such as for C) and a human doing the
compilation. In the C-compiler case, unless the programmer constantly
looks at the assembler output from the compiler and notices places where
different constants could improve performance, the programmer won't know
to make these changes, whereas the human compiler can see that while
coding (while optimizing).
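
(With most UNIX C compilers, something like "cc -S file.c" dumps the
generated assembler into file.s for exactly this kind of inspection,
though the flag name is common rather than universal.)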


"enum color" is a rather weak example, seeing as there are other missing
language facilities that affect typical program optimization to a much
greater degree, but it is a simple one. And, in the right application, it
could have a huge impact -- even something like a 2x difference in how
fast graphics are displayed on a color screen, or more.


Note that speed isn't always the only issue. Sometimes space is.
Sometimes speed is the only overall issue, but given a machine's
architecture (such as the 486's cache design), space becomes a dominant
issue in optimizing certain stretches of code. And humans often know more
about when to make those tradeoffs than compilers do, especially when
things like run-time error reporting (comparatively rare) are included -- a
code path that inevitably leads to a diagnostic is not only less important
to optimize than an alternate path that doesn't, but should perhaps be
optimized for space rather than for speed so the alternate path runs
faster. (I've even written assembler code that lets the alternate path run
further before branching to the diagnostic path so the typical case runs
faster; there's no way, with today's languages, I could tell a compiler to
do this elegantly.)
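
As a minimal sketch of the idea (the check and the message are
hypothetical), moving the rare diagnostic out of line keeps the common
path short:


#include <stdio.h>
#include <stdlib.h>

/* Rare path: kept small and out of the way, optimized for space. */
static void
fail (const char *msg)
{
    fprintf (stderr, "fatal: %s\n", msg);
    exit (1);
}

int
divide (int a, int b)
{
    if (b == 0)
        fail ("division by zero");  /* rare */
    return a / b;                   /* common path stays short */
}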


A more classic example is this:


void
array_mult (float a[], float b[], float c[], int n)
{
    int i;

    for (i = 0; i < n; ++i)
        c[i] = a[i] * b[i];
}


Here is the above C code directly mapped to Fortran (not as a plug-in
replacement but as a programmer-level replacement):


            SUBROUTINE ARRAYMULT (A, B, C, N)
            INTEGER N
            REAL A(N), B(N), C(N)
            INTEGER I
C
            DO I=1, N
                  C(I) = A(I) * B(I)
            END DO
C
            END


With regard to human implementation in machine code, and assuming the
human has the full program available (i.e. globally optimizes), both
procedures can be compiled to exactly the same machine code, i.e. they are
identical. However, a C compiler cannot ever, on some architectures, make
the C version run as fast as a Fortran compiler can with the Fortran
version.


Why? Because the Fortran language includes, with the above code, the
assumption that the array C does not overlap A or B with respect to the
size specified via N during any invocation of ARRAYMULT. For example, the
compiled code could start off (if it wanted to do something this weird) by
zeroing C(1) through C(N), or it could save the results of the DO loop in
a temporary array and copy that into C only after the whole DO loop
finished, and, _by definition_, the code would have the same effect on the
running program. C, of course, places no such restrictions on the
programmer's intentions.
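
As a minimal sketch of what C must allow (the buffer and offsets are
arbitrary), somewhere in the program a caller may legally write:


float buf[10];

array_mult (buf, buf + 1, buf + 2, 8);  /* c overlaps both a and b */


so the compiler has to assume that stores to c[i] can feed later loads of
a[i] or b[i].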


Now, global optimization could help the C compiler determine that no
overlaps are involved in _some_ cases (and only after a potentially huge
amount of computation). But if indices are read in from a terminal or
file, it cannot, whereas Fortran happily goes along assuming no overlaps
ever occur. Period. (Of course, if indices are read in from a file, and
could potentially violate the assumption, then that is a bug in the
Fortran implementation of the program that the C implementation would not
have -- I'm assuming a "clean" file here, constructed so it meets the
requirements of the program.) The human programmer might well know that
the C version won't ever involve overlap, just as the Fortran version
doesn't, but the C compiler might not be able to intuit that.


This again highlights the "what is the context/format of data external to
the program?" question even global optimizers need to, and cannot always,
answer -- for the Fortran version, there's an implicit answer that the
compiler could conceivably use to an advantage beyond the no-overlap rules
mentioned, whereas for the C version, that implicit answer does not exist
(except in the programmer's head).


Again, there are plenty of other examples that can show a huge impact on
execution times (though the no-overlap one actually can have a huge impact
on some machines), but the above shows how even simple, behind-the-scenes
assumptions made by the language designers (and hence the compilers) but
not necessarily by the programmers (or with no way to easily encode those
assumptions in the languages) can lessen a compiler's ability to match
that of a human.


Ultimately, it's best to use the most expressive language you can find to
express your intentions (one that allows you to express as much of your
knowledge as possible in a concise, yet elegant, fashion), and remain
aware of how the compiler is likely to optimize your code so you can
decide to use assembler when and where appropriate. This ideal cannot
be realized with today's tools, but I think the industry will make a
significant lurch in that direction within the next 10 years, and I'm
working on helping that happen in my spare time (i.e. when I'm not writing
a Fortran compiler :-).
--
James Craig Burley, Software Craftsperson burley@gnu.ai.mit.edu