|Re: Aggressive optimization firstname.lastname@example.org (1990-10-18)|
|Re: Aggressive optimization email@example.com (1990-10-19)|
|Re: Aggressive optimization firstname.lastname@example.org (1990-10-20)|
|Re: Aggressive optimization baxter@zola.ICS.UCI.EDU (Ira Baxter) (1990-10-23)|
|Re: Aggressive optimization email@example.com (1990-10-23)|
|From:||firstname.lastname@example.org (James C Burley)|
|In-Reply-To:||email@example.com's message of 18 Oct 90 17:18:00 GMT|
|References:||<1458@exodus.Eng.Sun.COM> <13405:Oct1800:22:firstname.lastname@example.org> <email@example.com> <1990Oct18.firstname.lastname@example.org>|
|Date:||20 Oct 90 08:16:29|
In article <1990Oct18.email@example.com> firstname.lastname@example.org (Dan Bernstein) writes:
Does anyone have a compiler that can introduce a non-linear intermediate
expression and reduce around it? If so, I'm impressed. How advanced is
the symbolic algebra system included in the compiler? Can it take
advantage of range constraints, so that if it knows that x is between 0
and 7, it might set up a table to calculate ((1<<(x+2))-1)/10 quickly?
Can it manipulate floor and ceiling expressions? Can it introduce
invariants to figure out range constraints? These are all part of that
single, fundamental optimization technique.
I know, compilers are improving. Some optimizers fully automate the
dullest, most machine-dependent part of optimization---viz., figuring
out how often loops or branches are executed in practice, seeing how
long machine instructions take, and deciding on that basis whether a
given optimization is worthwhile. I really appreciate that. I won't stop
hand optimization because of it.
I'm glad to see someone remind us of these issues. Machine-implemented
optimizations are desirable for handling cases we've decided are not only
straightforward to implement via a set of rules (however complicated the code
might get), but (and this point often is overlooked) take all their "inputs"
from the source code ALONE (though some aggressive optimizations also take
"input" from the "omniscient", and sometimes incorrect, person who wrote the
optimization, as in "assume this particular case never happens, even though
it is valid in the language", and which typically can be selectively disabled
via compiler switches).
For example, as Dan suggests above, a compiler that performs range analysis
on data sets (not even just variables, but things like "X never exceeds Y by
more than 50 and is never less than Y" by looking at the source code) is a
desirable thing indeed, but even if it existed it wouldn't be enough. Why?
Because in a case where X and Y are input from the outside (a file or the
user), there may be constraints on them that are not expressed in the source
code (and for which such expression is difficult or impossible in the source
language -- C's "assert" might be an example of how to do it). (Whether the
code should check for violations of the constraints and produce error
messages, however, is entirely a "user interface" issue, ultimately: consider
reading a file produced by a companion program, for example, where the
constraints may already have been checked. When compiling the program
reading the file, one cannot assume the compiler could also read the program
writing the file to derive the constraints: not only is it hard, but in the
real world, that program's source code may not be available! And it's
executable image may be unreadable also, so forget about disassembling!)
What I think is really needed is not only a compiler (here I refer to the
total package of translating, including linking and debugging) that can
recognize and deal with the kinds of optimizations Dan talks about, but that
is easy to interact with, converse with, and to which the programmer can
easily explain other constraints and point out other optimizations worth
I've been imagining such a beast for many years and hope to start actually
building one, so I've done some thinking about it. It would be very
important to make such interaction easy, so I think the interface would have
to be as much easier to deal with compare to, say, the Macintosh GUI as that
GUI is compared to JCL. That is, the interface must be capable of
understanding the concepts behind presenting information -- avoiding
overloading, having a variety of methods of presentation (graphical, text,
combinations, and refinements thereof) available and some ability to choose
the appropriate method for displaying a given set of data, and so on.
Some of the data it might display would include that traditionally thought
of, i.e. sample input data for the program and/or the various incarnations it
has as it goes through transformations within the program. But the compiler
would also need to be able to display various interpretations of the program
itself: a representation of the data flow through a given portion of code, a
picture of where various key data sets are kept in memory (global, local,
register; virtual memory, wired/locked memory, or fast static RAM for a
system that can exploit such differences; whatever), which implementation
model is chosen for the optimized version of a given module, and so on.
Various views of the same code/data could be selected by the programmer, but
more importantly, the programmer would be able to interact with these views
to refine or annotate the information.
Refining the information might be useful when the optimizer picks a general
optimization but has at its disposal several refined versions of that
optimization, each of which goes faster than the general version. The
programmer could provide a direct suggestion that a particular refinement is
or should be valid, and the system, without having had to search all the
possible refinements itself, could verify (or trust) that the chosen
refinement is indeed correct.
Annotating the information is useful when the programmer wants to note that,
for example, "WORKARRAY should not be in global virtual memory" while working
through the allocation views of the data of the program, and this annotation
is forwarded to other views (such as the source code) so, later, the
programmer can easily find operations on the pertinent item that might
explain how it ended up that way. The annotation might even be active (vs.
a passive "comment") in that it might "yell" at the programmer when a change
he/she makes satisfies its constraint: for example, after making a change, a
message might be sent saying "Congratulations! WORKARRAY is now in fast
Further, this system would keep information on the program in a persistent
fashion so the information would not have to be entered again and again. To
do that, it would be best for the system to serve as the "source editor", and
be capable of recognizing when its current categorizations of a given chunk
of code (at whatever level of granularity) were no longer applicable due to
changes in the code, and forwarding information on changes to other chunks of
code that held a "vested interest" in the changed code.
Finally, this system would not spring forward whole from someone's forehead;
it would have to have an underlying kernel language on which the system as a
whole could be incrementally built. In fact, I don't envision the kernel as
a compiler or anything in particular; it would more resemble something like a
hybrid of EMACS and Smalltalk, but very stripped down and carefully planned.
A library of often-used components involving compilation, optimization,
editing, and such would become ubiquitous just as has happened with those two
systems, and this library would grow. But many programmers would wish to,
and could easily, extend the system as desired to take into account their own
special needs (target hardware, programming languages, and so on). All of
the source code of the system should thus be made available to anyone who
wants it (a la the Free Software Foundation's GNU Public License).
The output of the system? Whatever is needed: assembly, C, C++, are
examples, but I don't see any limitations here. In fact, one could possibly
output final executables when doing global compilation and optimization,
meaning that some pretty neat optimizations some of us wish we could stick in
the linker we're using would be doable, given that we'd "rewritten" the
linker to run under this proposed system.
I think this approach may be one of the more fundamental changes in
optimization technology over the next decade or so: not because it introduces
any sexy new optimizations per se, but because it permits programmers to
easily pick apart their own creations and put them back together, without
having to rewrite the source code if desired. Such a system might allow the
industry to move back to powerful languages that "hide" potent but often
expensive operations in innocuous-looking constructs (a problem with
lanaguges like PL/I that C lovers like to point out, and which is valid on a
practical level), since the expense would be easily discoverable during the
"optimization discovery" process offered by the proposed system.
I won't get into how such a system might also provide an easier means for
debugging systems (even in "optimized" mode), writing source code (harking
back to the "what if we had a language of languages?" thread we had earlier),
or improving verifiability. As with optimization, the system itself would
break no new ground in these areas, but should provide a more fertile field
for implementing and discovering improvements in them.
After I get through with GNU Fortran, I'd like to start trying to build such
a system, so I hope that if anyone knows why it can't work, they'll let me
know before then! (-: (Seriously, either this is a great idea that I have to
do someday, or there's something for me to learn from its failure, either way
I decided pursuing this and other ideas was more important than drawing a
full-time paycheck and working on the same old thing, which is why I left my
job over a year ago...so is this idea worth pursuing as one of my next
projects? After watching the newsgroups for a while, this seems to be a good
one to ask!)
James Craig Burley, Software Craftsperson email@example.com
Return to the
Search the comp.compilers archives again.