Re: behavior-preserving optimization in C, was compiler bugs

Nathaniel McIntosh <mcintosh@cup.hp.com>
Sat, 16 May 2009 08:39:28 +0100

          From comp.compilers

From: Nathaniel McIntosh <mcintosh@cup.hp.com>
Newsgroups: comp.compilers
Date: Sat, 16 May 2009 08:39:28 +0100
Organization: Compilers Central
References: 09-04-072 09-04-086 09-05-010 09-05-022 09-05-028 09-05-038 09-05-039 09-05-050 09-05-055 09-05-065 09-05-069
Keywords: optimize
Posted-Date: 18 May 2009 12:53:41 EDT

| An optimizer that breaks a program is a bad idea. There are
| apologists (of program-breaking optimizers) that claim that the
| program was already broken without the optimizer, because it does not
| conform to some language standard. But actually the program does
| conform with the language as it is implemented by the compiler without
| optimization and it behaves as intended by the programmer, so it is
| correct.


Anecdote from my own experience as a compiler writer: at one
point while working on an optimization to improve data cache behavior
for C/C++ programs, I ran into an application (let's call it
"Application T") containing the following code (greatly simplified):


        foo.c                   bar.c
        -----                   -----
        ...                     ...
        double x;               int x = 0;
        ...                     int garbage;
                                ...


Most C compilers will compile these two modules without complaint;
when you link foo.o and bar.o into an executable, the "strong"
(initialized) definition of "x" from bar.o is favored over the "weak"
(tentative) definition in foo.o, and you wind up with a final "x"
entity that is 4 bytes in size (assuming an ILP32 compilation model),
not 8 bytes.
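
The resolution rule described above can be sketched in a few lines of
C. This is a simplified model, not the code of any real linker: the
struct symbol record and the resolve and report functions are
hypothetical names invented for illustration, and the sizes are the
ILP32 ones from the example. It also shows where a size-mismatch
diagnostic could be issued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified symbol record; real object formats carry more. */
enum sym_kind { SYM_COMMON, SYM_DEFINED };

struct symbol {
    const char *name;    /* symbol name, e.g. "x" */
    const char *object;  /* object file the symbol came from */
    size_t size;         /* size in bytes recorded by the compiler */
    enum sym_kind kind;  /* tentative (common) or initialized definition */
};

/*
 * Resolve two same-named symbols: an initialized definition wins over a
 * tentative one. Sets *mismatch to 1 when the recorded sizes disagree.
 * Returns NULL when both are initialized definitions (a duplicate).
 */
static const struct symbol *resolve(const struct symbol *a,
                                    const struct symbol *b,
                                    int *mismatch)
{
    assert(strcmp(a->name, b->name) == 0);
    *mismatch = (a->size != b->size);
    if (a->kind == SYM_DEFINED && b->kind == SYM_DEFINED)
        return NULL;
    if (a->kind == SYM_DEFINED)
        return a;
    if (b->kind == SYM_DEFINED)
        return b;
    /* Two tentative definitions: keep the larger one. */
    return (a->size >= b->size) ? a : b;
}

/* Print the outcome, including the diagnostic the silent case lacks. */
static void report(const struct symbol *a, const struct symbol *b)
{
    int mismatch;
    const struct symbol *w = resolve(a, b, &mismatch);
    if (w == NULL) {
        printf("error: duplicate definition of %s\n", a->name);
        return;
    }
    printf("%s: using %zu-byte definition from %s\n",
           w->name, w->size, w->object);
    if (mismatch)
        printf("warning: size of %s differs between %s and %s\n",
               a->name, a->object, b->object);
}
```

Given an 8-byte tentative "x" from foo.o and a 4-byte initialized "x"
from bar.o, resolve picks the bar.o symbol and flags the mismatch; a
linker that stays silent at this point is what lets the bug through.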


In spite of the fact that foo.c contains functions that store 8-byte
values to "x", the program works without optimization because the
variable "garbage" (unused as it turned out) happens to be allocated
just after "x" in memory. If the optimizer plays with the storage
allocation such that some other critical variable appears just after
"x", then the application fails.
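
A minimal sketch of this effect, under stated assumptions: the two
structs below stand in for two possible layouts of the globals in
bar.o, "critical" is a hypothetical name for whatever variable the
program actually depends on, and int and double are assumed to be 4
and 8 bytes. The oversized store is modeled with memcpy into the
struct so that the sketch itself stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The sketch assumes the sizes used in the example. */
_Static_assert(sizeof(int) == 4 && sizeof(double) == 8,
               "sketch assumes 4-byte int and 8-byte double");

/* Two possible layouts of the same three 4-byte variables. */
struct layout_unoptimized { int x; int garbage; int critical; };
struct layout_reordered   { int x; int critical; int garbage; };

_Static_assert(sizeof(struct layout_unoptimized) == 12 &&
               sizeof(struct layout_reordered) == 12,
               "sketch assumes no padding between the ints");

/*
 * Mimic foo.c storing an 8-byte double at the address of the 4-byte x.
 * base points at the start of the struct, whose first member is x, so
 * the copy overwrites x and the 4 bytes that follow it.
 */
static void store_double_at_x(void *base, double value)
{
    memcpy(base, &value, sizeof value);
}

/* Returns 1 if "critical" keeps its value after the oversized store. */
static int survives_unoptimized(void)
{
    struct layout_unoptimized v = { 0, 0, 12345 };
    store_double_at_x(&v, 1.0);   /* spills into "garbage" */
    return v.critical == 12345;
}

static int survives_reordered(void)
{
    struct layout_reordered v = { 0, 12345, 0 };
    store_double_at_x(&v, 1.0);   /* spills into "critical" */
    return v.critical == 12345;
}
```

With the first layout the stray bytes land in the unused variable and
nothing visibly goes wrong; with the second, the same store destroys
the value the program relies on.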


This is certainly a case where optimization perturbs program behavior,
and I can think of many other similar cases regarding variable
allocation, use of stack frames, etc. In this example, it is possible
for me to imagine designing/implementing the compiler so that there
would be no behavior change due to storage layout differences. For
example, you could simply pick a specific recipe for laying out all
global/static variables in the program (e.g. alphabetical order, or
perhaps the order in which the variables appear in the source file)
and hold religiously to that order, even if it causes extra padding to
be introduced between variables, and even if it eliminates any
possibility of storage layout optimizations.
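
The padding cost of a fixed recipe can be illustrated with a small
sketch. Everything here is hypothetical: struct var, layout_size, and
the two comparison functions are invented for illustration, and
sorting by descending alignment is just one simple layout optimization
chosen for comparison, not something the example compiler is known to
do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of one global variable. */
struct var {
    const char *name;
    size_t size;   /* bytes occupied */
    size_t align;  /* required alignment, a power of two */
};

/* Assign offsets in array order; returns total bytes including padding. */
static size_t layout_size(const struct var *vars, size_t n)
{
    size_t offset = 0;
    for (size_t i = 0; i < n; i++) {
        size_t a = vars[i].align;
        offset = (offset + a - 1) & ~(a - 1); /* round up to alignment */
        offset += vars[i].size;
    }
    return offset;
}

/* Fixed recipe: alphabetical order by name. */
static int by_name(const void *p, const void *q)
{
    const struct var *a = p, *b = q;
    return strcmp(a->name, b->name);
}

/* Layout optimization: most strictly aligned variables first. */
static int by_align_desc(const void *p, const void *q)
{
    const struct var *a = p, *b = q;
    return (a->align < b->align) - (a->align > b->align);
}
```

For two 1-byte variables "a" and "c" and two 8-byte, 8-aligned
variables "b" and "d", sorting with by_name gives a layout_size of 32
bytes, while sorting with by_align_desc gives 18. The fixed order buys
predictability at the price of 14 bytes of padding.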


I would in fact argue, however, that the compiler is doing the
programmer a favor by causing the program to crash when compiled with
optimization. My reasoning: while there is some pain up front for the
application writer (he or she has to track down the reason for the
crash when optimization is turned on), the good news is that the
programming error is caught early (e.g. in the lab, during the
compiler evaluation process), as opposed to "lurking" in the code and
turning up only later at some much less pleasant time. The point is
that programming errors of this sort can be converted from latent
problems into actual crashes just as easily by source code
modifications as by optimization.


For example, suppose that the "Application T" developers are using a
compiler that slavishly sticks to the same variable layout regardless
of optimization level, so as to avoid "behavior changes" as a result
of optimization. The buggy code doesn't cause a failure, so
"Application T" is run through QC testing and is shipped out to
customers. Months later a critical bug is filed by an important
customer; programmers rush to make a fix. An engineer determines that
the problem is in module "bar.c", and so makes a change to that module
that happens to slightly reorder the variables (including our bad
variable "x"). The test suite is run to verify the bugfix, and lo and
behold, the application fails in some mysterious way that seems to
have nothing to do with the change. Now the hapless engineer has to
track down the (totally unrelated) bug while the clock ticks and
everyone waits for the critical fix. Would things not have been much
better if the engineer had learned about the bug many months ago when
optimization was turned on? :-)


I think most programmers given a choice would prefer to find out about
programming errors such as these as soon as possible; a compiler that
covers them up (in some sense) only postpones the pain rather than
eliminating it.


NM
[I'd rather use a linker that diagnoses type mismatches. It's not
exactly a new idea; I know one that did that in 1976. -John]


