Re: Optimization techniques

David Brown <>
Sun, 28 Apr 2019 23:49:53 +0200

          From comp.compilers


From: David Brown <>
Newsgroups: comp.compilers
Date: Sun, 28 Apr 2019 23:49:53 +0200
Organization: A noiseless patient Spider
References: <> 19-04-021 19-04-023
Keywords: errors, optimize
Posted-Date: 28 Apr 2019 18:34:48 EDT
Content-Language: en-GB

On 26/04/2019 02:18, Kaz Kylheku wrote:
> On 2019-04-25, David Brown <> wrote:
>> On 25/04/2019 17:36, Martin Ward wrote:
>>> David Brown <> wrote:
>>>> And there are undefined behaviours which could be given definitions, but
>>>> doing so is not actually a positive thing.  The prime example is signed
>>>> integer overflow.  Since overflowing your integers is almost always an
>>>> error in the code anyway, there are no benefits in giving it defined
>>>> behaviour.
>>> This is completely backwards. If signed overflow was given a defined
>>> behaviour (such as the two's complement result), then compilers for
>>> CPUs which do not implement two's complement operations
>>> would have to generate less efficient code (but does anyone still
>>> make such a CPU?).
>> True - but as you say, such cpus are not realistic.
>>> Any program which guarantees not to overflow would be unaffected.
>> Incorrect. Its correctness would be unaffected, but efficiency can be
>> quite a bit different.
>> The compiler knows all sorts of things when signed integer arithmetic
>> does not overflow, and can use this for optimisation.
>> It knows that "x * 2 / 2" is "x". It knows that "x + 1 > x" is true.
> Problem is, all these propositions are not safe; they are based on
> "the program has no mistake".

/All/ programming is based on the principle of "the program has no
mistake". It is absurd to single out something like this as though it
is a special case.

If I want to double a number, and write "x * 3" by mistake, it is a bug.
If I do arithmetic on signed integers, and they overflow, it is a bug.
The compiler is allowed to assume that when I write "x * 3", I mean
what the C language means - multiply x by 3. It is also allowed to
assume that when I write some signed arithmetic, I mean what the C
language means - give me the correct results when I give valid inputs,
otherwise it is "garbage in, garbage out".

Ask yourself, when would "x * 2 / 2" /not/ be equal to x, given two's
complement wrapping overflows? It would happen when "x * 2" overflows.
For example (assuming 32-bit ints), take x = 1,500,000,000. When you
write "x * 2 / 2" with an optimising C compiler, the result is most
likely 1,500,000,000 (but there are no guarantees). When you use a
"two's complement signed overflow" compiler, you get -647,483,648. Tell
me, in what world is that a correct answer? Why do you think a
guaranteed wrong and meaningless answer is better than undefined
behaviour? Instead of the compiler generating faster code for valid use,
and debug tools being able to spot your error, you have slower code with
a guaranteed error and no way for tools to help you.

Ask yourself, when would "x + 1 > x" not be true? When x is INT_MAX and
you have wrapping, x + 1 is INT_MIN. Ask yourself, is that the clearest
and best way to check for that condition - rather than writing "x ==
INT_MAX" ? When does it make sense to take a large positive integer,
add 1, and get a large /negative/ integer? It is very rare that this
might be the right answer - occasionally you need wrapping counters,
timers, etc. But generally, it is simply wrong and your code will do
something stupid and unexpected if it happens.

My preference is to write code in a clear and sensible manner, and have
the compiler generate as efficient object code as it can from that
source. I really don't appreciate a compiler that has limited
optimisations just to give the "right" result for smart-arse code
written by someone who thinks it is cool to use obscure expressions
relying on questionable assumptions. (On the other hand, I am glad that
compilers like gcc offer an escape clause like -fwrapv, for people who
have to deal with this kind of code.)

> According to the original design of C, the programmer is supposed to
> optimize for this kind of thing.

Since the first standard, I believe C has had the "as if" rule. And the
first standard was written to standardise existing practice. Optimising
compilers are not new.

> The compiler should only micro-optimize: deal with the code-generation
> issues: eliminate wasteful instructions it has introduced itself via
> peephole scanning, thread jumps, do some constant folding, unroll loops,
> inlining, that sort of thing.

I disagree. I have heard this argued many times, and have seen no good
justification for it. At best, any authoritative sources I have seen
mentioned are /old/ - pre-C90, and usually pre-K&R. Perhaps it is not
unreasonable to claim that originally, C compilers could be expected to
be dumb translators - but it is certainly not the case now, and has not
been the case for the last 25-30 years. The time to argue that C
"should" do arithmetic in the manner of the underlying hardware was when
ANSI started working on their standard - it is /way/ too late now.

> Think about it: why is it okay for assembly language instruction
> sets to have defined overflow? Why don't we worry that, "Gee, if
> the add instruction has defined overflow, then the assembler
> can't deduce that 'cmp r1, r2' which follows it is always false, and
> cannot remove that instruction, as well as the whole basic block of code
> it controls."

Do you know /why/ signed addition on almost all cpus has wrapping two's
complement semantics? Do you think it is because the cpu designers
thought it would be useful whenever people add two numbers? No. The
primary reason is because it is simpler, cheaper and faster to make it
work that way - much more so than doing something useful for the
programmer, such as saturating or trapping. With a wrapping add, you
can use the same hardware for addition, subtraction, and use the same
instructions for doing multi-precision arithmetic. Wrapping signed
arithmetic for signed integers is not a planned or intentional feature,
it is merely the easiest implementation. (And you need something like
that to get multi-precision arithmetic - a feature mostly obsoleted by
modern cpus.)

Assembly is intended to be a low-level language. Some assemblers do a
little bit of optimisation, but mostly it is a language designed to give
the programmer precise control - and full responsibility - over their
code. C is a high level language - it is not an assembler. It is
defined in terms of its semantics, not the implementation.

> Figuring out that cmp r1, r2 is always false for all the inputs that can
> ever occur, and blowing away everything that is behind it, is the
> programmer's responsibility.

Yes. And high-level languages, like C, are designed to relieve the
programmer of such responsibility. (With higher languages taking more
responsibility, and control, away from the programmer.) C is one of the
lowest-level high-level languages, but it is a high-level language.

>> It knows that "for (i = 0; i <= n; i++)" will always end, after n + 1
>> iterations - after checking for n being <= 0 at the start of the loop.
> But the loop will not end if n is INT_MAX. It will predictably not end
> if i++ is well-defined behavior, and it is not required to end if i++ is
> undefined behavior.

Read the C standards. It is assumed to end (depending on what is inside
the loop). This is stated clearly and explicitly in the standards -
6.8.5p6, along with footnote 157 saying that the rule is to aid
implementations optimisations.

The authors of the C standards were not stupid. They did not do a
perfect job, of course, but they did rather a good job. They
represented compiler development groups, user groups, programmers,
academics, and other interested parties. They did put emphasis on the
behaviours of existing tools, and even more emphasis on the expected
behaviour of existing code. And they wrote, explicitly, that loops can
be assumed to end if they satisfy certain conditions.

> All we usefully know is that, if the loop ends in a well-defined way,
> then it must be that n < INT_MAX and, after the loop, i >= n.


> We know this regardless of whether i++ is always defined or not.


And if the pre-conditions don't hold, we don't know anything. So if the
loop does not end, or does not end in a well-defined way, we know
nothing about n, i, or anything else.

>> These are significant. With modern compilers doing all sorts of
>> inlining, function cloning, cross-module optimisation (link-time
>> optimisation)
> Link-time optimizations can be readily shown to violate the "phases of
> translation" abstract model given in the C standard; for that reason,
> they are not enabled by default.

I have not heard that they cause any such trouble. (That doesn't mean
it is not the case - there are many things I have not heard!) The
principle of optimisations is that they do not change the observable
behaviour of correct code, as defined by the C standards, so this would
surprise me. I think it is much more believable that LTO breaks some
more of the invalid assumptions people often make, such as the idea that
calling a function in a different translation unit enforces certain
extra ordering or constraints in the code.

LTO is not enabled by default by most compilers (there are exceptions -
compilers for some small cpus have been using LTO for decades) for three
main reasons. One is that it can significantly increase the compile
times. Another is that it can make debugging very difficult. And it
can be harder to make sure the tools don't remove more code than the
developer wanted, especially if there are more code entry points than
just main(). (Again, in the embedded world, this can include interrupt
handlers.)

> In the C language, the translated units of a program that are to be
> linked to form the program have undergone semantic analysis.
> So no further optimization is permissible; it constitutes semantic
> analysis. Translation unit boundaries should be regarded as inviolable
> optimization barriers.

Nonsense. There is no basis for that at all - excluding unwarranted
assumptions by many programmers.

>> , and with modern C++ code bases using templates, there is
>> a great deal of scope for this kind of manipulation and optimisation.
>> People don't write "x * 2 / 2" directly, but after inlining, constant
>> propagation, constants, macros, etc., these sorts of things turn up.
> That's the most dangerous situation to be making these assumptions.
> If * 2 came from one macro, and / 2 from another, maybe it wasn't meant
> to be optimized.

If you wanted to write something else in your code, you should have
written it. If your code says "x * 2 / 2" with signed integer x, the
compiler can treat it as "x" (plus the extra knowledge that x can't be
larger than INT_MAX / 2). It doesn't matter how that came about in the code.

> Programmers who write labyrinths of macros and templates and whatnot in
> medium-level languages like C and C++ should be punished by poor
> performance.

Again, nonsense. People who write code in a flexible and maintainable
way, in the knowledge that the compiler will handle details, should get
the best code. People who think their compiler is a dumb translator
should use compiler flags to tell it to be a dumb translator, and let
the rest of us use real tools to do real work.

>> The other big thing about making overflow undefined is that the compiler
>> and tools can help check for mistakes. Compilers can spot cases where
>> there are overflows with compile-time constants, and tools like
>> sanitizers can add run-time checks to help debug code.
> Run-time checks for overflow can be had even when overflow is
> well-defined.

Fair enough. But it is unlikely that you can conveniently get them
automatically from tools in debug modes, as you can for many types of
invalid or undefined behaviour checks.

> Assembly language instruction sets have well-defined overflow *and*
> ways to detect it (flags, exceptions and whatnot).

Some do, yes.

>> So when your
>> program has counted up 2147483647 apples, and you add another one to the
>> cart, your debugging tools can tell you you have made a mistake. If you
>> are using wrapping integers, the tools have to believe you meant that
>> adding another apple gives you -2147483648. Your code has a bug, but
>> the tools can't help you out.
> That is simply false. We can diagnose this situation anyway.
> If that results in millions of false positives, then maybe those
> developers should rethink their abuse of wrap-around arithmetic.
> Tools can provide configurations to deal with floods of false positives.
> Diagnostic tools should never be limited to what is defined or undefined
> at the language level. They should ideally be customizeable.


> Well-defined conditions can be wrong in the given program.
> E.g. "I'd like to trap when a value greater than 42 is written into the
> variable x, even though it's well-defined to do so."

Sure. But a tool can't apply that sort of check automatically - it
needs a way to know that you consider overflowing from 42 to be an error
at that point in the code. That means manual intervention and code,
such as an "assert". Since the language says that /all/ signed
arithmetic overflows are errors, it's easy for the compiler to add
checks for these without the programmer having to add checks manually.

> Just because something is made well-defined in the language doesn't mean
> that it's blessed as a recommended practice; it's just "decriminalized".

Agreed. It's good to hear you say that - many people who argue for
making signed integer overflow defined behaviour seem to think that it
becomes a good idea, because it is defined.

> The purpose of making it well-defined isn't that programmers "go to
> town" wrapping 2147483647 to -2147483648 like it's going out of style.
> The purpose is mainly to make the software predictable. When this mistake
> happens, it will happen in the same way, with consequences deducible
> from the abstract language semantics rather than "what version V of
> compiler X does on platform Y".
> Reproducibility first, then speed.

/Correctness/ first - then speed. Reproducibility of incorrect code
does not matter, except in how it sometimes aids identifying and
correcting bugs. Reproducibility of correct code does not matter
either, except that often there is only one set of correct behaviour.

>> It is a serious mistake to mix up "defined behaviour" and "correct
>> behaviour". Only defined behaviour can be correct, but you can't fix
>> incorrect code by making it defined behaviour.
> Of course! Because: incorrect programs can be written in a language in
> which no behavior is left undefined at all.
> I have bignum integers and rational numbers in Common Lisp, and a great
> numeric tower, yet still produce a garbage program where (sin x)
> appears where (cos x) ought to have been written, or whatever.
> Correct is with respect to the program's specification.


But how do you know what the program's specification is? The answer is
that it is the source code, interpreted according to the language
definition (augmented by implementation-dependent features, and any
implementation extensions or additional definitions - which may, of
course, include a definition of signed integer overflow behaviour). If
the source code does not make sense according to the language
specification, then the program has no meaningful specification - it is
nonsense code.

This is exactly the same as mathematics. The "square root" function,
for real numbers, is defined for non-negative numbers. If someone asks
you for "sqrt(-4)", then the question is meaningless. Any answer given
can be considered incorrect - equally, any answer given can be
considered correct.

Incorrect inputs to code are as wrong as incorrect code. In C, the
behaviour of signed integer addition is defined for all inputs that
don't overflow - giving it inputs that /do/ overflow is as wrong as
writing multiplication when you meant addition, or passing a negative
number to a square root function.

> Undefined can be correct. If the program's specification is
> "write a C program that dereferences a null pointer",
> then something like "*((char *) 0) = 42" might appear in
> that program. The program won't be "correct" if we delete that.

It won't be correct. Undefined behaviour can /never/ be correct. It is
- by definition - meaningless.

There is no way in standard C to write such a program. It could be
written with many C implementations, with the addition of a "volatile",
but not standard C.


> In C, "undefined behavior" is a large territory that encompasses
> documented extensions.

Sure. I have been careful to talk about things being undefined in
standard C, rather than any specific C implementation. For many C
implementations, most things that are undefined by standard C are still
undefined, but sometimes specific points are defined. And sometimes
there are options for defining additional behaviour (I've already
mentioned gcc's -fwrapv flag). There is no problem writing code that
relies on behaviour that is undefined by standard C but defined by the
implementation you are using - as long as you understand the code is not
portable standard C.

It is also fair to say that most real programs depend at least partly on
behaviour defined by an implementation but not by the C standards.

However, if the code's behaviour is not defined by the C standards or
the implementation, then it is not defined behaviour.

> It only means "not required by the ISO C standard". For instance, using
> a platform-specific function is "undefined behavior". However, the
> platform supplies a definition of behavior; the program isn't defined by
> ISO C, but by the platform, so according to ISO C, it is nevertheless
> undefined.
>>> With the current situation, anyone wanting to avoid
>>> undefined behaviour (and don't we all?) has to write code like
>>> this for any signed operation:
>>> signed int sum;
>>> if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
>>>     ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
>>>   /* Handle error */
>>> } else {
>>>   sum = si_a + si_b;
>>> }
>> No, they don't.
>> When I want to add two integers, I usually write "a + b". But then, I
> I certainly have code like that in the implementation of a higher
> level language that switches to bignum integers when the operands
> oveflow, and so returns the arithmetically correct result under all
> conditions, memory permitting.

That's fine. Your addition there is defined in a different way from
addition in C.

> The overflow check can be expressed more efficiently if we can rely
> on wraparound.

That may or may not be the case - it depends on the target, and
alternative solutions (such as implementation-specific features like
gcc's overflow builtins, or using unsigned integers which /do/ have
defined wrapping).

> A lot of people are doing computing with this sort of system nowadays,
> and a lot of them, or portions of their runtimes, are written in C.

If you are writing your code in a "C with the extra feature of having
defined behaviour on signed integer overflow", and only compile it with
suitable compilers (or compiler flags), then that's okay. But don't
call it correct C code and blame compilers for your own mistakes or
unwarranted assumptions.
