Re: Optimization techniques and undefined behavior

Bart <bc@freeuk.com>
Mon, 29 Apr 2019 18:15:59 +0100

          From comp.compilers

From: Bart <bc@freeuk.com>
Newsgroups: comp.compilers
Date: Mon, 29 Apr 2019 18:15:59 +0100
Organization: virginmedia.com
References: <72d208c9-169f-155c-5e73-9ca74f78e390@gkc.org.uk> 19-04-021 19-04-023 19-04-037 19-04-039 19-04-042
Keywords: arithmetic, optimize
Posted-Date: 29 Apr 2019 22:44:56 EDT
In-Reply-To: 19-04-042
Content-Language: en-GB

On 29/04/2019 16:08, David Brown wrote:
> On 29/04/2019 01:31, Bart wrote:


>> then you don't want the compiler being clever about overflow.
>
> I /do/ want a result consistent with a single expression, or splitting
> up the expression.


Then the choice is between both ways giving you 1500000000, or both
giving you -647483648.


The former is going to be difficult, since the intermediate 32-bit value
has lost some information. The latter is very easy, and involves dumping
the UB nonsense.
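
As a rough sketch of the two consistent choices (the function names here are made up for illustration, and a 32-bit int is assumed): the wrapped version does the multiply in unsigned arithmetic, which C defines as wrapping, and then reinterprets the bits as two's complement. The widened version keeps a 64-bit intermediate so nothing is lost.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a 32-bit pattern as a two's complement value, without
   relying on the implementation-defined unsigned-to-signed conversion. */
static int32_t to_int32(uint32_t bits) {
    if (bits <= (uint32_t)INT32_MAX)
        return (int32_t)bits;
    /* bits >= 2^31: the signed value is bits - 2^32 */
    return (int32_t)(bits - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}

/* x*2/2 with the intermediate wrapped to 32 bits. */
static int32_t mul2_div2_wrapped(int32_t x) {
    uint32_t doubled = (uint32_t)x * 2u;  /* unsigned multiply wraps mod 2^32 */
    return to_int32(doubled) / 2;
}

/* x*2/2 with a 64-bit intermediate, so no information is lost. */
static int32_t mul2_div2_widened(int32_t x) {
    int64_t doubled = (int64_t)x * 2;
    assert(doubled / 2 == x);             /* widening always recovers x */
    return (int32_t)(doubled / 2);
}
```

With x = 1500000000, mul2_div2_wrapped gives -647483648 and mul2_div2_widened gives 1500000000, whether the expression is written in one piece or split up.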


> Questions about what the compiler will do with overflows, like how
> consistent it will be, are as sensible as asking how many miles per
> gallon you get from your car when it has no tires.  You would not drive
> your car without tires - that would be a mistake, a bug in your driving
> procedure.  I don't write signed integer expressions that overflow -
> barring bugs in my coding.  And thus I don't care what the compiler does
> about them, and have no interest in their consistency.


If the gcc people designed cars, either the car wouldn't have an engine
because, since you're always going to end up at your start point,
there's no point in driving it; or it wouldn't have any brakes since you
are never going to have an accident.


> I want the compiler to give me the right answer to valid questions - I
> don't expect it to give me any consistent answer to invalid questions.


What is the question? Hint: it's not the result of 1500000000*2/2, it's
the result of 1500000000*2/2 when the 1500000000 is represented as a
32-bit twos complement binary value, and intermediate calculations are
done to the same precision.


>> but I've just
>> tried 20 or so combinations of compilers and optimise flags, all give a
>> result of -647483648 - except gcc which gave 1500000000. And even gcc
>> gave -647483648 with some versions and some optimisation levels.
>>
>
> Do you understand what "optimising compiler" means?  It means the
> compiler should try to give you code that runs as efficiently as
> possible given valid inputs.  C does not impose any requirements on code
> efficiency, but compiler users certainly do - so a C compiler is not
> going to go out of its way to give poorer quality code.  So given "x * 2
> / 2;", a compiler will do one of two things - return "x" unchanged, or
> carry out the operations using the most obvious assembly instructions.
> A good compiler will thus give you 1500000000 in this case, as that is
> the most efficient implementation consistent with the source code.


And so it will be inconsistent with (in my tests) most other compilers.
My tests were done both with optimisation and without. clang-O3,
gcc81-O3, gcc81-03, and gcc51-O3 gave 1500000000.


All other compilers I tried, including VC, clang-O0 and gcc51-O0, gave
the -647483648 figure. So would my own compilers for other languages if
using int32; they now use int64, and the same wrapping behaviour is
observed when x is 6000000000000000000.


> It is not a correct answer for standard C signed arithmetic, because
> there is no correct answer.


This is the nub of the issue: *C* has decided that such arithmetic is
undefined. But this is exactly the same 32-bit operation that can be
done in a dozen other languages, probably on most machines that support
32-bit multiply, and most do not make it undefined.


So it is largely a peculiarity of C.


> It is not a correct answer in normal
> mathematics, or almost any real-world problem you might want to model.
> It is, however, correct if you have defined your signed arithmetic to be
> wrapping.  It is fine - but IMHO almost entirely useless and
> counter-productive - for a programming language to define signed
> arithmetic in that way.  C does not define it that way, but other
> languages (and particular C compilers) can do so.


This is contradictory - so a C compiler can choose to make something
that C has deemed undefined behaviour, defined?


>> It is certainly what you might expect on such hardware.
>>
>>> Why do you think a guaranteed wrong and meaningless answer is
>>> better than undefined behaviour?
>>
>> Is it really meaningless? Try the above using x=x*2. It will still
>> overflow and produce a result of -1294967296, apparently incorrect. But
>> print the same bit-pattern using "%u" format, and you get 3000000000 -
>> the right answer. You can predict what's going to happen, /if/ you can
>> predict what a compiler is going to do. Unfortunately with ones like
>> gcc, you can't.
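
To illustrate the "%u" point quoted above without depending on overflow at all, here is a small sketch (helper names are hypothetical). Converting a negative int32_t to uint32_t is defined by C as adding 2^32, so both views of the same 32 bits can be printed safely.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Signed-to-unsigned conversion is defined: negative values gain 2^32. */
static uint32_t as_unsigned(int32_t v) {
    return (uint32_t)v;
}

/* Write the signed and unsigned readings of the same bits into buf.
   Returns the number of characters snprintf would have written. */
static int format_both(int32_t v, char *buf, size_t n) {
    assert(buf != NULL && n > 0);
    return snprintf(buf, n, "%" PRId32 " %" PRIu32, v, as_unsigned(v));
}
```

Passing -1294967296 to format_both fills the buffer with "-1294967296 3000000000": the same bit pattern read both ways.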
>
> Again, it does not matter if you can predict what value you get if
> something is nonsensical.  It matters that you can predict what values
> are valid inputs, of course, but not what the outputs are when the
> inputs are invalid.  When garbage goes in, I don't care what colour of
> garbage comes out - if I care what is going to come out, I am careful
> about what I put in.


Sorry, but to me a result of 1500000000 would be garbage, as it is
highly misleading. If I didn't intend 1500000000*2/2 to overflow, but
the result was a perfect 1500000000, how would I know there was a bug?
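
If the aim is to find out that an overflow happened rather than have it silently papered over, one option is an explicit check. This is only a sketch under my own assumptions (checked_mul_i32 is a made-up name, not anything a compiler provides): it does the multiply in 64 bits and reports whether the result fits in 32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true and stores a*b in *out if the product fits in int32_t.
   Returns false and leaves *out untouched if it would overflow. */
static bool checked_mul_i32(int32_t a, int32_t b, int32_t *out) {
    assert(out != NULL);
    int64_t wide = (int64_t)a * (int64_t)b;  /* at most 2^62, fits in 64 bits */
    if (wide < INT32_MIN || wide > INT32_MAX)
        return false;
    *out = (int32_t)wide;
    return true;
}
```

Here checked_mul_i32(1500000000, 2, &r) returns false, which is the signal that neither 1500000000 nor -647483648 should be trusted as the answer.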

