Re: Optimization techniques and undefined behavior

Bart <bc@freeuk.com>
Mon, 29 Apr 2019 00:31:40 +0100

          From comp.compilers


From: Bart <bc@freeuk.com>
Newsgroups: comp.compilers
Date: Mon, 29 Apr 2019 00:31:40 +0100
Organization: virginmedia.com
References: <72d208c9-169f-155c-5e73-9ca74f78e390@gkc.org.uk> 19-04-021 19-04-023 19-04-037
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="11585"; mail-complaints-to="abuse@iecc.com"
Keywords: optimize, standards, errors
Posted-Date: 28 Apr 2019 22:44:43 EDT
In-Reply-To: 19-04-037
Content-Language: en-GB

On 28/04/2019 22:49, David Brown wrote:
> On 26/04/2019 02:18, Kaz Kylheku wrote:


>> Problem is, all these propositions are not safe; they are based on
>> "the program has no mistake".
>
>
> /All/ programming is based on the principle of "the program has no
> mistake".  It is absurd to single out something like this as though it
> is a special case.
>
> If I want to double a number, and write "x * 3" by mistake, it is a bug.


Yes, a bug that will probably give the wrong answer. But I want a
predictably wrong answer - see below.




>  If I do arithmetic on signed integers, and they overflow, it is a bug.
>  The compiler is allowed to assume that when I write "x * 3", I mean
> what the C language means - multiply x by 3.  It is also allowed to
> assume that when I write some signed arithmetic, I mean what the C
> language means - give me the correct results when I give valid inputs,
> otherwise it is "garbage in, garbage out".
>
> Ask yourself, when would "x * 2 / 2" /not/ be equal to x, given two's
> complement wrapping overflows?  It would happen when "x * 2" overflows.
>  For example (assuming 32-bit ints), take x = 1,500,000,000.  When you
> write "x * 2 / 2" with an optimising C compiler, the result is most
> likely 1,500,000,000 (but there are no guarantees).


If you write x=(x*2)/2 expecting a result consistent with:


        int y=(x*2); x=y/2;


then you don't want the compiler being clever about overflow. If
overflow has occurred, then you want to know about it: the result should
be consistent with two's complement integer overflow.


You DON'T want the compiler giving you what looks like the right answer
and brushing the overflow under the carpet.


You also want a result that is portable across compilers. I've just
tried 20 or so combinations of compilers and optimisation flags: all gave
a result of -647483648, except gcc, which gave 1500000000. And even gcc
gave -647483648 with some versions and some optimisation levels.


-------------------------
#include <stdio.h>

int main(void) {
    int x=1500000000;

    x=(x*2)/2;

    printf("%d\n",x);
}
-------------------------




>  When you use a
> "two's complement signed overflow" compiler, you get -647,483,648.  Tell
> me, in what world is that a correct answer?


It's not a correct answer when you are trying to do pure arithmetic. It
CAN be correct when doing it via a computer ALU that uses a 32-bit two's
complement binary representation.


It is certainly what you might expect on such hardware.


> Why do you think a guaranteed wrong and meaningless answer is
> better than undefined behaviour?


Is it really meaningless? Try the above using x=x*2. It will still
overflow and produce a result of -1294967296, apparently incorrect. But
print the same bit-pattern using "%u" format, and you get 3000000000 -
the right answer. You can predict what's going to happen, /if/ you can
predict what a compiler is going to do. Unfortunately with ones like
gcc, you can't.


> Ask yourself, when would "x + 1 > x" not be true?  When x is INT_MAX and
> you have wrapping, x + 1 is INT_MIN.  Ask yourself, is that the clearest
> and best way to check for that condition - rather than writing "x ==
> INT_MAX" ?  When does it make sense to take a large positive integer,
> add 1, and get a large /negative/ integer?


You get funny things happening with unsigned integers too; try:


-------------------
      #include <stdio.h>
      int main(void) {
          unsigned int x=1500000000;

          x=(x*4)/4;

          printf("%u\n",x);
      }
-------------------


This displays 426258176 rather than 1500000000.


So why is a 'wrong and meaningless answer' OK, in this case, just
because C deems it to be defined behaviour?


This marginalisation of signed integer overflows is outdated, now that
every relevant machine is going to behave the same way.


