Re: Optimization techniques and undefined behavior

David Brown <david.brown@hesbynett.no>
Thu, 2 May 2019 17:27:19 +0200

          From comp.compilers

From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.compilers
Date: Thu, 2 May 2019 17:27:19 +0200
Organization: A noiseless patient Spider
References: <72d208c9-169f-155c-5e73-9ca74f78e390@gkc.org.uk> 19-04-021 19-04-023 19-04-037 19-04-039 19-04-042 19-04-045 19-04-049 19-05-003
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="67636"; mail-complaints-to="abuse@iecc.com"
Keywords: arithmetic, optimize, errors
Posted-Date: 02 May 2019 12:12:11 EDT
Content-Language: en-GB

On 01/05/2019 13:40, Bart wrote:
> On 30/04/2019 14:48, David Brown wrote:
>> On 29/04/2019 19:15, Bart wrote:
>>> On 29/04/2019 16:08, David Brown wrote:
>>>> On 29/04/2019 01:31, Bart wrote:
>>>
>>>>> then you don't want the compiler being clever about overflow.
>>>>
>>>> I /do/ want a result consistent with a single expression, or splitting
>>>> up the expression.
>>>
>>> Then the choice is between both ways giving you 1500000000, or both
>>> giving you -647483648.
>>
>> Let me repeat - I do not care what the results are here.  I don't care
>> if they are consistent with each other.  I don't care if they change
>> between runs of the compiler.  I don't care if the result is a pink
>> umbrella.
>
> So, there's a bug in the program, an inadvertent overflow. But rather
> than help in discovering that bug (such as giving the wrong results)
> gcc (and its clone Clang) conveniently pretend that such bugs cannot
> exist, and use that to give their code an unfair advantage:
>
>     #include <stdio.h>
>
>     int main(void) {
>         int x,y,z;
>
>         x=1000000000;
>         z=0;
>
>         for (int i=0; i<1000000000; ++i) {
>             y=x*3/3;
>             z+=y;
>             ++x;
>         }
>
>         printf("%d\n",x);
>         printf("%d\n",y);
>         printf("%d\n",z);
>     }
>
> Optimising compilers that don't take advantage of that undefined
> behaviour give timings here of 1.25 seconds (msvc) and 1.9 seconds
> (pelles c), with some taking much longer.
>
> The two that do, give timings of 0.22 seconds (gcc) and 0.05 seconds
> (clang).
>


Why are you worrying about timings of code with a bug? If you have a
bug, you want the tools to try to help find the problem - and making
overflow an illegal operation rather than a legal nonsense operation will allow
tools to help. Nobody cares how fast buggy code runs.
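
As an illustration of the kind of help tools can give when overflow is
treated as an error, here is a minimal sketch. It assumes gcc or clang,
whose __builtin_mul_overflow builtin reports whether a multiplication
overflowed. The helper name checked_triple is hypothetical and not part
of the program under discussion. Both compilers can also instrument the
original program with -fsanitize=undefined to report signed overflow at
run time.

```c
#include <stdbool.h>

/* Hypothetical helper: stores x*3 in *out and returns true, or returns
   false if the product does not fit in an int.
   Relies on the gcc/clang builtin __builtin_mul_overflow. */
static bool checked_triple(int x, int *out)
{
    return !__builtin_mul_overflow(x, 3, out);
}
```

With a 32-bit int, calling checked_triple(1000000000, &r) returns false,
which flags the overflow at the very first iteration of the loop.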


> However, there are a couple of problems: (1) they give different
> results; (2) they cheated by not doing a billion multiplies and divides.
>


It doesn't matter whether the code has defined behaviour or not here - a good
compiler will do at least some of these calculations at compile time. A
/really/ good compiler - better than gcc - would do it /all/ at compile
time. Then it would give specific results for x, y and z in -fwrapv
mode, and a warning about overflow in normal UB mode.


And again, who cares about different results in different circumstances
from buggy code?




> Note this example also includes UB due to z+=y line, but I'm only
> interested in the bottom 32 bits, albeit signed, as a kind of checksum
> to compare with other compilers.
>


Then why not write correct code to do that?
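
A minimal sketch of what that could look like, assuming the goal is just
a 32-bit wrapping checksum: accumulate in uint32_t, where wrap-around
modulo 2^32 is defined by the language, and reinterpret the bits as
signed only when printing or comparing. The helper names checksum_add
and as_signed32 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: adds a signed value into a 32-bit checksum.
   Unsigned arithmetic wraps modulo 2^32 by definition, so there is
   no undefined behaviour here. */
static uint32_t checksum_add(uint32_t sum, int32_t value)
{
    return sum + (uint32_t)value;
}

/* Hypothetical helper: reads the 32 bits as a two's complement signed
   value without relying on an out-of-range conversion. */
static int32_t as_signed32(uint32_t u)
{
    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;
    /* u is in [2^31, 2^32): map it onto [INT32_MIN, -1]. */
    return (int32_t)(u - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}
```

In the loop, "z+=y" would become "z = checksum_add(z, y)" with z
declared as uint32_t, and the final printf would print as_signed32(z).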


> For that purpose, I need the final z value to be -101627306, which will
> match the same 32-bit arithmetic across languages** and in assembly.
>


It won't match many real languages - we have already seen that handling
of signed integer overflow varies a lot, with only a few doing two's
complement wrapping.


> I don't need it to be the mathematically correct 1499999999500000000,
> which seems to me what you'd like it to be. gcc/clang-O3 give me
> 1565039360 which is neither one nor the other.
>


I would not like it to be the mathematically correct answer - because
there is no defined correct answer for this C code. If a particular C
implementation defines signed integer overflow in some way, then I
expect the answer to be consistent with that definition - if it does
not, then I don't care what the result is.


> -------------------------------
>
> (** This is the above program auto-translated from C to one of my
> languages, which is sort of interesting, this being a compiler group.
>


Sure, your language is absolutely on-topic here.


But if you want to translate your language to C, you need to translate
it to C - not to what you think C ought to be. Given that you want your
own language to have wrapping semantics on integer overflow (hey, it's
your language - you can define it in a crazy way if you want), then you
can't translate the source expressions into exactly the same thing in C.
You need to write C code that means the same as the original.


If you didn't have such a knee-jerk allergy to the C preprocessor, I
could show you an efficient and safe way to do that.
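
One possible shape for such a translation, offered as a sketch under
stated assumptions rather than as the specific technique alluded to
above: macros that do the arithmetic in uint32_t, where wrapping is
defined, and convert the result back to int32_t. The macro and function
names are hypothetical. The sketch assumes int is 32 bits wide, and it
relies on the conversion back to int32_t, which is implementation-defined
rather than undefined (gcc, clang and msvc all reduce modulo 2^32).

```c
#include <stdint.h>

/* Hypothetical macros: wrapping 32-bit arithmetic with defined behaviour.
   The operands are converted to uint32_t so the operation wraps modulo
   2^32. The cast back to int32_t is implementation-defined, and is
   assumed here to give two's complement wrapping. */
#define WRAP_OP32(a, op, b) ((int32_t)((uint32_t)(a) op (uint32_t)(b)))
#define WRAP_ADD32(a, b)    WRAP_OP32(a, +, b)
#define WRAP_MUL32(a, b)    WRAP_OP32(a, *, b)

/* Hypothetical translation of the source expression int32(x*3)/3:
   a wrapping multiply followed by an ordinary signed division. */
static int32_t wrapped_step(int32_t x)
{
    return WRAP_MUL32(x, 3) / 3;
}
```

The loop body would then read "y = wrapped_step(x); z = WRAP_ADD32(z, y);
x = WRAP_ADD32(x, 1);". A compiler can typically reduce each macro to
the plain add or multiply instruction, so nothing is lost in speed.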




If you are translating from C to your own language, then you can do it
directly - it is perfectly allowable to implement the undefined
behaviour by any definition that takes your fancy.


What does not make sense, of course, is to run tests in C with undefined
behaviour and expect consistent or testable results. That is just daft.




> Normally this is just to help in viewing tortuous C source code, as
> the C semantics are not translated. But tweaked with the int32() cast to
> match C's intermediate calculations (usually 64-bit here), this actually
> works:
>
>     import clib
>
>     global function main()int32 =
>         var int32 x
>         var int32 y
>         var int32 z
>         var int32 i
>
>         x := 1000000000
>         z := 0
>         i := 0
>         while i<1000000000 do
>             y := int32(x*3)/3
>             z +:= y
>             ++x
>             ++i
>         od
>         printf("%d\n",x)
>         printf("%d\n",y)
>         printf("%d\n",z)
>         return 0
>     end
>
>     proc start =
>         stop main()
>     end
>
> The results match those of the non-gcc/non-clang C compilers (apart from
> speed which is poor).)


Who cares? The C code is buggy, so the results don't matter.

