Re: Undefined Behavior Optimizations in C

David Brown <david.brown@hesbynett.no>
Fri, 6 Jan 2023 16:12:25 +0100

          From comp.compilers

From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.compilers
Date: Fri, 6 Jan 2023 16:12:25 +0100
Organization: A noiseless patient Spider
References: 23-01-009 23-01-011 23-01-012
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="30902"; mail-complaints-to="abuse@iecc.com"
Keywords: optimize, semantics
Posted-Date: 06 Jan 2023 12:03:27 EST
In-Reply-To: 23-01-012
Content-Language: en-GB

On 06/01/2023 01:22, gah4 wrote:
> On Thursday, January 5, 2023 at 10:13:08 AM UTC-8, Spiros Bousbouras wrote:
>> On 5 Jan 2023 10:05:49 +0000
>> "Lucian Popescu" <luc...@ctrl-c.club> wrote:
>
>>> I'm currently working on an academic project that analyzes the speedup gain of
>>> Undefined Behavior Optimizations in C.
> (snip)
>
>>> To test the theory that the UB Optimizations introduce more risks than
>>> speedup gains,
>
>> Isn't this comparing apples and oranges ?
>
> Probably.
>
> You can quantify speed-up, but it is harder to quantify risk.
>
> You might be able to quantify debug time, and how much longer
> it takes to debug a program with such behavior.
>
> Most important when debugging, is that you can trust the compiler to
> do what you said. That they don't, has always been part of
> optimization, but these UB make it worse.


The trouble with undefined behaviour is that, in general, you cannot
trust the compiler to "do what you say" because it cannot know what you
have said.


A computer language like C is defined by its standard. This says what
particular combinations of characters in the source code actually mean.
If what you write does not fit the specified and documented patterns
(or the pattern is explicitly labelled "undefined behaviour"), then it
does not mean anything at all.


So if you write "give me some prime numbers" as your C code, it means
nothing and the compiler can't help you. If you write "int * p = 0; int
x = *p;", it means nothing and the compiler can't help you.


(Well, the compiler might be able to give helpful error messages!)


When you write "x = *p;", you are saying to the compiler "It is a fact
that p is a valid pointer to data of a type compatible with *p, all the
constraints required for the assignment operation are met, there is no
partial overlap between x and *p, there are no data races, and no range
errors converting any floating point values. Given that, act as though
the value of x is now equal to *p after any required conversions."


You might /think/ you are saying "read the value at address p, and store
it in the memory reserved for x".
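
A minimal sketch of how that difference can show up in practice. The
function names are hypothetical and not taken from any real code base.
In the first function the dereference comes before the null check, so
the code has already told the compiler that p is valid, and an
optimiser is allowed to treat the check as always false. The second
function says what was probably meant.

```c
#include <stddef.h>

/* Dereference first: this asserts to the compiler that p is valid.
   An optimising compiler may therefore drop the null check below,
   because it can only matter in a program that already has UB. */
int read_then_check(const int *p)
{
    int x = *p;
    if (p == NULL)
        return -1;
    return x;
}

/* Check first: the null case is handled before anything is claimed
   about p, so passing NULL is well defined here. */
int check_then_read(const int *p)
{
    if (p == NULL)
        return -1;
    return *p;
}
```

Both functions return the same value for any valid pointer. Only
check_then_read may be called with NULL.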




When you write "x = (y * 30) / 15;" (for "int" x and y), you might
/think/ you are asking the compiler to multiply by 30 and then divide by
15. But you are actually telling it that there would be no overflow if
it were to multiply y by 30, and thus it can use simple mathematical
equalities to reduce the expression to "x = y * 2;".
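
As a rough illustration, with hypothetical function names, the two
signed functions below agree whenever y * 30 does not overflow, which
is why a compiler may generate the second form for the first. The
unsigned variant is included for contrast: unsigned arithmetic is
defined to wrap, so there the same reduction would change the result
for large inputs and the compiler must not apply it.

```c
#include <limits.h>

/* What the programmer wrote. Overflow of y * 30 is undefined, so the
   compiler may assume it never happens. */
int scale_as_written(int y)
{
    return (y * 30) / 15;
}

/* What the compiler is entitled to generate instead. */
int scale_reduced(int y)
{
    return y * 2;
}

/* Unsigned arithmetic wraps modulo UINT_MAX + 1, so this is NOT
   equivalent to y * 2u once y * 30u wraps. */
unsigned scale_unsigned(unsigned y)
{
    return (y * 30u) / 15u;
}

/* Smallest unsigned input for which y * 30u wraps around. */
unsigned smallest_wrapping_input(void)
{
    return UINT_MAX / 30u + 1u;
}
```

For the value returned by smallest_wrapping_input(), scale_unsigned
gives 0 or 1, while doubling the same value gives a large number.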




You can /always/ trust the compiler to do what you said, barring
occasional bugs in the compiler. What you cannot always do is trust the
programmer to know what he or she /actually/ said, or to write what he
or she meant.




And outside of flags that change the language semantics, such as gcc's
-fwrapv or -fno-strict-aliasing, enabling or disabling optimisations
does not change the meaning of the code. It might affect how code is
generated (and therefore its speed and size), but if the behaviour is
different then it is because your code did not say what you thought it said.
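
A small sketch of code whose meaning depends on such a flag, again
with hypothetical names. The first function only "works" if signed
addition wraps, which standard C does not promise. With -fwrapv gcc
defines the wrapping; without it, a compiler may assume a + 1 < a is
never true. The second function expresses the same intent without
relying on overflow at all.

```c
#include <limits.h>

/* Relies on a + 1 wrapping to a negative value. In standard C this is
   undefined for a == INT_MAX, so the result may differ between
   optimisation levels. Under -fwrapv the wrap is defined behaviour. */
int increment_overflows_naive(int a)
{
    return a + 1 < a;
}

/* Says the same thing without ever overflowing, so it means the same
   with or without -fwrapv and at any optimisation level. */
int increment_overflows(int a)
{
    return a == INT_MAX;
}
```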




And yes, I know debugging optimised code can be difficult. Sometimes
that means adding extra "volatile" qualifiers, or "asm volatile("" :::
"memory");" fences, in order to have good breakpoint spots or debugging
information.
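
One way this might look, as a sketch only. The names debug_marker and
sum_with_marker are made up for the example, and the asm statement is
a GCC/Clang extension, so it is guarded by a check for __GNUC__.

```c
/* A volatile object: every store to it must actually be emitted, which
   gives a stable line to put a breakpoint on even in optimised code. */
volatile int debug_marker;

int sum_with_marker(const int *data, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += data[i];
        debug_marker = total;   /* visible progress for the debugger */
#if defined(__GNUC__)
        /* Compiler-level memory barrier: generates no instructions, but
           stops the compiler moving memory accesses across this point. */
        __asm__ __volatile__("" ::: "memory");
#endif
    }
    return total;
}
```

The return value is unchanged by the marker and the barrier. They only
constrain how the compiler may reorder or remove the surrounding code.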


