Re: Undefined Behavior Optimizations in C

David Brown <david.brown@hesbynett.no>
Wed, 11 Jan 2023 14:20:49 +0100

          From comp.compilers


From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.compilers
Date: Wed, 11 Jan 2023 14:20:49 +0100
Organization: A noiseless patient Spider
References: 23-01-027 <sympa.1673343321.1624.383@lists.iecc.com> 23-01-031
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="95876"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 11 Jan 2023 18:11:50 EST
Content-Language: en-GB
In-Reply-To: 23-01-031

On 10/01/2023 11:46, Jon Chesterfield wrote:
>> So before we decide if UB optimizations are actually allowed by the
>> standard we need to decide what "ignoring the situation completely
>> with unpredictable results" actually means.
>>
>> [1] https://port70.net/~nsz/c/c89/rationale/
>>
>> Lucian
>
> WG14 are aware of UB optimising compilers and could have steered away from
> this path, but haven't. It's been decades now. The pointer provenance work
> seeks to apply aliasing rules even more aggressively. GCC and clang are
> both pursuing faster codegen via exploiting undefined behaviour.
>
> C, the WG14 ISO defined language, as implemented by the primary open source
> toolchains, is thus unfit for my purposes. I'm not clear what use that
> language has.


It seems to be very popular, so many people find it fit for their
purposes. (I certainly find it, along with C++, a good fit for my
low-level small-systems embedded programming, and I am quite happy with
"UB optimisations" as you call them.) But some people don't like it,
which is fair enough. And certainly no one thinks either the language
or the tools are perfect.


Some people want a language that is mostly like C, except for certain
features - and accessing objects in memory using different pointer types
is a common request. This is why both gcc and clang (and a few other
compilers) have a flag that gives you this behaviour:
"-fno-strict-aliasing". I always find it ironic that the compilers that
some people complain "don't do what I want" or "don't do what old
compilers did" are precisely the compilers that give you these options.
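
A minimal sketch of what that flag changes, assuming a made-up function name that does not appear in this thread. Under the standard's aliasing rules a compiler may assume an "int *" and a "float *" refer to different objects, so it is free to return the constant 1 without reloading. With "-fno-strict-aliasing" it has to allow for the two pointers referring to the same memory.

```c
/* Hypothetical illustration; store_then_reload is not from the post. */
int store_then_reload(int *ip, float *fp)
{
    *ip = 1;
    *fp = 0.0f;  /* by the standard's rules this store cannot modify *ip */
    return *ip;  /* so an optimiser may fold this to "return 1" */
}
```

Called with pointers to two genuinely separate objects, the function behaves the same with or without the flag.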


>
> C, the typed assembler of ye olde times, is a profoundly useful language.


It's a myth. It never existed. There has simply been a steady
improvement in the optimisation of correct code as compilers have got
more sophisticated. There are compilers that document and define
behaviour for certain things that are undefined behaviour in the C
standards, but I have never heard of a compiler that claims to
understand the programmers' intentions even when they write incorrect
code.


C was designed from day one to be a high-level language, not an
assembler of any sort. Limitations of weaker earlier compilers do
not mean the language was supposed to work that way.


I first used a C compiler that optimised on the assumption that UB
didn't happen some 25 years ago. (In particular, it assumed signed
integer arithmetic never overflowed.)
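
As a rough illustration of that kind of assumption (a sketch with hypothetical function names, not code from the compiler mentioned above): since signed overflow is undefined, a compiler may treat "x + 1 > x" as always true for an int. A check that stays within defined behaviour compares against INT_MAX before doing any arithmetic.

```c
#include <limits.h>
#include <stdbool.h>

/* "x + 1" overflows only when x == INT_MAX, which is undefined behaviour.
   A compiler may therefore assume it never happens and return true
   unconditionally. */
bool increment_is_larger(int x)
{
    return x + 1 > x;
}

/* Well-defined alternative: no arithmetic that could overflow. */
bool can_increment(int x)
{
    return x < INT_MAX;
}
```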


> One just can't use GCC or clang to build it reliably.




You mean newer tools treat your code bugs in different ways from older
tools? There's a solution for that.


> It annoys me intensely that the type aliasing rules capture something a
> whole program optimising compiler can usually work out for itself anyway,
> while preventing me from reading 128bit integers from the same memory I
> fetch_add 32bit integers into.
>


It annoys /me/ intensely that people complain about this sort of thing,
and yet apparently haven't bothered to read the compiler manuals to see
how to get the effects they want. Compile with "-fno-strict-aliasing",
or (better, IMHO) add this to your code:


#pragma GCC optimize ("-fno-strict-aliasing")


Now, if you want to complain that the gcc documentation is not great, or
that flags like this should be documented along with the standards flags
rather than optimisation flags, I'll happily agree. (I don't know if
clang does better here.) But don't complain that the compiler is a problem.




And there are other ways to handle this in gcc. Use "may_alias" types.
Or use volatile accesses. Or use memcpy(). Or use unions.
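
Three of those alternatives might be sketched as follows, for the simpler case of reading a 64-bit value from storage that is also accessed as 32-bit words. All type and function names here are hypothetical. The "may_alias" version is a gcc/clang extension and assumes the pointer is suitably aligned for a 64-bit access; the memcpy() version has no such requirement.

```c
#include <stdint.h>
#include <string.h>

/* Union punning: store through one member, read through another
   (permitted in C99 and later). */
union words_or_quad {
    uint32_t w[2];
    uint64_t q;
};

uint64_t load_u64_union(const union words_or_quad *u)
{
    return u->q;
}

/* Portable: copy the bytes. Optimising compilers commonly turn a
   small fixed-size memcpy() like this into a single load. */
uint64_t load_u64_memcpy(const uint32_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

#if defined(__GNUC__)  /* gcc, and clang which also defines __GNUC__ */
/* Accesses through this type are allowed to alias any other type. */
typedef uint64_t __attribute__((may_alias)) u64_alias;

uint64_t load_u64_alias(const uint32_t *p)
{
    return *(const u64_alias *)p;  /* assumes p is 8-byte aligned */
}
#endif
```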




There are two /real/ problems here. One is that C is not, and never has
been, the language that some people think it is - and thus they get
frustrated when they find out their code is not as correct as they
thought. A second is that there are weak compilers out there that on
the one hand lull developers into a false understanding of the language
due to their limited code optimisations, and on the other hand make safe
alternatives such as "memcpy" highly inefficient on their tools.


What this means is that different compilers, including gcc and clang,
are perfectly capable of generating code that efficiently mixes accesses
of different kinds to the same object. But the details of the code you
write to get the effects are different - C is not as portable here as it
should be. For code that needs to work well on multiple toolchains, you
quickly end up with a header that has conditional compilation and macros
that vary depending on the compiler in use. That is ugly and awkward,
but I know of no better way.
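
Such a header might look roughly like the sketch below. The file name, the macro LOAD_U32 and the helper names are all hypothetical, and a real project would likely add branches for whichever other compilers it supports. The gcc/clang branch assumes the pointer is suitably aligned for a 32-bit access.

```c
/* punning.h - hypothetical portability header; all names are illustrative */
#ifndef PUNNING_H
#define PUNNING_H

#include <stdint.h>
#include <string.h>

#if defined(__GNUC__)  /* gcc, and clang which also defines __GNUC__ */
/* Extension: accesses through this type may alias any other type. */
typedef uint32_t __attribute__((may_alias)) u32_any;
#define LOAD_U32(p) (*(const u32_any *)(p))
#else
/* Fallback that relies only on standard C. */
static inline uint32_t load_u32_copy(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
#define LOAD_U32(p) load_u32_copy(p)
#endif

#endif /* PUNNING_H */
```

Client code then writes LOAD_U32(&some_float) everywhere and leaves the choice of mechanism to the header.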

