Re: Optimization techniques

David Brown <david.brown@hesbynett.no>
Tue, 23 Apr 2019 09:18:28 +0200

From comp.compilers

Related articles
[6 earlier articles]
Re: Optimization techniques rick.c.hodgin@gmail.com (Rick C. Hodgin) (2019-04-19)
Re: Optimization techniques gneuner2@comcast.net (George Neuner) (2019-04-19)
Re: Optimization techniques DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2019-04-20)
Re: Optimization techniques rick.c.hodgin@gmail.com (Rick C. Hodgin) (2019-04-19)
Re: Optimization techniques DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2019-04-20)
Re: Optimization techniques gneuner2@comcast.net (George Neuner) (2019-04-20)
*Re: Optimization techniques david.brown@hesbynett.no (David Brown)* (2019-04-23)**
Re: Optimization techniques david.brown@hesbynett.no (David Brown) (2019-04-23)
Re: Optimization techniques david.brown@hesbynett.no (David Brown) (2019-04-23)
Re: Optimization techniques rick.c.hodgin@gmail.com (Rick C. Hodgin) (2019-04-24)
Re: Optimization techniques martin@gkc.org.uk (Martin Ward) (2019-04-25)
Re: Optimization techniques david.brown@hesbynett.no (David Brown) (2019-04-25)
Re: Optimization techniques 847-115-0292@kylheku.com (Kaz Kylheku) (2019-04-25)
[22 later articles]

| List of all articles for this month |

From:	David Brown <david.brown@hesbynett.no>
Newsgroups:	comp.compilers
Date:	Tue, 23 Apr 2019 09:18:28 +0200
Organization:	A noiseless patient Spider
References:	19-04-004 19-04-009
Injection-Info:	gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="72194"; mail-complaints-to="abuse@iecc.com"
Keywords:	optimize
Posted-Date:	24 Apr 2019 10:18:20 EDT
Content-Language:	en-GB

On 19/04/2019 10:49, Kaz Kylheku wrote:
> On 2019-04-17, R. C. Hodgin <rick.c.hodgin@gmail.com> wrote:
>> Are there resources someone can point me to for learning more about
>> time-honored, long-established, safely applied, optimization
>> techniques for a C/C++ like language?
>
> Optimization only preserves the meaning of the program to the extent
> that the program has a defined meaning, as far as the compiler is
> concerned.

Yes.

>
> Thus, there is one huge pitfall in optimizing C or C++: there is a large
> space of possible programs that have unspecified or undefined semantics,
> and that programmers actually write.

Here you face the balance between giving good optimisation to people
that write correct code (i.e., no undefined behaviour) and giving "what
people probably expect" behaviour to people who don't follow those
rules. It is not an easy balance.

Most compilers solve this with flags - the programmer can choose between
"I know what I am doing - trust me and give me efficient code" and "I
don't know how C works, and think it does naÃ¯ve translations - don't
optimise". Most also give flags for "Let me know if you see me doing
something silly, even if it is legal". And some good ones give you
flags for "I know how C works, but I'd like to change it slightly".

There is no way to please all users with a compiler for C or C++.
Flags, and possibly pragmas, are the only way to do it. Some people
will still complain, of course - they want C to work they way /they/
imagine it should, they want the compiler to guess what their code means
even when it is undefined behaviour (i.e., has no meaning in C), and
they want the compiler to optimise perfectly based on that mind-reading.
But most should be happy enough if you give them choices.

I'd personally prefer if compilers had more warnings enabled by default.
That can be hard to change for an existing tool that needs
compatibility, but new tools could be freer here.

>
> If you can make your language "C/C++ like" without all the undefined
> behavior looming at every corner (i.e. not actually "C/C++ like" at all
> in a significant regard), then you've dodged what is probably the number one
> optimization pitfall.

And if you think that is possible, then I have a bridge to sell you.

There are some undefined behaviours in C that should be eliminated, as
they could be detected strictly at compile time with more sophisticated
tools and more advanced sharing of information between separate
translation units. For example, it should be possible to eliminate
mixups of parameter types and numbers for functions defined in one unit
and called in another. (It would help if old-style unprototyped calls
were banned, as they are in C++.)

There are some undefined behaviours in C that cannot be eliminated.
Accessing data via an invalid pointer, such as running over the end of a
buffer, is always going to be undefined. With enough effort and
inefficiencies, run-time checking can reduce these a bit, but not remove
them. You can't ever get all cases, no matter how inefficient you are
willing to be.

There are some things (mostly unspecified or implementation dependent,
rather than undefined behaviour) that could be tightened if you are
happy to restrict the targets for the language. C is used on many odd
systems, but a new C-like language could easily fix the sizes of integer
types and that sort of thing.

And there are undefined behaviours which could be given definitions, but
doing so is not actually a positive thing. The prime example is signed
integer overflow. Since overflowing your integers is almost always an
error in the code anyway, there are no benefits in giving it defined
behaviour. On the other hand, defining its behaviour restricts many
optimisations (especially simplification of expressions) and hinders the
compiler from warning you about said errors. A possible middle ground
would be to say that signed integer overflow gives unspecified results -
operations will always give you an integer, but if there is no correct
result, then any integer will do. This will allow many optimisations to
be done while blocking the most surprising ones (like time-travelling
optimisations).

Note also that just as executing undefined behaviours is always a bug in
the user code, bugs in the user code are also undefined behaviours when
you take a wider view - and good luck designing a programming language
in which no one has bugs in their code.

>
> For instance, don't have it so that the order of evaluation of
> function or operator arguments is unspecified. If you allow side-effects
> in expressions, specify when those effects take place in relation to
> everything else in the expression.

That is unspecified behaviour, not undefined behaviour. It could be
specified in the language, but that might limit certain optimisations -
you'd have to be careful.

>
> If you have clear ordering rules, then you honor them when optimizing:
> you rearrange the user's calculation order only when it can't possibly
> make a difference to the result that is required by the defined
> order.

Certainly. Optimisations should not change the meaning of correct code.

>
> Reordering arithmetic calculations has pitfalls. There are n! orders
> for adding together n numbers. Under floating-point, these all
> potentially return different results even if nothing overflows.
> You can't blindly rely on arithmetic identities to hold.
>

One of the advantages of signed integer arithmetic being undefined in C
is that for those types, you /can/ blindly rely on arithmetic
identities. It is a lot harder if you have defined behaviour.

IEEE floating point has complicated rules for rounding and ordering,
severely restricting optimisations. Many compilers have switches to
enable a "looser" mode for floating point which can give massive
speed-ups in fp-heavy code.

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Optimization techniques

David Brown <david.brown@hesbynett.no>Tue, 23 Apr 2019 09:18:28 +0200

David Brown <david.brown@hesbynett.no>
Tue, 23 Apr 2019 09:18:28 +0200