Re: Optimization techniques and undefined behavior

Bart <bc@freeuk.com>
Fri, 3 May 2019 00:48:28 +0100

From comp.compilers

Related articles
[14 earlier articles]
Re: Optimization techniques and undefined behavior david.brown@hesbynett.no (David Brown) (2019-05-02)
Re: Optimization techniques and undefined behavior david.brown@hesbynett.no (David Brown) (2019-05-02)
Re: Optimization techniques and undefined behavior 847-115-0292@kylheku.com (Kaz Kylheku) (2019-05-02)
Re: Optimization techniques and undefined behavior bc@freeuk.com (Bart) (2019-05-02)
Re: Optimization techniques and undefined behavior bc@freeuk.com (Bart) (2019-05-02)
Re: Optimization techniques and undefined behavior auriocus@gmx.de (Christian Gollwitzer) (2019-05-02)
*Re: Optimization techniques and undefined behavior bc@freeuk.com (Bart)* (2019-05-03)**
Re: Optimization techniques and undefined behavior martin@gkc.org.uk (Martin Ward) (2019-05-03)
Re: Optimization techniques and undefined behavior anw@cuboid.co.uk (Andy Walker) (2019-05-03)
Re: Optimization techniques and undefined behavior david.brown@hesbynett.no (David Brown) (2019-05-03)
Re: Optimization techniques and undefined behavior bc@freeuk.com (Bart) (2019-05-03)
Re: Optimization techniques and undefined behavior bc@freeuk.com (Bart) (2019-05-03)
Re: Optimization techniques and undefined behavior anw@cuboid.co.uk (Andy Walker) (2019-05-04)
[41 later articles]

| List of all articles for this month |

From:	Bart <bc@freeuk.com>
Newsgroups:	comp.compilers
Date:	Fri, 3 May 2019 00:48:28 +0100
Organization:	virginmedia.com
References:	19-04-021 19-04-023 19-04-037 19-04-039 19-04-042 19-04-044 19-04-047 19-05-004 19-05-006
Injection-Info:	gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="8599"; mail-complaints-to="abuse@iecc.com"
Keywords:	arithmetic, errors, design, comment
Posted-Date:	02 May 2019 21:39:25 EDT
Content-Language:	en-GB

On 02/05/2019 11:29, Andy Walker wrote:
> On 01/05/2019 13:53, Bart wrote:
>> If you have two unknown values A and B, and need to multiply, you won't
>> know if the result will overflow.
>
>    int A := ..., B := ...;
>    int C := ( A = 0 | 0 |: abs B <= maxint % abs A | A*B | error (...);
> 0 );

That sounds unmanageable (think about checks here: a*a+b*(c*d), and how
you might proceed from the error if not aborting) and inefficient if one
multiply turns into half a dozen operations including a divide.

If such overflow checks need to be routinely done, then it really needs
language support. Otherwise it's probably best handled by a guard
function similar to what someone posted:

          if multiply_overflow(a,b) then
                    print "Overflow"
          else
                    c := a * b # requires that a and b haven't changed
          end

This at least isolates the actual expression we want and leaves it readable.

>     Of course, in the old days, compilers used to build in range
> checks on array indices and overflow checks on all arithmetic.

Did they? Certainly not any of mine (not really practical on tiny 8-bit
computers on which both the compiler and the program being developed had
to run).

    A few
> still do, esp interpreters, but the God of Speed dictates that most
> languages in most circumstances don't. We see the results in the huge
> amount of malware that exploits that failure.

I don't know how much that would help. And I think that if a program can
go seriously wrong through unchecked input, then that's a failure in
proper validation. It's rather sloppy to rely on a runtime check put
their by a compiler.

During development, yes, but perhaps not after a program is supposed to
be working.

(My interpreters (attached to apps) did do a number of runtime checks,
but one difference there is that users could write their own programs.
So that constitutes user input and therefore can't be fully trusted.)

>> In the example posed, you have the additional problem that the input can
>> be this:
>>     P5
>>     389000000000000000000000000000 9200000000000000000000000000
>> with both dimensions exceeding int64.
>
>     My own favourite language will throw an "on value error" exception
> if you try to read those values [or any other unsuitable strings] into an
> integer variable. By default, that will terminate your program with
> suitable error messages/diagnostics, but you can substitute your own
> "on value error" procedure if you want to print a "Don't be daft, please
> type sensible values" message and try again.

My favourite language would actually return those numbers as a type that
doesn't have any meaningful 'intmax' value:

          sreadln(
            "389000000000000000000000000000 9200000000000000000000000000")
          read a, b
          println a, a.type
          println b, b.type

Output is:

          389000000000000000000000000000 <longint>
          9200000000000000000000000000 <longint>

So it gets past that hurdle, but then might be obliged to try creating
an image with 3578800000000000000000000000000000000000000000000000000000
pixels.

(I'm in the process of incorporating such a 'bignum' type into my main
language. It's handy for UI code where performance doesn't matter.)

[There have been plenty of compilers that did bound checking. Back in
the 1960s and 1970s the WATFOR Fortran compilers, originally for the
7040 and later IBM 360/370 and later PDP-11 did bound checking and
also checks for uninitialized variables.

IBM had two PL/I compilers, the checkout compiler that generated
interpreted code with extensive runtime checks and the optimizing
compiler that generated fast machine code. It was possible if painful
to compile part of your program with one and part with the other and
link the code together. Oh, and each compiler ran in 44K bytes of
RAM. Take that, 8-bit micros. -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Optimization techniques and undefined behavior

Bart <bc@freeuk.com>Fri, 3 May 2019 00:48:28 +0100

Bart <bc@freeuk.com>
Fri, 3 May 2019 00:48:28 +0100