Re: Optimization techniques and undefined behavior

Andy Walker <anw@cuboid.co.uk>
Sat, 4 May 2019 10:45:19 +0100

          From comp.compilers

From: Andy Walker <anw@cuboid.co.uk>
Newsgroups: comp.compilers
Date: Sat, 4 May 2019 10:45:19 +0100
Organization: Not very much
References: 19-04-021 19-04-023 19-04-037 19-04-039 19-04-042 19-04-044 19-04-047 19-05-004 19-05-006 19-05-016 19-05-020 19-05-024
Keywords: design, errors, comment
Posted-Date: 04 May 2019 18:31:17 EDT
Content-Language: en-GB

On 03/05/2019 23:10, Bart wrote:
> I just haven't found overflow on numbers the huge problem that people
> say it is.
> I've spent 10 minutes adding overflow checks (for + - * ops on signed
> int64) on my latest compiler, and tested a range of applications.
> Nothing triggered it, except a tiny part of a compiler [...].


Well, of course it *shouldn't* be triggered at all in tried and
tested code. The value comes during development, when the bug is found
as it occurs, not when you eventually print some results and they're
not what you expect. Overflows are also less likely if 64-bit integers are
your default [many of us were brought up on 16- or 32-bit integers]
and if your applications are not working out [eg] large combinatorial
problems. Not just integer overflow, of course; also floating point.
In my days as an astrophysicist, it was easy to generate some very
large numbers from things like the mass of a star [in grams], distance
to a planet [cm], the speed of light [cm/s], .... We regularly found
bugs in the floating-point accumulator, because it hadn't been that
well tested with numbers around 10^100 [or 10^-100].
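
To illustrate the point about narrower integers and combinatorial
problems, here is a minimal sketch in C, assuming a GCC- or
Clang-compatible compiler (the `__builtin_mul_overflow` builtin is
specific to those compilers). The helper name `checked_factorial32` is
hypothetical, not something from the thread. It reports the step at
which the running product stops fitting in 32 bits, rather than
letting a wrong value surface later in the printed results.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes n! in 32-bit signed arithmetic.
   Returns 0 on success and stores n! in *result.  Otherwise returns
   the first k for which k! no longer fits in an int32_t, i.e. the
   point where the overflow actually occurs. */
int checked_factorial32(int n, int32_t *result)
{
    assert(n >= 0 && result != 0);

    int32_t acc = 1;
    for (int k = 2; k <= n; k++) {
        /* GCC/Clang builtin: returns true when the exact product
           does not fit in the destination type. */
        if (__builtin_mul_overflow(acc, (int32_t)k, &acc))
            return k;
    }
    *result = acc;
    return 0;
}
```

With 32-bit integers the check fires at k = 13, since 12! is
479001600 but 13! exceeds 2^31 - 1. With 64-bit integers the same loop
would run cleanly up to 20!, which is one reason such checks seldom
trigger when 64 bits is the default.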


> The overflow check I did above is just one x64 instruction (jo <addr>)
> after each add, sub or mul. Is that the sort of hardware support you mean?


Well, it's one way, tho' it may be significantly expanding the
size of your code and its run time [depending on the application]. The
better way, really, is for overflow to "trap", which costs you virtually
nothing in either code or execution. If an overflow flag is "persistent"
then you may need to clear it before expressions as well as testing it
during and after them.
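
The "persistent" flag discipline can be sketched in software, again
assuming a GCC- or Clang-compatible compiler for the
`__builtin_*_overflow` builtins. The global `overflow_flag` and the
helpers `add64`, `mul64` and `eval_checked` are hypothetical names
standing in for a sticky hardware flag: each operation can only set
the flag, so it has to be cleared before the expression and tested
after it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software stand-in for a "persistent" overflow flag:
   operations may set it, but nothing clears it implicitly. */
static bool overflow_flag = false;

static int64_t add64(int64_t a, int64_t b)
{
    int64_t r;
    overflow_flag |= __builtin_add_overflow(a, b, &r);
    return r;
}

static int64_t mul64(int64_t a, int64_t b)
{
    int64_t r;
    overflow_flag |= __builtin_mul_overflow(a, b, &r);
    return r;
}

/* Evaluates a*b + c.  Returns true if no step overflowed. */
bool eval_checked(int64_t a, int64_t b, int64_t c, int64_t *out)
{
    assert(out != NULL);

    overflow_flag = false;            /* clear before the expression */
    *out = add64(mul64(a, b), c);     /* any step may set the flag   */
    return !overflow_flag;            /* test once, afterwards       */
}
```

Testing once per expression rather than once per operation trades a
less precise report of where the overflow happened for fewer
conditional branches. A trapping overflow would need neither the
clear nor the test.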


[...]
> C is just a mess; it has arrays of sorts, but people generally use raw
> pointers without associated bounds. Maybe that's one reason why your C
> didn't have it. Or did it somehow manage it if enabled?


This isn't really a problem with C, the language. It's clear in
the reference manual right back to K&R C and in the various standards
that pointers always have associated bounds. It's just that the K&R
compiler with Unix didn't check, perhaps for decent reasons, and that
the relatively few attempts to produce compilers that do have not been
all that successful. You can't do the checking unless every pointer
"knows" which object it is supposed to be pointing into, which rather
implies "fat" pointers, which makes all pointer operations take longer
and require more code. It very likely means that arrays too need to
be implemented differently, with proper descriptors. As discussed over
in "comp.lang.misc" recently, that's not a Bad Thing; it gives you
extra facilities virtually free, as well as the added security. But
it does result in larger executables and slower execution -- unless you
really do have hardware support [cf Burroughs] -- so historically it's
not been popular.
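
As a rough sketch of what a "fat" pointer involves, here is a version
restricted to arrays of int. The type `fat_int_ptr` and the functions
`fat_from_array`, `fat_advance` and `fat_load` are hypothetical names
chosen for illustration; a checking compiler would generate something
of this shape for every pointer operation, which is where the extra
code and time come from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat" pointer: an address plus the bounds of the
   object it is supposed to be pointing into. */
typedef struct {
    int   *base;  /* start of the underlying object             */
    size_t len;   /* number of elements in that object          */
    size_t idx;   /* current position; idx == len is one past   */
} fat_int_ptr;

fat_int_ptr fat_from_array(int *a, size_t n)
{
    assert(a != NULL);
    fat_int_ptr p = { a, n, 0 };
    return p;
}

/* p + k: succeeds only if the result stays within [0, len]. */
bool fat_advance(fat_int_ptr *p, ptrdiff_t k)
{
    if (k >= 0) {
        if ((size_t)k > p->len - p->idx)
            return false;
        p->idx += (size_t)k;
    } else {
        size_t back = (size_t)0 - (size_t)k;   /* magnitude of k */
        if (back > p->idx)
            return false;
        p->idx -= back;
    }
    return true;
}

/* *p: succeeds only if p points strictly inside the object. */
bool fat_load(fat_int_ptr p, int *out)
{
    if (p.idx >= p.len)
        return false;
    *out = p.base[p.idx];
    return true;
}
```

The struct is three words where a raw pointer is one, and every
advance or load carries a comparison, which is the cost described
above in the absence of hardware support.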


I no longer recall how tight a squeeze K&R C was on the 11/34;
we relatively soon upgraded to an 11/70, with full 64K instruction and
64K data spaces, which would have been ample.


--
Andy Walker,
Nottingham.
[The K&R C compiler fit in about 12K words, but there was a lot of
incentive to keep the code small for other reasons. Re hardware
bounds checking, take a look at the MPX feature in recent Intel
processors which more or less puts fat pointers into hardware.
No idea how widely used it is. -John]

