Re: Implementation dependent behaviour (WAS: Re: Programming language and IDE design)

Martin Ward <martin@gkc.org.uk>
Mon, 6 Jan 2014 19:23:05 +0000

          From comp.compilers


From: Martin Ward <martin@gkc.org.uk>
Newsgroups: comp.compilers
Date: Mon, 6 Jan 2014 19:23:05 +0000
Organization: Compilers Central
References: 13-11-025 13-11-028
Keywords: design
Posted-Date: 08 Jan 2014 00:55:10 EST

On Saturday 23 Nov 2013 at 07:06, Joshua Cranmer 🐧 <Pidgeot18@verizon.net>
wrote:
> 1. The size of a pointer variable.
> 2. The size of an integer variable intended to be the same size as a
> pointer.
> 3. The reinterpretation of a function pointer as a data pointer.


One purpose of a high-level language is to abstract away from the
low-level details of the machine. For example, the same source code
should work on both big-endian and little-endian machines. Similarly,
high level code should not depend on the size of a pointer.
Overwriting code with data should simply not be possible (with the
possible exception of a JIT compiler). How many security holes are
caused by buffer overflows which allow a malicious user to overwrite
code or execute data?
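
As a small illustration of the endianness point, here is a minimal
sketch in C (the function names load_be32 and store_be32 are my own,
purely for illustration). It reads and writes a 32-bit value in
big-endian byte order using only shifts on values, so it never depends
on how the host lays out an integer in memory:

```c
#include <stdint.h>

/* Decode a 32-bit big-endian value from four bytes.
   Only arithmetic on values is used, never the in-memory layout of
   uint32_t, so the result is the same on any host byte order. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Encode a 32-bit value as four bytes, most significant byte first. */
void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```

The non-portable alternative, copying the four bytes straight into an
integer variable, gives different answers on big-endian and
little-endian machines.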


A pointer is the data equivalent of a goto: using pointers one can
emulate any complex data structure (lists, trees, graphs etc.) just
as using a (conditional) goto one can emulate any control structure.
But it has long been recognised that having a suitably rich set of
control structures and eliminating gotos makes for better code.


> 4. The accuracy of non-fundamental floating point computations (e.g.,
> exp). I think it would be reasonable to constrain them, though.


I think you will find that most numerical analysts would disagree: the
whole purpose of the IEEE floating point standard is to eliminate this
class of implementation-defined behaviour.


Suppose I update my compiler to the latest version, run my regression
tests and get different results. Is there a bug in the compiler? A bug
in my code? Hardware failure? Or is it just that the compiler writers
decided to change some implementation-dependent behaviour (which I
might not even be aware of)? Detecting the possibility of such
behaviour in my program might be a non-computable problem, so I cannot
even complain that the compiler gave no warning. The real problem is
with the language definition, which allowed implementation-dependent
behaviour in the first place.


> 5. The contents of a memory location in the presence of a data race.
> 6. The correctness of floating point numbers in denormalized situations.
> 7. The order, size, etc. of functions (this is observable if you can
> take the address of a function and compare or subtract pointers).


You can also write self-modifying code: copy the code for function A
over the code for function B. Next time function B is called, the code
for function A gets executed. I would definitely regard self-modifying code
as an undesirable feature for any programming language.


> 8. The order of variables either in global memory, heap-allocated
> memory, or stack-allocated memory.
> 9. If you allow type punning, semantics that would require a specific
> endianness of types.


See above.


> The worst impacts of undefined or implementation-defined behavior are
> not because the underlying hardware is unreliable, it's because the
> optimizers gleefully trash the intent of your code in an attempt to make
> it faster. Strict aliasing in C is perhaps the worst offender in this
> regard.


Designing the language to prevent all aliasing allows all the optimisations
to take place without breaking any code.
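
To make the aliasing point concrete, here is a rough sketch in C (the
function names are hypothetical). In the first version the compiler
has to allow for the possibility that scale points into dst, so the
store to dst[i] might change *scale. In the second, the C99 restrict
qualifier promises that the two do not overlap, so *scale can be
loaded once. Note that restrict is only an unchecked promise by the
programmer: it is not the language-level prevention of aliasing I am
arguing for, and breaking the promise is undefined behaviour.

```c
#include <stddef.h>

/* dst and scale have the same element type, so the compiler must
   allow for scale pointing at an element of dst. */
void scale_may_alias(float *dst, const float *scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] *= *scale;
}

/* restrict asserts that dst and scale do not overlap, which permits
   hoisting the load of *scale out of the loop. */
void scale_no_alias(float *restrict dst, const float *restrict scale,
                    size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] *= *scale;
}
```

When the arguments really are distinct, both functions give the same
results; they can differ only when the caller passes overlapping
pointers, which the second version forbids.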


> Or, you can get extremely inefficient computation on all CPUs. Suppose
> you had a language that required you to trap on arithmetic overflows,
> and that required you to trap at a very specific place in computation.
> This language means you have to prove range analysis on all variables to
> prove that they cannot overflow before you can do any code motion or
> code elimination--effectively negating the most powerful optimizations a
> compiler can do.
>
> If you allow a little bit of undefined behavior, if you let compilers
> choose to trap before the value would be observed as overflowed, you
> again allow code elimination and code motion without the range analysis,
> while achieving very nearly the same result.


If you eliminate the code, then no trap occurs at all!
If the trap was included specifically to prevent some disaster occurring,
then the disaster would occur. So we have two cases:


(1) The language allows the compiler to trap earlier than necessary,
or even to (effectively) ignore the overflow altogether.
A programmer who *needs* to know if the operation overflowed
then has to write a lot of complex explicit tests (while somehow
trying to stop the compiler from optimising them away)
in *every* place where an overflow test is needed.
In other words: the overflow trap mechanism is useless!


(2) The language requires the trap to occur exactly when the overflow
occurs in the program as written. A programmer who needs the trap
just enables traps wherever needed: no extra code is required.
A programmer who doesn't care about overflows disables traps
where they are not needed, allowing the compiler to optimise
to its heart's content. The trap mechanism is now useful when needed
and doesn't get in the way when it is not needed.
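

To show what the explicit tests in case (1) look like, here is a
minimal sketch in C for a single signed addition (checked_add is a
name I have made up for illustration). The test has to be done before
the addition, because in C the overflowing addition itself is
undefined behaviour:

```c
#include <limits.h>
#include <stdbool.h>

/* Store a + b in *out and return true if the sum is representable
   as an int; otherwise return false and leave *out untouched.
   The range checks use only subtractions that cannot overflow. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

Every addition that matters needs a call like this (and similar but
messier functions for subtraction and multiplication), where under
case (2) the programmer would just write a + b with traps enabled.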


--
Martin


Dr Martin Ward STRL Principal Lecturer and Reader in Software Engineering
martin@gkc.org.uk http://www.cse.dmu.ac.uk/~mward/ Erdos number: 4
G.K.Chesterton web site: http://www.cse.dmu.ac.uk/~mward/gkc/
Mirrors: http://www.gkc.org.uk and http://www.gkc.org.uk/gkc

