Re: specifications (was Re: Languages: The Bigger the Uglier)

From: bobduff@world.std.com (Robert A Duff)
Newsgroups: comp.compilers
Date: 27 Mar 1996 00:16:31 -0500
Organization: The World Public Access UNIX, Brookline, MA
References: 96-02-226 96-03-164 96-03-169 96-03-182
Keywords: standards

Jeff Jackson <jgj@ssd.hcsc.com> wrote:
>Perhaps I'm being incredibly dense here (wouldn't be the first time),
>but I don't see why the "standard" implementation must execute all
>possible interleavings. It would only have to be random. Thus if you
>have a program that works on compiler A, but not on compiler B because
>of a dependency on the implementation's execution order (I'm not
>familiar enough with Ada to provide a concrete example), ...


Well, you have to define what it means for the program to "work".
That is, how do you *know* it works on compiler A?


Suppose, on compiler A, the program prints out "37", and on compiler
B, it prints out "42". Now, remember, we're not trying to determine
whether the *program* is good or evil -- we're trying to say something
about the correctness of the compilers. Let's say compiler A is
defined to be the "standard" compiler, so we assume that "37" is the
"right answer". (Our moderator pointed out how dubious that
assumption is, but let's assume that anyway.)


The problem is that if the language has any non-determinism, then we
can't be sure whether "42" is a *wrong* answer, or is just a different
*right* answer. "37" isn't necessarily *the* right answer -- maybe
it's just *a* right answer. If compiler B prints out "37", then we
know it matches compiler A, so we're happy. But if it prints out
"42", we have no useful information. That's why I'm disputing the
original assertion, which was that you can tell for sure whether any
given compiler is correct for any given program, just by comparing the
results to the standard implementation. I'm not saying that a
standard implementation is a bad idea -- it can be useful -- I'm just
saying it doesn't totally solve the problem of unambiguously defining
programming language semantics.
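
Just to make that concrete, here's a sketch (in Ada, with made-up
names) of a program where "37" and "42" are *both* right answers,
because the order in which the operands of "+" get evaluated is not
specified:

    with Ada.Text_IO; use Ada.Text_IO;

    procedure Demo is
       Base : Integer := 37;

       function Bump return Integer is
       begin
          Base := Base + 5;   -- side effect on the other operand
          return 0;
       end Bump;

       function Read return Integer is
       begin
          return Base;
       end Read;
    begin
       --  If Read is evaluated first, this prints 37; if Bump is
       --  evaluated first, it prints 42.  Both answers conform.
       Put_Line (Integer'Image (Read + Bump));
    end Demo;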


>...running the
>program with the standard compiler would tell you that the program is
>not standard conforming and thus compiler B is not broken.


In any case, the fact that the program is not standard conforming does
*not* imply that compiler B is correct. It just means we don't know,
for this particular case.


>> True. But what if the language allows pass-by-reference or
>> pass-by-copy, depending on the whim of the compiler writer, but
>> neither one is considered "wrong".
>
>I'm going to make some guesses about what you mean by this.


Well, you guessed wrong. :-) I'll explain more below.


>...I'm
>guessing that you mean the standard states that if you make a call like:
>
> f(a)
>
>Then function 'f' should not reference the actual parameter 'a' in any
>other way than via the formal parameter (after the formal parameter
>has been modified) since the value of 'a' would be undefined at that
>point because modifying the formal may modify the actual immediately,
>or not until 'f' returns, depending on the implementation choice. I
>don't see how this leads to combinatorial explosion. I would
>implement variables with descriptors, where one bit of the descriptor
>is "I'm undefined" and update that appropriately at run time.


Yes, if that were the semantics -- the program is *wrong* if it does
this evil thing -- then, yes, you can write a compiler that detects
the problem. It's not quite as simple as you say above, since you
have to worry about aliasing between records and their components, but
still, I agree, it could be done (inefficiently -- but we agreed that
the standard implementation doesn't have to be efficient).
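
Here's roughly what I imagine such a checking scheme looking like (a
sketch only, with invented names -- a real one would need this for
every type, plus the record/component aliasing I mentioned):

    package Checked is
       type Descriptor is record
          Value   : Integer := 0;
          Defined : Boolean := False;   -- the "I'm undefined" bit
       end record;

       Undefined_Use : exception;

       procedure Write (D : in out Descriptor; V : Integer);
       function  Read  (D : Descriptor) return Integer;
    end Checked;

    package body Checked is
       procedure Write (D : in out Descriptor; V : Integer) is
       begin
          D.Value   := V;
          D.Defined := True;
       end Write;

       function Read (D : Descriptor) return Integer is
       begin
          if not D.Defined then
             raise Undefined_Use;   -- detect the erroneous read
          end if;
          return D.Value;
       end Read;
    end Checked;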


But the semantics of Ada is not as you stated above. It is *not*
wrong to use aliasing in this evil way. It just introduces some
non-determinism. Once the global has been poked, the semantics of
reading the parameter (in Ada) is that you get the new value, or the
old value, or the program raises a particular exception.


>Now, if you had mentioned order of evaluation of operands of binary
>operators, I can see how that could be a combinatoric issue:
>
> (f()*g())*(h()*i())


OK, that's a better example.


>If the language says it's undefined which side of '*' gets evaluated
>first, one would ideally want to try all 24 orders. Instead, I would
>pick a random order every time. Either at run time or at compile
>time. While a single run does not tell you anything, you can
>determine a program is standard conforming with a given certainty by
>doing enough runs.


But you've just moved the non-determinism from the standard trying all
possibilities, to the user running the standard implementation many
times. Either way, you have to run the thing at least 24 times, in
your above example. And if the result of the above example is
combined in some way with another non-deterministic thing, then it's
24 times whatever. And so on.
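
To put a rough number on it (a sketch, and it assumes the
implementation picks among the 24 orders uniformly at random, which
nothing guarantees): the coupon-collector argument says you need
around K ln K runs just to have *seen* each of K orders once:

    with Ada.Text_IO;
    with Ada.Numerics.Elementary_Functions;

    procedure Runs_Needed is
       use Ada.Text_IO;
       use Ada.Numerics.Elementary_Functions;
       K : constant Float := 24.0;  -- orders in (f()*g())*(h()*i())
       --  Expected runs to observe all K orders at least once:
       --  K * (ln K + Euler's constant), about 90 here -- and that
       --  multiplies for each independent non-deterministic site.
       Expected : constant Float := K * (Log (K) + 0.5772);
    begin
       Put_Line (Float'Image (Expected));
    end Runs_Needed;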


>I suspect a (good) side effect of this discipline would be the
>elimination of undefined behaviours from standards.


I agree that languages should not have so many
undefined/non-deterministic behaviours. They're mostly there to allow
optimizers to do their thing. But I think you could achieve the
desired efficiency by defining the language so that the optimizer
knows more about the program, rather than letting it assume false
things about the program. (E.g. don't allow order of evaluation to be
random, but instead allow the compiler to know about the side effects
that might constrain order of evaluation.) Some people think
non-determinism is elegant. Perhaps, but I think it should be
minimized, to make debugging and porting easier.


However, I don't think you can entirely eliminate non-determinism. It
seems to me that in any language that supports parallelism or
timing-dependent operations, you're not going to be able to nail down
the exact behavior of the program. Consider "delay 1.0;" in Ada,
which means the current task waits 1.0 seconds (approximately --
actually, 1.0 seconds is a lower bound). Clearly, parallelism and
timing are important -- if they're not defined in your language,
you'll have to use some threads library, which doesn't solve the
problem but merely moves it to the library.
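
For example, here's a sketch showing the only thing the language can
promise about "delay":

    with Ada.Text_IO;  use Ada.Text_IO;
    with Ada.Calendar; use Ada.Calendar;

    procedure Delay_Demo is
       Start : constant Time := Clock;
    begin
       delay 1.0;
       --  The language promises only a lower bound: this prints
       --  some duration >= 1.0, and it can differ on every run.
       Put_Line (Duration'Image (Clock - Start));
    end Delay_Demo;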


>The point another person made about the real world being
>Cray/DEC/IBM's implementation is well taken. I've always referred to
>the situation as there being a "written" standard and an "oral"
>standard (the situation being analogous to ancient Jewish legal
>practice where there was the written Law in the Torah, and the oral
>law, which eventually was written down in the Talmud).


Yeah.


>Another person noted that he doubted the standard implementation would
>be well debugged. Even if not, I think forcing the committee to have
>to actually implement what they are standardizing would go a long way
>towards removing ambiguity from the prose standard, even if the
>"standard" implementation proved to be a useless in all other aspects.


I agree wholeheartedly!


- Bob
--

