Re: language design tradeoffs

jlg@cochiti.lanl.gov (Jim Giles)
Wed, 23 Sep 1992 21:19:24 GMT

          From comp.compilers


Newsgroups: comp.compilers,comp.human-factors
From: jlg@cochiti.lanl.gov (Jim Giles)
Organization: Los Alamos National Laboratory
Date: Wed, 23 Sep 1992 21:19:24 GMT
References: 92-09-048 92-09-130
Keywords: parse, design

jlg@cochiti.lanl.gov (Jim Giles) writes:
      In the above, you can't automatically detect the common error 'missing
      statement terminator'. In fact, if you make that common error, you will
      have a `correct' program which does something unintended - a fault which
      may be hard to find in a large program.


In article 92-09-130, nickh@CS.CMU.EDU (Nick Haines) writes:
> But in SML, making an appropriate change would mean getting rid of
> currying, and therefore of first order functions. Are you serious?


I was talking about imperative languages which have *statements*, each of
which performs at least one side-effect (or the statement is, by
definition, a no-op). In such languages, partial application of functions
(even if the languages allowed it - which I think they should in future)
should probably not be done with currying anyway. Now, as for SML (or
Haskell, which I'm more familiar with), this particular problem is one
that I *do* consider a deficiency.


> If "f x" is a valid expression, then "f x y" _must_ be, for the language
> to be at all natural. And so must be "f x y z", which immediately gives us
> the problem: "f x y z" might be missing an expression separator (;)
> between the x and the y. If the type of f is a subtype of
>
> 'a -> ('b -> 'c) -> 'b -> 'd
>
> then the typechecker (salvation of all SML hackers) will not catch this
> error, since (f x) will be a valid expression and so will (y z).


Now, in Haskell, "f x y z" would be a legal expression as you've stated.
However, it's hard to imagine what *single* error could make it into two
distinct expressions. In Haskell, the grammar doesn't allow expressions
just to follow one another. There must be something else going on. What
does it mean to have "f x; y z" in SML? Are SML expressions allowed
side-effects? If so, I'd oppose currying and juxtaposition as the
function application syntax. I'd prefer "f(x,,); y(z)" vs. "f(x,y,z)" as
the two expressions you say are difficult to distinguish. Now it's real
clear when the semicolon is left out: "f(x,,) y(z)" can be easily seen to
be wrong.
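
The point can be sketched with two toy recognizers, written here in Python
rather than SML or Haskell. This is only an illustration under stated
assumptions: the grammars are drastically simplified stand-ins (names,
commas, parentheses and semicolons only), and the function names
tokenize, parse_juxtaposed and parse_call_style are hypothetical, not
taken from any real compiler. In the juxtaposition grammar a dropped
semicolon still yields a well-formed input; in the call-style grammar it
is rejected.

```python
import re

# A token is a name or one of the punctuation marks ( ) ; ,
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[();,])")


def tokenize(src):
    """Split src into names and punctuation, skipping whitespace."""
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"bad character at column {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_juxtaposed(src):
    """Toy curried syntax: an expression is a run of names; ';' separates."""
    exprs, current = [], []
    for tok in tokenize(src):
        if tok == ";":
            if not current:
                raise SyntaxError("empty expression before ';'")
            exprs.append(current)
            current = []
        elif tok in {"(", ")", ","}:
            raise SyntaxError(f"{tok!r} is not part of this toy grammar")
        else:
            current.append(tok)
    if current:
        exprs.append(current)
    return exprs


def parse_call_style(src):
    """Toy call syntax: NAME '(' slot {',' slot} ')' with ';' between calls.

    An empty slot, as in "f(x,,)", stands for an argument left unapplied
    and is recorded as None.
    """
    toks = tokenize(src)
    calls, i = [], 0
    while i < len(toks):
        name = toks[i]
        if not (name[0].isalpha() or name[0] == "_"):
            raise SyntaxError(f"expected a function name, found {name!r}")
        if i + 1 >= len(toks) or toks[i + 1] != "(":
            raise SyntaxError(f"expected '(' after {name!r}")
        i += 2
        args, slot = [], None
        while i < len(toks) and toks[i] != ")":
            if toks[i] == ",":
                args.append(slot)      # close the current slot
                slot = None
            elif toks[i] in {"(", ";"}:
                raise SyntaxError(f"unexpected {toks[i]!r} in argument list")
            elif slot is not None:
                raise SyntaxError(f"missing ',' before {toks[i]!r}")
            else:
                slot = toks[i]
            i += 1
        if i >= len(toks):
            raise SyntaxError("missing ')'")
        args.append(slot)              # the slot before ')'
        i += 1                         # skip ')'
        calls.append((name, args))
        if i < len(toks):
            # Anything after ')' other than ';' is a detectable error.
            if toks[i] != ";":
                raise SyntaxError(f"missing ';' before {toks[i]!r}")
            i += 1
    return calls
```

With these definitions, parse_juxtaposed accepts both "f x y z" and
"f x; y z", so the grammar alone cannot tell whether a semicolon was
dropped. parse_call_style accepts "f(x,y,z)" and "f(x,,); y(z)" but
raises SyntaxError on "f(x,,) y(z)".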


> [... about long expressions, continuation markers etc ...]
>
> > Yes, I do mean the dreaded continuation marker. See my earlier article on
> > this question. Continuation is rare and no one does it unless there is no
> > other choice. What's wrong with an explicit marker which emphasizes that
> > something unusual is afoot?
>
> What's wrong is that in some languages (such as SML) _very_ long
> expressions are the norm. Not just one-liners, but twenty-liners. [...]


Tell me what syntactic entity in SML corresponds to the concept
"statement", and I'll tell you where the continuation markers belong. In
an imperative language, the choice is simple: statements are the smallest
objects which *always* cause side-effects or alter control flow or cause
declaration - which is a side-effect on the *context* anyway (the word
"always" may be deleted here if the language disallows functions from
having side-effects). So a stand-alone procedure call *must* be a
statement because it must have a side-effect (otherwise its result is
just lost). Assignment is obviously a statement, it alters the store: a
side-effect. IF, ELSE, WHILE, END WHILE, etc. are all statements, they
alter control flow. Expressions (even in C) don't *always* cause
side-effects, so they aren't statements. Declarations may be single
statements (like giving the type of a simple variable), or they can
consist of a header statement followed by a block of statements which
constitute the definition (preferably followed by an explicit end
statement as well).
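
As a rough sketch of the "EOL terminates a statement unless an explicit
continuation marker says otherwise" rule for an imperative language, the
following Python function joins physical lines into statements. The
trailing "&" marker and the name logical_statements are assumptions made
for this example only; the post does not name a particular marker.

```python
def logical_statements(source, marker="&"):
    """Join physical lines into statements.

    End of line ends a statement unless the line ends with `marker`,
    in which case the next non-blank line continues it. The marker
    character is a hypothetical choice for this sketch.
    """
    statements, pending = [], []
    for line in source.splitlines():
        text = line.strip()
        if not text:
            continue                   # blank lines carry no statement
        if text.endswith(marker):
            # Explicit continuation: keep the text, drop the marker.
            pending.append(text[:-len(marker)].rstrip())
            continue
        pending.append(text)
        statements.append(" ".join(part for part in pending if part))
        pending = []
    if pending:
        # A marker on the final line promises text that never arrives.
        raise SyntaxError("continuation marker on the last line")
    return statements
```

Because every statement ends at EOL by default, a forgotten marker shows
up as two separate (and usually malformed) statements rather than being
silently absorbed into one long one.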


Now, in Haskell *nothing* ever has a side-effect. And there's no explicit
control flow. So, the only things I would regard as statements in Haskell
are declarations - which can get very long. I would still recommend that
distinct ones be distinguishable from each other without a separator, that
such a separator be required anyway (as an unobtrusively redundant check on
correctness), and that comments should be terminated by EOL. However, for
a language like Haskell, I would relax my insistence on EOL being the
statement terminator, and on explicit continuation. This is only because,
unlike imperative languages, Haskell statements (declarations) can be
*very* long. My posted *rules* are only guidelines after all - not federal
laws - other considerations may override. All language design is an
exercise in compromise between conflicting goals.


|> [...] Here's a short example from code I've been working on lately:
|> fun skein_uncheckpoint () =
|>   case (SID.get_venari sid) of
|>     NONE => (UndoC.c_uncheckpoint();
|>              raise Skein.Abort) (* indicates bug *)
|>   | SOME SID.OTHER =>
|>       (UndoC.c_uncheckpoint();
|>        raise Skein.Abort) (* indicates bug *)
|>   | SOME (SID.UNDO p) =>
|>       (if within_undo (SID.get_parent sid) then
|>          UndoC.c_anti_inherit (p, active)
|>        else ();
|>        UndoC.c_uncheckpoint ();
|>        UndoC.restore_state p;
|>        UndoC.forget_state active
|>       )


It's interesting, but everywhere you put an EOL here delimits what *I*
would call a statement - except for the two case labels that stand alone
on lines (and, even those I would accept as beginning a block with an
empty first statement). Even the close parenthesis on the last line would
be better spelled as "end case" and treated as a statement. I would
regard the first line as a header line for the function: that is, a
declaration statement. SML looks very much like a language where I
wouldn't recommend juxtaposition as the function invocation syntax.


(Of course, your use of parentheses around the cases is analogous to the
"compound statement" stuff, which I would oppose even in a functional
language.)


My guidelines for determining what a statement *is* seem very appropriate.
Now, you *may* also commonly have longish side-effect free expressions in
SML (not evidenced here) which force a compromise on EOL. But the other
`rules' I posted still would apply. In particular, I think it's important
that the syntax be designed so that omitted statement separators
(semicolons in this case) be automatically detectable.


--
J. Giles
--

