Re: Definable operators

Craig Burley <burley@tweedledumb.cygnus.com>
13 May 1997 22:49:26 -0400

          From comp.compilers

From: Craig Burley <burley@tweedledumb.cygnus.com>
Newsgroups: comp.compilers
Date: 13 May 1997 22:49:26 -0400
Organization: Cygnus Support
References: 97-03-037 97-03-076 97-03-112 97-03-115 97-03-141 97-03-162 97-03-184 97-04-027 97-04-095 97-04-113 97-04-130 97-04-164 97-05-053 97-05-119 97-05-141
Keywords: syntax, design

Craig Burley <burley@tweedledumb.cygnus.com> asks:
> > Exactly _where_ do we disagree? ...




Dave Lloyd <Dave@occl-cam.demon.co.uk> writes:
> Well your last paragraph says it all...


> > Until we get away from seeing technical stupid-pet-trick stuff like
> > operator overloading as our salvation, we won't focus (as an
> > industry) on solving the real problems, which involve human factors
> > engineering, linguistic ergonomics -- basically, giving people
> > languages that let them say what they _know_ about their programming
> > problem, then _separately_ specify, where necessary, how to
> > translate those expressions into solutions (such as
> > implementations). Operator overloading is one of several serious
> > rat-holes we've gone down while bumbling towards the goal. --
>
> Operator overloading is NOT about lazy typing as you suggest.


I don't see where I suggested that.


I've been objecting to operator overloading being thought of as the
arbitrary use of lexemes for whatever the programmer thought they'd be
convenient for at the time of writing some code. E.g. "hey, I'd like
to do concatenation, I think I'll use `+' for that -- or `/'".


And, by extension, I've pointed out that, as long as we think of
operator overloading as one of several technical cure-alls, we won't
make ourselves design _proper_ language constructs to serve the
purposes so crudely served by things like overloading. E.g. C++ has
no "concatenate" operator, even though as a bit-twiddling programmer
(usually in C these days) I've often wanted it on integers. (Writing
assemblers, disassemblers, and so on, it'd be useful to actually
express what I mean using a real concatenate operator, instead of
using macros that did shifting, and so on.)
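
To make the shifting-macro complaint concrete, here is a rough sketch in C++ of what a named concatenate on bit fields might look like. The names `concat_bits' and `encode', and the 4-bit/12-bit instruction layout, are invented for this illustration; neither C nor C++ offers such an operator.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: concatenate two bit fields, placing `hi` directly
// above the low `lo_width` bits of `lo`.
inline std::uint32_t concat_bits(std::uint32_t hi, std::uint32_t lo,
                                 unsigned lo_width) {
    assert(lo_width < 32);  // shifting a 32-bit value by 32 or more is undefined
    const std::uint32_t lo_mask = (std::uint32_t{1} << lo_width) - 1u;
    return (hi << lo_width) | (lo & lo_mask);
}

// Made-up 16-bit instruction word: a 4-bit opcode followed by a 12-bit operand.
inline std::uint32_t encode(std::uint32_t opcode, std::uint32_t operand) {
    return concat_bits(opcode, operand, 12);
}
```

A call such as `encode(0xA, 0x123)' yields 0xA123. The caller says "concatenate" and the shifting and masking stay out of sight, which is roughly what a real operator would buy.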


> It is not a "technical stupid-pet-trick". It is a powerful tool to
> reduce the apparent complexity of a problem as I first argued.


I should have been more clear -- I meant operator overloading as a
technical stupid-pet-trick in the sense of going beyond what an
operator would _normally_ mean. E.g. `+' means add -- the fact that
you can make it mean pretty much anything in C++ is a technical
stupid-pet-trick that results in people thinking it "obviously" means
concatenate when applied to character strings (which, as I've pointed
out before, is simply wrong -- "1" + "2" doesn't obviously mean "12",
to some people it'd obviously mean "3").
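
The two readings can be put side by side in a small C++ sketch. The function names are invented for the illustration, and the second one assumes both strings spell decimal integers.

```cpp
#include <cassert>
#include <string>

// Reading 1: `+` on strings as concatenation, which is what std::string does.
inline std::string plus_as_concatenation(const std::string& a,
                                         const std::string& b) {
    return a + b;
}

// Reading 2: `+` on strings as addition of the numbers they spell.
// Hypothetical helper; assumes both arguments hold decimal integers.
inline std::string plus_as_addition(const std::string& a,
                                    const std::string& b) {
    return std::to_string(std::stoi(a) + std::stoi(b));
}

inline void demo_two_readings() {
    assert(plus_as_concatenation("1", "2") == "12");
    assert(plus_as_addition("1", "2") == "3");
}
```

Both results are defensible, which is the point: nothing about `+' itself tells the reader which one the library author picked.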


> I refuse to acknowledge any difference in principle between
> overloading a + b, add (a, b) or a.add(b) yet earlier you defend
> overloading of itself.


Go ahead and refuse to acknowledge them, but they exist. E.g. `add(a,
b)' doesn't imply conversion of type to most people the way `a + b'
does, e.g. if a is an integer and b is a double-precision
floating-point value.


In _principle_, `a + b' means "add a and b", and anyone who
understands the basics of math (expressed in Western notation) knows
that. In _principle_, `add(a, b)' means "the result of function `add'
given arguments a and b", and here, most people wouldn't be too
surprised if "add" inserted element a into list b -- which means list
b gets modified. And most of _those_ people _would_ be surprised if
`a + b' did that. In any case, few people would assume that a would
be first converted to b's type, or vice-versa, before the function
ever started up. Again, many of those expect that of `a + b'.
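
A short C++ sketch of the two expectations. The function `add' here is a made-up example of a list insertion, not anything from a real library.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// With infix `+`, C++ applies the usual arithmetic conversions: the int
// operand is converted to double before the addition is performed.
static_assert(std::is_same<decltype(1 + 2.5), double>::value,
              "int + double yields double");

// Hypothetical function named `add`: inserts element a into list b,
// modifying b. Nothing in the call syntax rules this reading out.
inline void add(int a, std::vector<int>& b) {
    b.push_back(a);
}

inline void demo_expectations() {
    int a = 1;
    double b = 2.5;
    assert(a + b == 3.5);     // a was converted to double first

    std::vector<int> list{10, 20};
    add(a, list);             // the second argument is changed
    assert(list.size() == 3 && list.back() == 1);
}
```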


Infix notation is more than just about saving typing. It's also about
encapsulating some common assumptions about meaning and implementation
that are necessarily _not_ valid about the more general function-call
mechanism. (Not that I agree with all the _particular_ assumptions we
do and don't make about infix vs. function-invocation vs. message-
passing, but there are substantial differences between infix and these
other notations that exist in a substantial number of programmers
today, and these differences are more than just implementation details
-- they involve principle.)


But I am interested in knowing what makes you think there's absolutely
no difference worth acknowledging between overloading `a + b', `add(a,
b)', and `a.add(b)' while you _would_ see a difference between these
and `a = b'. What would that difference be? Or, if none, how about
between these and `a * b', or between these and `c + d'? At what
point does the notation legitimately mean something _principally_
different to you, and why?


(BTW, I've never been comfortable with the asymmetry of `a.add(b)' and
similar. Why does one operand get sent a message instead of the other
one, when they have equal weight in the computation and in the meaning
of the language expression? Seems to me `a.add(b)' doesn't imply
equal conversion weight as does `a + b' and certainly that it doesn't
imply commutativity as `a + b' should to everyone [except C++
programmers ;-].)
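
The asymmetry shows up directly in C++ conversion rules. In the sketch below, `Length' is a type made up for the illustration, with an implicit conversion from double.

```cpp
#include <cassert>

// Hypothetical numeric wrapper with an implicit conversion from double.
struct Length {
    double meters;
    Length(double m) : meters(m) {}   // implicit on purpose

    // Member form: the left operand is the receiver and must already be a
    // Length; only the argument may be implicitly converted.
    Length add(const Length& other) const {
        return Length(meters + other.meters);
    }
};

// Non-member infix form: both operands are ordinary parameters, so either
// one may be implicitly converted.
inline Length operator+(const Length& a, const Length& b) {
    return Length(a.meters + b.meters);
}

inline void demo_asymmetry() {
    Length a(1.5);
    assert(a.add(2.0).meters == 3.5);   // fine: 2.0 converts to Length
    // (2.0).add(a);                    // does not compile: double has no members
    assert((a + 2.0).meters == 3.5);    // right operand converted
    assert((2.0 + a).meters == 3.5);    // left operand converted
}
```

So in this one language, at least, the infix form treats the operands with equal weight and the message-send form does not.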


> C++ may make a pig's ear of it but the language is an abomination
> that I would never defend anyway. Algol 68 requires operators to be
> type-unambiguous. Fortran 90 requires operators to take intent(in)
> arguments but of course as with most modern imperative languages I
> can pass a read-only pointer and modify an indirected value.
> Haskell with one of the more powerful forms of overloading around is
> *functional* so most of your arguments fly out of the window. There
> is little doubt in my mind that functional programming is more
> 'proper' than imperative programming, yet I am not yet ready to
> trade in Algol 68 for Haskell because in this age imperative
> techniques are still too damned useful to let go of.


I have the feeling you're objecting to stuff I'm not trying to say,
and we're both against the kind of "operator overloading is a nifty
trick that is always a great way to solve all sorts of problems"
sentiment that seems to have inspired, and been further inspired by,
C++.


> Fortran 90 also allows the programmer to specify *where* an operator
> has been acquired from, e.g., use operator(+) from linear_operators
> (OK you can't specify which version, but that could easily be
> repaired - the RS Algol 68 compiler did so in the 70s, but is not
> much of a problem anyway with well designed modules). I have argued
> in the past that both sides of a module contract, export and import,
> should be specified explicitly. You must also bear in mind that the
> requirements of prototyping and delivering production code are quite
> different and I and others have argued in the past for a tiered
> language where some features are deleted in production code (e.g.,
> default public in F90).


I agree with all that as well.


> I could give a detailed riposte to many of your other points, but I
> will spare our other readers and lump them into one more point:
> overloaded operators must be part of a well balanced language and
> should not be used inappropriately (and this is the programmer's
> decision really).


Exactly what I've been trying to say. I don't understand why you
think I've been saying something different, and why you also think
I've been defending operator overloading (which you seem to suggest
above).


> Where we agree is that there is still plenty of distance to cover in
> the software engineering effort. I suspect that you have been stung
> badly by C++ and are now reacting against all the tools that had
> blades sharp enough to cut you thinking instead to put guards around
> them. Fine, but do remember to leave some utility in the remaining
> tool. Training to use a tool safely is as important as training to use
> the tool for its purpose. I still prefer to leave such decisions in
> the hands of the humans - coding standards, project managers, peer
> code review, etc., while you are after the same kind of technical fix
> that you deplore when used to increase the expressivity of a language.


You've made a lot of extremely wrong guesses. I've never used C++, I'm
not advocating technical guards as the solution to this problem, I'm
not advocating small languages.


Perhaps you've read other people's posts and thought I wrote them?


I'd just rather spend my time getting real programming done (meaning
writing documents that other humans, not just compilers, can read)
instead of dealing with the immense complexity we've got as a result
of worrying only about typing just enough to get the computer to do
what we want. A powerful, _well-designed_ language would let me
do that. I have no interest in Smalltalk (having explored it),
little in Java (though it'll likely be "no interest" once I do explore
it), and so on, because while they're nifty "little languages", they're
not enough better at letting me _express_ what I mean than what I'm
already using (C, typically) to make the loss of power and portability
worthwhile.


And since most of the programming I've done in my life has been on
programs having at least 100,000 lines of code, a couple of which have
had a million or so lines, and most of which have been worked on by
several programmers at the same time, I'm not really interested in toy
languages, or languages designed assuming all sorts of safety
mechanisms built in to the run-time environment.


It's my experience having done this kind of thing coupled with my
increasing understanding of language design that results in my having
known better than to get involved with, much less suggest, C++ as a
language for medium-sized (or larger) application design (such as a
compiler) -- despite the fact that I'm "starved" for some of the nifty
features of that language (and others as well).


> I prefer to leave the language itself as simple as can achieve the
> desired utility rather than burdening it with protective mechanisms
> for one main reason: languages are devised by humans and are
> themselves flawed, the larger the language, the more potential
> interactions unexpected by the designers and worse by the
> users. Fortran 90 and C++ are both appalling languages from my point
> of view as the programmer's conception of how the language works
> usually differs radically from the designers and the
> specification. Algol 68 for all its flaws succeeds by economy of
> design.


I don't know Algol 68 but I know you're right about C++ and think
you're right about Fortran 90 (though, at least, in Fortran 90 you
still have reason to be confident that `A = B + C' modifies neither B
nor C, and to yell at anyone who writes code that makes that
assumption false).


But I'm _not_ advocating "burdening [the language] with protective
mechanisms". I'm advocating _designing_ it using _linguistically_
protective mechanisms. Such as the mechanism that prevents most of us
(sane folk anyway) from designing a language that looks like C or C++
but allows "a = b + c;" to be overloaded to mean "subtract a from
every member of the structure in foo and store the resulting vector in
b starting at offset c". So, someone who _wants_ that would have to
_say_ that in the language, meaning they'd express it in a way that
would be clear to other programmers -- instead of relying on clever
low-level compiler tricks to make one short expression mean some
completely unrelated larger expression.
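
A deliberately bad C++ sketch of the kind of thing meant here, next to the say-what-you-mean alternative. The type `Table' and the function `subtract_from_each' are invented for the illustration.

```cpp
#include <cassert>
#include <vector>

struct Table {
    std::vector<int> values;
};

// What the language permits: an `operator+` that takes its left operand by
// non-const reference and changes it. Hypothetical, and bad on purpose.
inline Table operator+(Table& b, int c) {
    for (int& v : b.values) v -= c;   // "addition" that subtracts, in place
    return b;
}

// The same operation under a name that says what it does.
inline void subtract_from_each(Table& t, int amount) {
    for (int& v : t.values) v -= amount;
}

inline void demo_surprise() {
    Table b{{10, 20, 30}};
    Table a = b + 5;              // reads like a pure expression
    assert(b.values[0] == 5);     // but b was modified
    assert(a.values[2] == 25);

    Table t{{10, 20, 30}};
    subtract_from_each(t, 5);     // the modification is visible at the call site
    assert(t.values[1] == 15);
}
```

The compiler accepts both. Only the second tells another programmer what is going on without a trip to the definition.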


The protective mechanisms I'm talking about are "static" and applied
at _language design_ time -- that is, once, early on, by people taking
time to Get It Right. The resulting language might well include some,
perhaps many, protective mechanisms that kick in at compile time, at
link time, at run time, and so on, of course. But it's the _language
design_ issues I'm getting at in all these posts. I'd rather have
language design done by people who understand language design. (Just
as I'd rather have the code that runs the nuclear power plant down the
street designed by people who understand the relevant engineering
disciplines, not by people who have learned how to cope with
needlessly complex languages.)


So I have no inherent problem with _highly_ complex languages, if the
complexity includes underlying implementations, libraries, debugging
environments, and so on. I'm asking for more thought being given to
preserving the apparent vs. actual meaning of code typically written
in the language, e.g. `A = B + C' meaning "add B and C and store the
result in A" (or, better yet, "A is known to be equivalent to the sum
of B and C", and offer a different notation for imperative
compute-and-store, e.g. `:=').


It's the technical stupid-pet-tricks I object to -- the ones that make
code typically written in the language harder to understand without
lots of context, employed because some "language designer" decided
some nifty feature (like operator overloading, or preprocessing, or
whatever) could be used to do things that _should_ require thoughtful
language design. In other words, the feature was extended well beyond
what it originally _was_ thoughtfully designed to do. Operator
overloading has existed for at least three decades in production
compilers. Only recently did it become popular among thousands of
programmers to think it was "cool" to define `+' as concatenation when
applied to strings while it still meant addition when applied to
non-strings.


(VOLATILE variables are another example I've posted about like this.
They make both Fortran and C worse when they are added to those
languages. A "real" language wouldn't have them. The
stupid-pet-trick here is "see what I can do by just making up new
attributes for variables", yet by this I do _not_ mean that attributes
are stupid-pet-tricks themselves. The proper language facility would
be something like functions and subroutines to explicitly read and
set/modify a given memory location, and the proper implementation
would be high-quality in-lining of those functions and subroutines to
achieve the same run-time effect of volatile variables, without having
the language have a feature that could change `a = b - b' to no longer
mean the same thing as `a = 0'. As a result, programmers who really
wanted the equivalent volatile variable would actually have to type `a
= read_b() - read_b();' in C or, in Fortran, "even worse", type `call
read_b(val1); call read_b(val2); a = val1 - val2'. Funny thing how
both of those latter two are _clearer_ to readers of the code than `a
= b - b' with `volatile b;' appearing somewhere earlier in the code.
And, funnier still, the Fortran version is yet clearer, since it shows
the order in which the reads actually happen -- which can be important
in cases more complicated than this -- and since it more explicitly
suggests that the values obtained might actually be different.)
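
A sketch of the accessor approach in C++. The "device register" is simulated with an ordinary variable that changes after every read, so that the effect is observable; `read_b' is the name used above and everything else is invented for the example. One C/C++ detail worth noting: the order in which the two operands of `-' are evaluated is unspecified, so writing one read per statement, as the Fortran form does, also pins down the order.

```cpp
#include <cassert>

// Stand-in for a memory-mapped device register (hypothetical; a real one
// would live at a fixed hardware address).
static volatile int simulated_register = 0;

// Explicit accessor: every call performs exactly one read of the location.
int read_b() {
    int value = simulated_register;
    simulated_register = value + 1;   // simulate the device changing after a read
    return value;
}

// The reads are spelled out one per statement, so their order is fixed and
// it is plain to the reader that the two values may differ.
int difference_of_two_reads() {
    int val1 = read_b();
    int val2 = read_b();
    return val1 - val2;
}
```

With the simulated register starting at 0, `difference_of_two_reads()' returns -1 rather than 0.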


You and I, and perhaps others, should go back and read the first
post(s) to which I responded -- my recollection is that I was
objecting to the assumption that operator overloading always was great
because it "solved" all sorts of language problems, but maybe I'm
wrong about that. I do believe overloading is helpful, but when
allowed to go beyond the bounds of reasonable use for linguistic
expression, it rapidly makes the language harder to use (even as it
makes the tool more powerful).


Note that I'm not a big advocate of "safe" language implementations.
I think they're nice, of course, but I rarely use them myself,
preferring to code my own safety-protection in a simpler language
(e.g. C) for areas I am suspicious about. Yes, it's incomplete. Yes,
I'd like an ideal language that checked everything for me infinitely
fast. But, in practice, I'd rather have a language that allowed me to
express what I mean and know about a problem, including separating
expressions of goals from designs from implementations, even if it
didn't actually implement much of the resulting expression "space".


(In a sense, safe languages can end up being plagued by technical
stupid-pet-tricks themselves. The language designers might end up
assuming there's no reason to provide certain language constructs on
the theory that, since their implementations are all going to do
safety checking, they won't actually accomplish anything. So you have
a language where you can't actually specify that a variable or
function is single-precision floating-point, because that'd only be
useful to defeat the safety checking they've provided, which they
don't want you to do. Never mind that it might be important to tell
readers of the code that single-precision floating-point is all that
is needed for a particular variable -- the designers of the language
decided that, since their compilers don't need to know this, neither
do your readers. "Besides," they say, "you can always add comments."
That's a statement that, when I hear it, confirms for me that the
speaker has probably never worked on a serious medium-to-large-scale
multi-programmer programming effort that actually was completed and
maintained over a reasonable period of time.)


In particular, a language that allows overloading of implementation of
operators like + is fine by me, but I'd consider it a better language
if it didn't allow overloading of the _meaning_ of + at the design
level. Yes, playing games at the implementation level can lead to
problems if the language doesn't "protect" me, but it can't protect me
against all sorts of other things either. With such a language, I
can be much more persuasive "converting" someone who's kludged an
implementation of + to mean concatenation than I can with someone
who's overloaded + that way in C++ (obviously, since my posts probably
have changed few readers' minds about the coolness of overloading + to
mean concatenation). The possibility that neither C++ nor my
theoretical language would ever have implementations that caught
actual attempts to implement + as concatenation is, for me, almost
irrelevant.
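
One way to picture the distinction, sketched in C++ with invented names (`Vec2', `concatenate'): a new implementation of `+' that is still addition, and a separate name for the operation that is not.

```cpp
#include <cassert>
#include <string>

// A new *implementation* of `+`: component-wise addition of 2-D vectors.
// It is still addition: commutative, with a zero element.
struct Vec2 {
    int x, y;
};
inline Vec2 operator+(Vec2 a, Vec2 b) { return Vec2{a.x + b.x, a.y + b.y}; }
inline bool operator==(Vec2 a, Vec2 b) { return a.x == b.x && a.y == b.y; }

// Concatenation gets its own (hypothetical) name rather than borrowing `+`.
inline std::string concatenate(const std::string& a, const std::string& b) {
    std::string result = a;
    result.append(b);
    return result;
}

inline void demo_meaning() {
    Vec2 a{1, 2}, b{3, 4}, zero{0, 0};
    assert(a + b == b + a);     // commutative, as addition should be
    assert(a + zero == a);      // has an identity element
    assert(concatenate("ab", "cd") != concatenate("cd", "ab"));  // order matters
}
```

Nothing in C++ enforces this discipline, of course; here it is only a convention the programmer chooses to follow.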
--
James Craig Burley, Software Craftsperson burley@gnu.ai.mit.edu
--