Re: User definable operators

Jerry Leichter <leichter@smarts.com>
27 Dec 1996 23:24:38 -0500

          From comp.compilers

From: Jerry Leichter <leichter@smarts.com>
Newsgroups: comp.compilers
Date: 27 Dec 1996 23:24:38 -0500
Organization: System Management ARTS
References: 96-12-088 96-12-110 96-12-147 96-12-163 96-12-171
Keywords: design, APL

There are user-definable operators, and there are user-definable
operators. :-)


Consider APL. One thing that is often overlooked is that, as the term
is usually used, APL *has no user-definable functions*. In fact, it
has no functions at all. It has *only* operators. In APL, you can
define "functions" with 0, 1, or 2 arguments; and they can optionally
return a result. A 0-argument function which returns a result is
indistinguishable from a variable; a 0-argument function that returns
no result can only appear on its own on a line (pretty much). A
1-argument function is written before its argument (no parens are
necessary; if you put them in, they simply specify grouping as
always). A 2-argument function is written *between* its arguments,
just like any other binary operation. Since in APL all operations
have the same precedence, the usual issues of where to put
user-defined extensions don't apply. Syntactically, there is *no*
difference between A + B and A PLUS B, where PLUS has been defined as
a "dyadic" function. (*Lexically*, as always you may need spaces to
indicate word boundaries, never an issue with operations that are
written as special characters.)


APL also has higher-order operators - if $ is a dyadic operation
symbol, $/ is "reduction by $", which takes a vector and combines its
elements by placing $ between each adjacent pair. Thus +/ is
summation. In recent versions of APL, I believe, $ can also be a
user-defined dyadic function.
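
A rough analogue of reduction, sketched in Python rather than APL: the
helper name apl_reduce and the functions plus and minus are made up
for illustration. The sketch assumes APL's right-to-left evaluation,
so that reducing 1 2 3 by subtraction means 1-(2-3). Raising an error
on an empty vector is a simplification of this sketch, not a claim
about what APL does.

```python
from functools import reduce


def apl_reduce(f, vector):
    """Right-to-left reduction: f(v0, f(v1, ... f(v[n-2], v[n-1])))."""
    items = list(vector)
    if not items:
        # Simplification: a real system would need an identity element here.
        raise ValueError("cannot reduce an empty vector in this sketch")
    # Start from the last element and walk the rest in reverse, keeping
    # each element on the LEFT of f so the grouping nests to the right.
    return reduce(lambda acc, x: f(x, acc), reversed(items[:-1]), items[-1])


def plus(a, b):
    """A user-defined dyadic function standing in for +."""
    return a + b


def minus(a, b):
    """A non-associative function, to make the grouping visible."""
    return a - b


total = apl_reduce(plus, [1, 2, 3, 4])      # 1+(2+(3+4))
alternating = apl_reduce(minus, [1, 2, 3])  # 1-(2-3)
```

Nothing in apl_reduce distinguishes a built-in operation from a
user-defined one, which is the uniformity described above.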


It's common to design APL workspaces that provide a set of new
"operations" appropriate to a given application area. APL users
usually think of these as language extensions. And why not?


On the other side, in mathematics it is indeed extremely common to
overload the meaning of operators. "+" may be an operation on reals,
integers, members of Z/n, matrices, members of some arbitrary ring,
field, vector space, or what have you. It may mean several of these
things in a single equation. One of the things that keeps
mathematicians sane is that there are agreed-upon standards about what
can and cannot be re-defined freely. Yes, "+" can be an operation on
almost any space - but it would be extremely unusual for it to be
anything but a commutative and associative operation, and there is
almost certainly an identity element, probably written as 0.
Multiplication - written as a centered dot - may also apply to all
kinds of objects. No matter what it's applied to, the multiplication
symbol can be elided. Multiplication - whatever it means for the
objects involved - can be taken to be associative; it is *not*
necessarily commutative. If there is a + operation, there is a
corresponding
\Sigma "reduction" operation; and if there is a multiplication
operation, there is a corresponding \Pi. \pi, on the other hand, if
used as a function, is often (but not always) a permutation - but in
any case it can be re-defined. \pi used as a variable is 3.14159... .
Blackboard bold Z, R, and C are the integers, reals, and complex
numbers, period - they can't be redefined at all (though they can be
modified in various ways, e.g., subscripts, superscripts, primes,
etc.)


None of these rules are likely to be written down anywhere, and any of
them *could* be violated if the needs of exposition were great enough.
But a mathematician who regularly ignores these rules will get a lot
of pressure from colleagues and publishers to change his ways.
Mathematics is hard enough that no one wants to waste time on some
totally odd-ball notation for no good reason.


One problem we have in the programming world is that there is little
effective pressure on programmers who want to be "creative". Everyone
will agree that overloading "+" to work on bignums is fine.
Overloading it to mean "zero the left argument and write the second to
standard output" would strike most programmers as bizarre.
Overloading it to mean "concatenate strings" is somewhere in between.
However, other than raving at the incompetence of a programmer who
would create that second definition, we take a black-or-white
approach: Those who want power and freedom allow arbitrary overloading;
those worried about the comprehensibility of the resulting code allow
none.


The intermediate step - semantic constraints on what you are allowed
to do - is rare. Ada, for example, allows you to override = (test for
equality) - but it then automatically overrides /= (is that the symbol
for "not-equal"?) to mean the logical complement. A *very* reasonable
restriction, in my opinion. We could extend the idea. Suppose you
could redefine binary + and unary minus, but the compiler was then
allowed to assume that your + was also commutative and associative,
that --a == a; and that a-b was equivalent to a+(-b); etc. Given
that, I can see little problem with over-loading "+" and "-". (Of
course, this would *not* allow "+" to mean string concatenation!)
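
A minimal sketch of the idea in Python, under stated assumptions: the
mixin DerivedArithmetic and the example type Mod7 are hypothetical
names invented here. The programmer supplies only + and unary minus;
binary minus is derived as a+(-b) rather than defined independently.
Python 3 already behaves much like the Ada rule for equality: a class
that defines __eq__ gets != as its logical complement by default.
Nothing here enforces commutativity or associativity; the sketch only
shows the derivation.

```python
class DerivedArithmetic:
    """Hypothetical mixin: subclasses supply __add__ and __neg__ only."""

    def __sub__(self, other):
        # a - b is derived as a + (-b). Subclasses are expected not to
        # override this, though Python itself would not stop them.
        return self + (-other)


class Mod7(DerivedArithmetic):
    """Integers modulo 7, used only as a small example type."""

    def __init__(self, value):
        self.value = value % 7

    def __add__(self, other):
        return Mod7(self.value + other.value)

    def __neg__(self):
        return Mod7(-self.value)

    def __eq__(self, other):
        return isinstance(other, Mod7) and self.value == other.value

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one.
        return hash(self.value)


a, b = Mod7(3), Mod7(5)
difference = a - b        # computed as a + (-b)
double_negation = -(-a)   # equals a for this type
not_equal = a != b        # derived from __eq__ by Python 3
```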


The problem is bigger than just operator symbol overloading. If class
A inherits from class B, and A then overrides the definition of a
function defined in B, it would be bizarre and error-prone if the
version in A did something inconsistent with the version in B. After
all, an A object is supposed to be a kind of B object, so the function
ought to be doing "the same thing" in some broad semantic sense, even
if the specific implementation has to be different. However, no
language I know of gives you any way of expressing the constraints
that matter. (Hmm, do over-ridden functions in Eiffel inherit the
pre- and post- conditions of their ancestors? That would be a good
start.)
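
One way to approximate inherited constraints, sketched in Python
(which has no built-in contracts): the base class keeps the public
method and its postcondition, and subclasses override only an
implementation hook. All names here (Stack, _do_push,
ContractViolation, and the two subclasses) are invented for the
illustration.

```python
class ContractViolation(Exception):
    """Raised when an override breaks the base class's postcondition."""


class Stack:
    """Base class: push() is fixed and carries the contract."""

    def __init__(self):
        self._items = []

    def size(self):
        return len(self._items)

    def push(self, item):
        before = self.size()
        self._do_push(item)  # the overridable implementation hook
        # Postcondition inherited by every subclass: a push stores
        # exactly one more element.
        if self.size() != before + 1:
            raise ContractViolation("push must add exactly one item")

    def _do_push(self, item):
        self._items.append(item)


class LoggingStack(Stack):
    """Consistent override: different implementation, same meaning."""

    def __init__(self):
        super().__init__()
        self.log = []

    def _do_push(self, item):
        self.log.append(item)
        super()._do_push(item)


class BrokenStack(Stack):
    """Inconsistent override: silently drops the item."""

    def _do_push(self, item):
        pass


good = LoggingStack()
good.push(42)

try:
    BrokenStack().push(42)
    violation_detected = False
except ContractViolation:
    violation_detected = True
```

The check runs only at run time and covers only what the base class
author thought to write down, so it captures "the same thing in some
broad semantic sense" only partially.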


The big problem with this kind of approach is that there is no way, in
general, for a compiler to check your assertion that "+" has the right
properties. Still, language definitions are always going to be full
of cases in which, if you don't follow the (uncheckable) rules, the
results are undefined. At least in this case you'd have unimpeachable
authority to cite in the language definition when complaining about
the guy who wants a+b to mean "zero a..."!
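
Although the properties cannot be proved in general, they can be
tested on sample values, which at least finds counterexamples. A small
Python sketch follows; the function name check_plus_laws and its
parameters are invented here. It tries commutativity, associativity,
and the identity law on every combination of the samples and reports
which laws failed.

```python
from itertools import product


def check_plus_laws(samples, zero):
    """Return the sorted names of the laws that fail on the samples."""
    failures = set()
    for a, b, c in product(samples, repeat=3):
        if a + b != b + a:
            failures.add("commutative")
        if (a + b) + c != a + (b + c):
            failures.add("associative")
    for a in samples:
        if a + zero != a or zero + a != a:
            failures.add("identity")
    return sorted(failures)


# Integer addition passes every law on these samples.
int_failures = check_plus_laws([-2, 0, 1, 7], zero=0)

# String concatenation is associative and has an identity (""),
# but it is not commutative.
str_failures = check_plus_laws(["a", "bc", ""], zero="")
```

Passing such a check proves nothing; failing it, as string
concatenation does for commutativity, settles the question.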
-- Jerry
[APL is indeed pretty consistent, but the last time I checked you couldn't
reduce using a user-defined function, e.g. no foo/blah where foo is a user
defined function. -John]



