Re: Definable operators

Jerry Leichter <>
6 Apr 1997 22:24:11 -0400

          From comp.compilers

From: Jerry Leichter <>
Newsgroups: comp.compilers
Date: 6 Apr 1997 22:24:11 -0400
Organization: System Management ARTS
References: 97-03-037 97-03-076 97-03-112 97-03-115 97-03-141 97-03-162 97-04-018
Keywords: syntax, design

| > People can cope with plus used to mean addition on numbers, even odd
| > kinds of numbers, or things very similar to numbers. But when plus
| > sometimes means addition, and sometimes means string concatenation
| > (a very different operation, despite some limited similarities),
| > trouble is likely.
| This is all fluff and waffle. Mathematicians often define addition
| via concatenation (remember the good old successor function to induce
| the integers?). They are the same basic operation - intuitively and
| mathematically....

Completely wrong - and a great example of the underlying issues.

No mathematician would, in mathematical usage, *ever* use + for string
concatenation, for a very simple reason: "+" and related symbols (+ in
a circle, capital sigma for repeated plus) are used for many different
operations in different contexts, but always (OK, almost always - I
can't think of an exception, but I'm sure *someone* will come up with
one!) for *commutative* operations.

Multiplication, on the other hand - written as x, possibly in a
circle; or as a centered dot; or as simple juxtaposition; with capital
pi for repeated "multiplication" - is often *not* commutative. It is,
however, pretty universally associative.

These conventions - which probably go back to matrix algebra - make it
possible to read unfamiliar papers and manipulate their notation with
some degree of safety. There are similar conventions for things like
relation symbols: Something that looks like >, perhaps > with a dot
in the middle, or > with wiggly lines, is transitive and
antisymmetric. Adding an extra bar underneath adds "or equal" to the
meaning. "=" and its variations - wavy lines, dot on top, triple -
denote an equivalence relation.

String concatenation is like multiplication, not addition. In fact,
sets of strings under concatenation form a common structure, the
semigroup; semigroup operations are often written as centered dots.
Concatenation is most widely written as either a centered dot, or as
simple juxtaposition. So the Algol68 standard definition of + for
string concatenation mentioned elsewhere was a mistake.
(Interestingly, the other Algol68 convention - n*s meaning n copies of
the string s - is somewhat better founded: It's common to think of the
integers "acting on" some arbitrary structure. If there's an additive
operation, using n*o for o+o+...+o - n copies - is pretty natural and
reasonably widely used. On the other hand, no one would write o*n!)
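
The algebraic point above can be spot-checked in a few lines. The sketch below is only an illustration, not part of the original argument: the helper names (is_commutative, is_associative, concat, add, repeat) and the sample values are made up for the purpose, and checking a handful of samples is evidence, not a proof. Python happens to spell concatenation as +, which is the very convention argued against here; the sketch uses it only to exhibit the properties of the operation.

```python
from itertools import product


def is_commutative(op, samples):
    """True if op(a, b) == op(b, a) for every pair drawn from samples."""
    return all(op(a, b) == op(b, a) for a, b in product(samples, repeat=2))


def is_associative(op, samples):
    """True if op(op(a, b), c) == op(a, op(b, c)) for every triple."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(samples, repeat=3)
    )


def concat(a, b):
    """String concatenation (spelled + in Python)."""
    return a + b


def add(a, b):
    """Ordinary integer addition."""
    return a + b


def repeat(n, s):
    """n copies of s, built by repeated concatenation (the n*s convention)."""
    result = ""
    for _ in range(n):
        result = concat(result, s)
    return result


# Hypothetical sample values; any small set containing distinct strings works.
strings = ["", "a", "b", "ab"]
integers = [0, 1, 2, -3]

concat_commutes = is_commutative(concat, strings)    # fails: "a"+"b" vs "b"+"a"
concat_associates = is_associative(concat, strings)  # holds on these samples
add_commutes = is_commutative(add, integers)         # holds on these samples
add_associates = is_associative(add, integers)       # holds on these samples
```

On these samples concatenation behaves like the "multiplication" family (associative, not commutative), while integer addition has both properties.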

If people followed similar kinds of conventions in defining new
operations in programming languages, there would be much less of a
problem. If compilers could *enforce* such conventions - something
that is, unfortunately, generally impossible to do - operator
re-definition would be innocuous and often useful. But what really
happens is that the texts give simple examples that *do* follow the
rules - e.g., extending the arithmetic operations to a "complex" class
- and then some programmers go wild, defining operators everywhere.
Given the small set of available operation symbols, many of the
re-definitions are certain to cause readers of the programs headaches.

It seems to me that there are two classes of "good" uses for operator
re-definition:

1. For operators with widely understood basic semantics,
extension to new domains in which those semantics
make sense, and are followed. Examples here are
almost all mathematical, and include things like
complex and matrix arithmetic.

2. For operators that either have no widely-used meaning at
all, or have a limited meaning, just about any
*self-consistent* usage is OK. The use of << and >>
for I/O in C++ is a good example: << and >> are
rare in mathematics ("much less than/greater than"),
and that meaning is unlikely to be significant in
programming usage. Shifting isn't all that common
an operation in C or C++, and has no properties (like
commutativity or associativity) that anyone thinks
about. So giving these operators entirely new meanings
makes perfect sense.

On the other hand, the very fact that << and >> are
so widely used in C++ removes them from the set of
operators that could reasonably be freely defined.
A programmer who defines << and >> to have any meaning
other than string I/O can expect the (justified) wrath
of those who later read his code (except, possibly, if
he defines them to be shift operations on some new kind
of numeric datatype).
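
As a rough illustration of re-definitions that stay within these rules, here is a minimal sketch in Python. Both classes (Vec2 and UInt8) are hypothetical and invented for this example. Vec2 extends + to a domain where it remains commutative and associative, in the spirit of class 1; UInt8 defines << and >> as shifts on a new numeric datatype, the one exception allowed in the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec2:
    """Hypothetical 2-D vector: + keeps its usual algebraic properties."""
    x: float
    y: float

    def __add__(self, other):
        if not isinstance(other, Vec2):
            return NotImplemented
        # Component-wise addition is commutative and associative.
        return Vec2(self.x + other.x, self.y + other.y)


@dataclass(frozen=True)
class UInt8:
    """Hypothetical 8-bit unsigned integer: << and >> remain shifts."""
    value: int

    def __post_init__(self):
        # Keep only the low 8 bits; the dataclass is frozen, so the
        # field has to be set through object.__setattr__.
        object.__setattr__(self, "value", self.value & 0xFF)

    def __lshift__(self, n):
        if not isinstance(n, int):
            return NotImplemented
        return UInt8(self.value << n)

    def __rshift__(self, n):
        if not isinstance(n, int):
            return NotImplemented
        return UInt8(self.value >> n)


a, b = Vec2(1, 2), Vec2(3, 4)
vec_sum = a + b                   # Vec2(x=4, y=6)
shifted_left = UInt8(0x81) << 1   # high bit falls off the 8-bit value
shifted_right = UInt8(0x81) >> 1
```

A reader who knows what + means on numbers and what << means on machine integers can guess the behaviour of both classes without consulting their definitions, which is the property the two classes of "good" uses are meant to preserve.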

-- Jerry
