Re: User definable operators


Related articles
[16 earlier articles]
Re: User definable operators leichter@smarts.com (Jerry Leichter) (1996-12-27)
Re: User definable operators genew@mindlink.bc.ca (1996-12-28)
Re: User definable operators WStreett@shell.monmouth.com (1996-12-29)
Re: User definable operators adrian@dcs.rhbnc.ac.uk (1997-01-02)
Re: User definable operators hrubin@stat.purdue.edu (1997-01-02)
Re: User definable operators anw@maths.nottingham.ac.uk (Dr A. N. Walker) (1997-01-03)
Re: User definable operators WStreett@shell.monmouth.com (1997-01-03)
Re: User definable operators apardon@rc4.vub.ac.be (1997-01-07)
Re: User definable operators icedancer@ibm.net (1997-01-07)
Re: User definable operators wclodius@lanl.gov (William Clodius) (1997-01-09)

From: WStreett@shell.monmouth.com (Wilbur Streett)
Newsgroups: comp.compilers
Date: 3 Jan 1997 23:12:25 -0500
Organization: Monmouth Internet
References: 96-12-088 96-12-163 96-12-181 96-12-185 97-01-015
Keywords: syntax

hrubin@stat.purdue.edu (Herman Rubin) wrote:


>Wilbur Streett <WStreett@shell.monmouth.com> wrote:
>>>>Notation has to be overloaded to be of reasonable length.
>
>>So what is more important is that the notation is of "reasonable"
>>length than it follows generally accepted and defined abstractions?
>
>Whose generally accepted and defined abstractions?


It's not a question of who; it's a question of expectations. In the
context of this discussion, there are two reasons to write software:
to get a computer to perform a specific task, and to communicate the
process to the person who is going to maintain or control the
software after it's written. But to answer the question: the guy who
makes up the game defines the rules. If you want to play the game,
you have to play by the rules.


Given that a programming language is an abstraction, it will always
have some level of abstraction, which will be implemented in its
structure. What is more important is that this abstraction can be
communicated to another human being easily, in understandable chunks.


When notation is overridden, the fundamental structure we use becomes
something that can't be taken for granted. Interlacing program
structure with changes in notation that require religious arguments as
to what the notation actually means does not further program
readability. I personally write software so that a programmer with six
months' experience can read it. That means that I can read it a few
years from now when I'm in a hurry and don't have the time to rebuild
all of the notation in my head.


> There was no reluctance on the part of the computer language people
> to take quite standard mathematical symbols, sometimes even
> overloaded in mathematics, and use them with totally different
> meanings, and the mathematical meanings were made unusable. I can
> come up with more than a dozen such. At least the originators of
> Fortran apologized for their restrictions, and the use of * and **,
> because of the capabilities of their hardware.


I don't like that either. But part of learning a new language is
learning the conventions. I think that reading a piece of software is
like learning a new language. I hope that a program is more like a
person speaking than a whole new language. With notation overloading,
it's a new language with every line, and the same symbol on two
different pages may mean two very different things.


> Also, the mathematical conventions evolved. Anyone could put any
> notation in his papers, WITH EXPLANATION, and some of them stuck.
> As for his own use, why should anyone care? To introduce them in a
> program would require telling the compiler what they mean, and I
> doubt that the compiler would care.


Code typically isn't just read by a compiler, and the fact that the
code arrived at the right answer once doesn't mean that code is useful.
Of course, if you aren't writing for anyone other than yourself,
that's different again. But I've discovered from experience that code
that I doodle together typically ends up being code that I want to use
somewhere, and I end up having to go back and document it and make it
robust. With that in mind, have you ever tried to read "doodled"
code? The conventions are a personal issue, but the intent is to make
the code readable by another.


>>Suppose for a minute that I did the same with the English language?
>>For the sake of demonstration I decided to change what each of the
>>words in the previous sentence mean. Then you have to resort to the
>>more extended reference to determine what the previous sentence means,
>>because you have to check to be sure if the notation is the expected
>>notation or the new extended notation. That means that the notation
>>is NOT of reasonable length, but that you have to be sure to
>>understand all of the supporting notation (which is not in front of
>>you) in order to be able to understand the notation in front of you.
>
>If one starts a problem by saying "let x be ...", etc., this can be
>referred to WHEN NEEDED, but the logical structure understood. The
>same holds for operators. But in many cases, these "new" operators
>are the standard ones in mathematics, which the user already knows, or
>are such simple things as abbreviations for "pack" and "unpack".


And in many cases these operators aren't the standard ones in
mathematics. Often people simply overload the functionality
without thinking about the side effects in terms of readability,
functional integration with the whole, etc. The example of
overloading "+" when working with strings comes to mind. Check out
the "+" bug in JavaScript for an example of what can happen when the
use of the operators is not specifically defined.
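C++ has an analogous trap with the same symbol (this is my own
illustration, not the JavaScript case itself): with bare string
literals, + is pointer arithmetic, not concatenation, and only an
std::string overload somewhere in scope changes that.

// My own C++ illustration (not the JavaScript case): the same + is
// concatenation or pointer arithmetic depending on the operand types.
#include <iostream>
#include <string>

int main() {
    std::string a = std::string("foo") + "bar";  // overload: "foobar"
    const char* b = "foobar" + 3;                // pointer arithmetic: "bar"
    std::cout << a << " " << b << "\n";          // prints: foobar bar
    return 0;
}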


>The compiler has to convert things to machine language, anyhow, and
>the user should be able to get at that, and not have to use the
>assembler language, designed with the idea of making it difficult to
>for a person to use. I would have little problem with doing that,
>myself. A versatile macro expander, with the macro structure being up
>to the user, and using weak typing, would go a long way here and
>elsewhere. This would be a totally non-optimizing mini-compiler for a
>language with few constants.


Assembler was designed to make it easy for a person to use. The lack
of standardization is what has made high-level languages popular. I
think that you are defining C in the rest of the above.
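For what it's worth, here is a rough sketch of that "macro expander"
idea in C terms (the PACK/UNPACK names are made up, echoing the
"pack"/"unpack" abbreviations quoted earlier in the thread):

/* A rough sketch of the "versatile macro expander": the user invents
   notation by textual expansion, with no type checking at all. */
#include <cstdio>

#define PACK(hi, lo)  (((hi) << 8) | (lo))   /* purely textual, weakly typed */
#define UNPACK_HI(w)  (((w) >> 8) & 0xff)
#define UNPACK_LO(w)  ((w) & 0xff)

int main() {
    int w = PACK(0x12, 0x34);
    std::printf("%x %x %x\n", w, UNPACK_HI(w), UNPACK_LO(w));  /* 1234 12 34 */
    return 0;
}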


>>> By the time one reads the lengthy variable names which seem to
>>> delight the computer people, the structure of the expression is
>>> lost.
>
>>So long words confuse you? I don't like them either. But it's not
>>the length of the words that make a program structure readable or
>>unreadable.
>
>I suspect that most mathematicians would disagree highly. We start
>teaching the use of short symbolic formulation early in algebra, and
>so this is not a problem. We can then look at the structure of the
>statements without worrying about the usually irrelevant meaning of
>the variables involved.


But that's in Mathematics. There isn't much math as complicated as
computer systems. I also suspect that mathematicians wouldn't
disagree with clearly stating base assumptions and working from there
with existing notation. After all, that's what Einstein did.


>>> It is necessary to let the user invent notation, if necessary, and
>>> for the language and compiler to help, not hinder.
>
>>The user can invent notation in most computer languages. The question
>>is not whether or not they can invent notation, but in what fashion
>>they will be allowed to invent that notation and what safeguards there
>>are in the language design to insure that they are clearly documented
>>as being invented notations as opposed to intrinsic ones.
>
>A few languages allow the introduction of new types, but do not allow
>them to be called types. There were older languages which did. And
>introducing functions is not the same as introducing operators.
>Fortran compilers did optimizations at compile time for the power
>operator which become far more expensive if attempted at run time; the
>compiler is highly branched already, as it has to parse. Also,
>functions must be in prenex form, while operators can be in any order.


Mathematics is very order-specific. As I see it, the problem with
this approach in software is one of stacking the definitions of types.
Four or five levels of abstraction stacked on top of each other in a
computer program don't lead to code that can be understood and
debugged. I'd also argue that that sort of stacking of abstraction is
unnecessary in most cases. The argument is that with types you can
define more abstraction, but the problem with most of the code that
I've seen isn't that it doesn't have enough abstraction, but that it
has too much inappropriate abstraction.


Taking the idea of "+" again, how do you know whether the notation has
been overloaded, and what the implementation's abstractions are?
There isn't any guarantee that an expression written for one type is
going to run through the same code as another line that appears
exactly the same. What follows is that you can have two lines of code
that "appear" to be exactly the same, but have intermediate layers of
complexity that can't be identified directly.


A + B when the variables are numbers means add them together
A + B when the variables are strings means concatenate the strings


What does A + B mean when A was a string, but B is now a number?


Where was the type of A and B defined? Is it within a single screen?
A single module? A single program? Several programs? Were A and B
used in different ways in different places? Are the types defined
numbers or strings or something else?
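In C++ terms (a sketch of my own, with a made-up overload; it isn't
meant as anyone's real code), the answer is "whatever overload happens
to be in scope", which may be declared on another screen, in another
module, or in a header you've never read:

// A sketch of my own: the meaning of A + B is decided by an overload
// that may be declared far away from either line that uses it.
#include <iostream>
#include <string>

// Hypothetical overload someone added elsewhere: string + number
// now means "append the decimal digits".
std::string operator+(const std::string& s, int n) {
    return s + std::to_string(n);
}

void f(const std::string& A, int B) {
    std::cout << A + B << "\n";   // the overload above: prints price42
}

void g(int A, int B) {
    std::cout << A + B << "\n";   // built-in arithmetic: prints 47
}

int main() {
    f("price", 42);   // textually identical expression A + B ...
    g(5, 42);         // ... runs through entirely different code
    return 0;
}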


I don't believe in overloading VARIABLE NAMES within a program. I
certainly would not agree with overloading other notation.


Wilbur
--

