Re: Grammars for future languages

Newsgroups: comp.compilers
From: wdavis@dw3f.ess.harris.com (Bill Davis)
Keywords: syntax, design
Organization: Harris GISD
References: 95-10-103
Date: Thu, 26 Oct 1995 18:15:58 GMT

schinz@guano.alphanet.ch (Michel Schinz) writes:
|>The grammars of the majority of today's programming languages (C[++],
|>Ada, Pascal, Eiffel, Dylan, etc.) are Algol-like. By this, I mean that
|>they have a different syntax for almost every concept and special
|>support for arithmetic operators.


I agree about the different syntax, but I would extend your statement
by removing the word "arithmetic". That is, there is special support
for operators, arithmetic or otherwise. The ANSI C rule that
concatenates adjacent string literals, in effect turning mere
juxtaposition into an operator, is another good example of a
special-case operator.
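
For instance (a minimal sketch; this fragment compiles as ANSI C or C++):

        /* Adjacent string literals are merged at compile time;  */
        /* no '+' and no function call appears in the source.    */
        const char *msg = "Hello, "
                          "world";   /* same as "Hello, world" */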


|>However, there are (at least) two main exceptions: Lisp-like grammars
|>and Self/Smalltalk-like grammars.


I have not looked closely at the language, but I understand that ML is
a functional language whose grammar is unlike Lisp's. So there may be
more exceptions than those listed above or below.


Among the procedural languages there are at least Logo, Forth, and APL,
which do not follow your pattern above. Forth is a postfix language
without abstract types. APL puts all operators and functions on a
rather equal footing. Logo is a prefix language.


Also, there is Prolog, which is non-procedural.


|>Algol-like grammars are believed to be easier to understand and closer
|>to the usual (mathematic) notations and english. On the other hand,
|>they have problems: they are big (hard to learn and remember) and the
|>operator/function-call distinction is a big problem. For example, in
|>C++ you can overload existing operators but you cannot define new
|>ones. In Eiffel, you can define new operators, but you cannot define
|>their priority and associativity.


In languages without overloading, some operators are special cases,
but the use of infix arithmetic operators is obvious. The infix
notation is natural because that is what we are taught in school (at
least in the USA). You need to mentally adjust from 'x' to '*' for
multiplication, but that is not very hard. It would be better not
to have to adjust, but using 'x' as an operator symbol is not very
desirable either.


When you move beyond the basic four operators (+, -, *, /) there are
more chances for problems, because no one intuitively thinks that the
caret (^) stands for anything until they are taught a meaning.
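
Indeed, the taught meaning differs from language to language. In C and
C++, for example, the caret is bitwise exclusive-or, not exponentiation:

        int x = 2 ^ 3;   // bitwise XOR: x is 1, not 8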


So, IMO, the distinction between function call and operator, and the
restriction on operator overloading, are due more to language design
mistakes than to anything inherent. I have tried to design a language
with fully definable operators (including varying precedence) and
found that allowing both overloading and definable precedence on all
operators is inherently ambiguous. Also, letting the programmer define
precedence is dangerous, because you lose the intuitive reading of an
infix expression such as
          a + b * c + d
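
C++ itself shows the fixed half of this trade-off. A minimal sketch
(the Num type here is invented for the example): you may overload an
existing operator, but its precedence and associativity remain those of
the built-in token, so the grouping of a + b * c + d never changes:

        #include <iostream>

        struct Num { int v; };

        // Overloaded meanings, but the grammar still fixes the
        // precedence: '*' binds tighter than '+', always.
        Num operator+(Num a, Num b) { return Num{a.v + b.v}; }
        Num operator*(Num a, Num b) { return Num{a.v * b.v}; }

        int main() {
            Num a{1}, b{2}, c{3}, d{4};
            Num r = a + b * c + d;     // parses as a + (b * c) + d
            std::cout << r.v << "\n";  // prints 11
            return 0;
        }

Eiffel, as noted in the quoted text, sits at the opposite corner: new
operators are allowed, but their precedence is not definable.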


|>On the other hand, Lisp-like and Smalltalk-like grammars are very
|>simple: only one or two notations are used for everything. However,
|>the notations used for arithmetic operations do not conform to the
|>mathematical notation.


True. Also, the lack of precedence among operators in Smalltalk is
worse than having no operators. Smalltalk gives all binary messages
the same precedence and evaluates them left to right, so the meaning of
          (a add: (b times: c)) add: d
is clear, but the meaning of
          a + b * c + d
is ((a + b) * c) + d rather than the obvious
          a + ( b * c ) + d
as it should be. This is a case where the use of operators becomes
a special case again, because it does not match "normal" expectations.


|>For example, many people think that this (Ada) statement:
|>
|>    if a=2 then
|>        Put_Line("A = 2");
|>    else
|>        Put_Line("A /= 2");
|>    end if;
|>
|>is easier to understand than this (Self) one:
|>
|>    a=2 ifTrue: [ stdout write: 'A = 2' ]
|>         False: [ stdout write: 'A /= 2' ].
|>
|>or this (Lisp) one:
|>
|>    (if (= a 2)
|>        (write-line "A = 2")
|>        (write-line "A /= 2"))


When looking at languages, there is more to consider than one piece of
isolated code. Also, consider that the "if" in Lisp is not the basic
conditional but was added later as an easier interface than cond.
This in itself is a clear admission that cond in Lisp is harder
to use and understand than if. The argument is not mitigated by
the fact that many people can read a cond as easily as they can read
an if statement. Many people can also read multiple human languages,
but that doesn't mean we want to try to teach everyone many different
foreign languages.


|>My claim is that this may be true for people who already know a
|>"classical" programming language (Pascal, Basic, etc.) but I do not
|>think that this is true for complete beginners, who certainly do not
|>understand any of them. I even think that complete beginners will
|>understand simple (i.e. Lisp- or Smalltalk-like) grammars quicker,
|>precisely because of their simplicity. To quote the "candygrammar"
|>entry in the Jargon File 3.2.0:
|>
|> This intention comes to grief on the reality that syntax
|> isn't what makes programming hard; it's the mental
|> effort and organization required to specify an algorithm
|> precisely that costs.


But the claim of "easier understanding" is itself already a claim
about syntax. The example languages above all give different syntax
for the same semantics. Thus, syntax is an important part of the
mental effort.


|>For example, I remember clearly that when I learned my first
|>"programming language" (Commodore 64 Basic :-), I had troubles
|>understanding the concepts, not the syntax.


But unless you understand both, there will be problems. A language
with easy concepts but complex syntax, such as templates in C++, can
still cause trouble. And a language with simple syntax but complex
semantics, such as automatic conversion in C++, will also cause
problems.
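
A small sketch of the automatic-conversion point (the Celsius type and
report function are invented for the example): the syntax at the call
site could not be simpler, yet the semantics quietly involve a
user-defined conversion the reader never sees:

        #include <iostream>

        struct Celsius {
            double degrees;
            // A single-argument constructor doubles as an implicit
            // conversion unless it is declared 'explicit'.
            Celsius(double d) : degrees(d) {}
        };

        void report(Celsius c) {
            std::cout << c.degrees << " C\n";
        }

        int main() {
            report(Celsius(21.5));  // intended use
            report(42);             // also compiles: 42 silently
                                    // becomes Celsius(42.0)
            return 0;
        }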


|>Also, even if being close to the mathematical notation was once very
|>important, because the vast majority of programs used mathematics a
|>lot, this isn't true anymore. Ok, there are still a lot of
|>mathematical programs, but there is also a wide range of computer
|>applications which simply do not need a special notation for
|>arithmetic operations (compilers are an example).


I am working on language design as a long-term area of study.
There are some expressions that are easy to write with infix
notation but become much less clear in other notations.
Compare the infix version:
          a [ i + 1 ] = a [ i ] + 1
to a Smalltalk-style language without operators:
          a at: ( i copy add: 1) put: ( ( a at: i ) add: 1)


One thing the above example points out is that infix operators have
hidden temporaries, which are a source of expressive power. A badly
designed language that does not guarantee the behavior and lifetime
of temporaries can cause problems. Look at the ARM (the Annotated
C++ Reference Manual) for a good example of these problems.
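
To make the hidden temporaries concrete, here is roughly what the
infix line expands to, followed by a classic C++ lifetime trap of the
kind the ARM warns about (a sketch, not a quotation from the ARM):

        #include <string>

        int a[10];
        int i = 3;

        void desugared() {
            // a[i + 1] = a[i] + 1, with its temporaries made explicit:
            int t1 = i + 1;      // temporary index
            int t2 = a[i] + 1;   // temporary value
            a[t1] = t2;
        }

        const char *dangling(const std::string &s1, const std::string &s2) {
            // (s1 + s2) is a temporary destroyed at the end of the full
            // expression, so the returned pointer is already dangling.
            return (s1 + s2).c_str();
        }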


|>I therefore think that grammars for new languages should not be
|>Algol-like but Lisp- or Smalltalk-like (or anything similar).


I tend to agree, but note that my pure Smalltalk-ish example above
may not be the best approach either. I am thinking that a combined
approach may be best, but there are obviously problems to work out
with such a combination.


|>I think that this issue is an important one, because if all new
|>languages are designed to have a simple grammar, parsing could slowly
|>become much easier, and its importance in compilation would decrease.


I explore language issues because languages are the way that we think
rationally. Ignoring emotions, feelings, and other non-verbal
"thinking", we are constrained to what we can express in language.
Thus, I think a good language will make it easy to express good
programs and may make it harder to express bad programs. This would
be a good contribution to software quality.


--
Bill Davis
wdavis@dw3f.ess.harris.com
