Re: Grammars for future languages

schinz@guano.alphanet.ch (Michel Schinz)
Sun, 5 Nov 1995 17:26:20 GMT

From comp.compilers

Related articles
[2 earlier articles]
Re: Grammars for future languages mfinney@inmind.com (1995-10-24)
Re: Grammars for future languages RWARD@math.otago.ac.nz (Roy Ward) (1995-10-26)
Re: Grammars for future languages wdavis@dw3f.ess.harris.com (1995-10-26)
Re: Grammars for future languages timd@Starbase.NeoSoft.COM (1995-10-30)
Re: Grammars for future languages jaidi@technet.sg (1995-11-09)
Re: Grammars for future languages martelli@cadlab.it (1995-11-04)
*Re: Grammars for future languages schinz@guano.alphanet.ch* (1995-11-05)**
Re: Grammars for future languages ECE@dwaf-hri.pwv.gov.za (John Carter) (1995-11-07)
Re: Grammars for future languages mbbad@s-crim1.daresbury.ac.uk (1995-11-08)
Re: Grammars for future languages szilagyi@szilagyi.mit.edu (1995-11-09)
Re: Grammars for future languages davids@ICSI.Berkeley.EDU (1995-11-10)
Re: Grammars for future languages macrakis@osf.org (1995-11-10)
Re: Grammars for future languages mfinney@inmind.com (1995-11-12)
[5 later articles]

| List of all articles for this month |

Newsgroups:	comp.compilers
From:	schinz@guano.alphanet.ch (Michel Schinz)
Keywords:	parse, design
Organization:	Compilers Central
References:	95-10-131
Date:	Sun, 5 Nov 1995 17:26:20 GMT

Michael Lee Finney wrote:
: Exactly. And the better the notation you have available (which
: generally means a more complex syntax) the easier is to reason
: about the algorithm and the easier it is to specify.
[...]
: An example is Roman Numerals (bad <g>) and Arabic Numerals
: (good <g>). It is only recently that managable algorithms have
: been developed for multiplication and division using Roman
: Numerals. Precisely because they make it hard to to think about
: the problem. Addition is difficult enough.

I completely agree. But I think that your example shows precisely that
simple grammars should be used: "Arabic" Numerals (Indian Numerals in
fact, but that's another question) use a very simple "grammar", where
every sequence of digits is a valid and easily "understandable"
number. That's not the case with Roman Numerals, where sequences like
"IIIIIIII" are invalid.

The grammar for "Arabic" Numerals can for example be described by a
very simple regular expression: [0123456789]+. On the other hand, I
don't think that such an expression could easily be found for Roman
Numerals.

[...]
: While most programs do not use mathematics directly, they still use
: mathematics and logic indirectly. Consider the common task of
: negating the logic value of an expression. Also, as long as arithmetic
: and other operators are useful (as in C), you will find that a notation
: derived from mathematics is still the best for thinking. Even removing
: precedence (at least for common operators) can hurt severely.

I agree that there is a lot of "indirect mathematics" in most
programs, but does this really mean that a *special* notation should
be available? Most of the mathematics used in usual programs involve
only one operator at the time, so I really do not think that all the
operator stuff should be there only for that.

For example, incrementing a variable is a very common task in
programming. But is is really *that* hard to write the following
CommonLisp statement:

    (incf i)

or even (to avoid the "incf" macro):

    (setq i (+ i 1))

instead of the following Ada statement:

    i := i+1;

I don't think so... And most of the "indirect mathematics" used in
usual programs isn't much more complicated than this. So I really do
not think that the whole operator stuff (which has many inconvenients,
as I pointed in my original message) is needed in general.

[...]
: But it is not the only way that can be accomplished. I am working
: on a replacement for C++ which will compile C++ programs, but
: which do NOT have any of the built-in operators or data elements.
: I have found ways to allow the operators and their precedence
: and the "type" of the operator (prefix, infix, postfix, multifix) to
: be specified by the programmer. I have found ways to tie literals
: to user-defined classes. I have found ways to tie old-style data
: type declarations to user-defined classes. And it isn't really that
: hard.

I'm not sure that this kind of flexibility is desirable... Won't that
end in programs overloaded with operators (each with its own
priority), only understanble by people who know perfectly the
different priorities? I'm sorry to quote the Jargon File again, but
the following really shows what I mean:

    :precedence lossage: /pre's*-dens los'*j/ n. [C
          programmers] Coding error in an expression due to
          unexpected grouping of arithmetic or logical operators
          by the compiler. Used esp. of certain common coding
          errors in C due to the nonintuitively low precedence
          levels of `&', `|', `^', `<<', and `>>' (for this
          reason, experienced C programmers deliberately forget
          the language's {baroque} precedence hierarchy and
          parenthesize defensively). Can always be avoided by
          suitable use of parentheses. {LISP} fans enjoy pointing
          out that this can't happen in *their* favorite language,
          which eschews precedence entirely, requiring one to use
          explicit parentheses everywhere.

This is a problem which appears in all languages "overloaded" with
operators (C is a good example, but Perl, with its 24 priorities, is
still my favourite). Basically, my question is: do you really think
that the few advantages of operators are worth the inconvenients?

: What is hard is defining new control structures in a seamless manner.
: I know more than 20 control structures and more than 30 associated
: modifiers. This is the basis for my reasoning when I program. Could
: all of those be available in a language using only a simple syntax?
: I believe that using an orthogonal syntax the answer is yes. But not
: a small syntax (just because of the number of keywords).

A great number of keywords does not imply a complex grammar. For
example, in CommonLisp, symbols can be manipulated easily. That means
that complex macros with a lot of keywords can easily be added without
modifying the grammar. The "loop" macro is a very good example
(although I do not like it personnaly, but that's another problem).

: >I think that this issue is an important one, because if all new
: >languages are designed to have a simple grammar, parsing could slowly
: >become much easier, and its importance in compilation would decrease.

: The effort required to parse the language by the compiler (and,
: generally the amount of work needed by the compiler writer) is not
: really a relevant issue during language design.
[...]

Ok, but this is not exactly what I meant: it is clear that with
today's powerful tools, parsing is "easy". However, a complex grammar
will definitely make the compiler bigger and slower, compared to a
simple grammar.

Michel.
--

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Grammars for future languages

schinz@guano.alphanet.ch (Michel Schinz)Sun, 5 Nov 1995 17:26:20 GMT

schinz@guano.alphanet.ch (Michel Schinz)
Sun, 5 Nov 1995 17:26:20 GMT