Re: Grammars for future languages

Newsgroups: comp.compilers
From: mfinney@inmind.com
Keywords: design
Organization: In Mind, Inc.
References: 95-10-131 95-11-055
Date: Sun, 12 Nov 1995 03:09:31 GMT

schinz@guano.alphanet.ch (Michel Schinz) writes:
>Michael Lee Finney wrote:
>I completely agree. But I think that your example shows precisely that
>simple grammars should be used: "Arabic" Numerals (Indian Numerals in
>fact, but that's another question) use a very simple "grammar", where
>every sequence of digits is a valid and easily "understandable"
>number. That's not the case with Roman Numerals, where sequences like
>"IIIIIIII" are invalid.


No, what it shows is that the simplest grammar for the problem
should be used, as long as that grammar is sufficiently expressive.
The Arabic and Roman numeral systems both express the same
concept. But many times (if not most), the only way to obtain
a simpler grammar is to restrict its expressiveness.
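
To make the contrast concrete, here is a rough sketch of the two
grammars (my own notation, not from either original post). In the
decimal grammar every digit string is a valid numeral; the Roman
grammar needs position-dependent rules just to exclude strings
like "IIIIIIII":

      decimal   ::= digit { digit }
      digit     ::= "0" | "1" | ... | "9"

      roman     ::= thousands hundreds tens units
      thousands ::= "M" | "MM" | "MMM" | (empty)
      hundreds  ::= "C" | "CC" | "CCC" | "CD" | "D"
                  | "DC" | "DCC" | "DCCC" | "CM" | (empty)
      (tens and units follow the same pattern with X/L/C and I/V/X)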


>: While most programs do not use mathematics directly, they still use
>: mathematics and logic indirectly. Consider the common task of
>: negating the logic value of an expression. Also, as long as arithmetic
>: and other operators are useful (as in C), you will find that a notation
>: derived from mathematics is still the best for thinking. Even removing
>: precedence (at least for common operators) can hurt severely.


A point that I have argued in support of in other threads.


>I agree that there is a lot of "indirect mathematics" in most
>programs, but does this really mean that a *special* notation should
>be available? Most of the mathematics used in usual programs involves
>only one operator at a time, so I really do not think that all the
>operator stuff should be there only for that.


Operators in a language are useful FAR beyond the immediately
"mathematical" ones (although really, all of the operators are
mathematical if you go deep enough). And while the expressions
involved are typically simple, even so they read better in
operator notation.


Basically, syntactic "sugar" is as essential for thinking as other
forms of sugar are for the brain.


>For example, incrementing a variable is a very common task in
>programming. But is it really *that* hard to write the following
>CommonLisp statement:
>
> (incf i)
>
>or even (to avoid the "incf" macro):
>
> (setq i (+ i 1))
>
>instead of the following Ada statement:
>
> i := i+1;


No, but it is a lot easier to simply write either...


      ++i


from C/C++, or


      i+1


from Easytrieve. And from a readability standpoint, the operator
notation is an order of magnitude easier to read and understand.


Yes, it can be abused (as in APL), but generally operator notation
is far better.


>I don't think so... And most of the "indirect mathematics" used in
>usual programs isn't much more complicated than this. So I really do
>not think that the whole operator stuff (which has many inconveniences,
>as I pointed out in my original message) is needed in general.


I wouldn't do without it, and if I have a choice, I will not use a language
without operators and precedence. I HAVE used such languages in
the past, and I know from personal experience that this is NOT, for
me, the way to go.


>I'm not sure that this kind of flexibility is desirable...


The reason for the flexibility is NOT to provide more operators, although
that is a nice benefit, but to make C++ a "true" OO language without
the "baggage" that it got from C. If I can strip all of the C stuff out,
but then provide a systematic way to build it back up within the
language, then C++ will be as OO as any other OO language. Right now,
C++ is a poor step-cousin of the OO languages because of the C baggage.


>Won't that
>end in programs overloaded with operators (each with its own
>priority), only understandable by people who know all the
>different priorities perfectly?


I think that it is possible to consider the "expected" priorities
between operators and arrive at a consistent scheme which
does not have the problems of C/C++. The problem there is
that the wrong priorities were assigned to some operators,
so people don't get the EXPECTED result -- and then they say
that operator precedence has too many levels or is bad
in general. Don't confuse bad design with basic principles.
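
The canonical example is C giving "==" a higher precedence than "&"
(a historical accident: && and || were added to the language after
& and |, and the old precedence was kept so that existing code would
still compile). A small sketch, with FLAG as a made-up constant:

      #define FLAG 0x04              /* hypothetical flag bit */

      int flagClear(int x)
      {
          return x & FLAG == 0;      /* parsed as x & (FLAG == 0): always 0 */
      }

      int flagClearFixed(int x)
      {
          return (x & FLAG) == 0;    /* the grouping everyone expects */
      }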


But yes, if you have enough operators and enough levels of precedence
it can be difficult to remember all of the levels. But in many
cases, extra levels are created which are not needed. For example,
in many places you find that the "and" operator has a higher
precedence than the "or" operator. But if you look through various
books and papers you will find that some people reverse the two.


Since these are duals, there is no compelling reason at all to give
them different precedence. Of course, "not" should bind tighter.
And if you consider the logical "and" / "or" operators (instead of
the relational operators), there is no reason that shift, rotate, etc.
should be at any different precedence level.
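
Shift is another place where C's levels defy the "expected"
priorities: "+" binds tighter than "<<", so an innocent-looking
expression quietly regroups itself:

      int n = 1 << 2 + 3;    /* parsed as 1 << (2 + 3), which is 32 */
      int m = (1 << 2) + 3;  /* the grouping many readers expect: 7 */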


>: The effort required to parse the language by the compiler (and,
>: generally the amount of work needed by the compiler writer) is not
>: really a relevant issue during language design.
>[...]
>
>Ok, but this is not exactly what I meant: it is clear that with
>today's powerful tools, parsing is "easy". However, a complex grammar
>will definitely make the compiler bigger and slower, compared to a
>simple grammar.


Not true. If the grammar is reasonably "regular" then the number of
syntax productions has no essential effect on the speed of parsing. The
only thing that slows parsing down is lookahead and backup. As long
as only the next character (or token) needs to be examined, the
decision process is essentially trivial and independent of the number of
choices -- at least in a table-driven parser. And even in a recursive
descent parser, if you have more than a few choices you can always
use a better data structure to improve performance.
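
A minimal sketch of that single-token dispatch (the token names and
handlers here are made up for illustration): with one token of
lookahead, choosing among N productions is one table lookup,
whatever N is.

      enum Token { KW_IF, KW_WHILE, KW_RETURN, IDENT, NUM_TOKENS };

      typedef void (*ParseFn)(void);       /* one handler per production */

      extern ParseFn dispatch[NUM_TOKENS]; /* filled in before parsing */

      void parseStatement(Token lookahead)
      {
          dispatch[lookahead]();   /* O(1), independent of grammar size */
      }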


Actually, as I understand it, the general rule of thumb is that (outside
of optimization) the speed of compilation is essentially a function of
the time required to perform the lexical analysis, because that is the
only stage of the compiler that has to handle every character in the
input, and its overhead swamps the overhead of handling the tokens.
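
A sketch of why that is so (the table and function are hypothetical):
the scanner's inner loop is the one loop in the compiler that runs
once per input character, so its per-character cost sets the floor
for the whole compilation.

      enum CharClass { C_SPACE, C_ALPHA, C_DIGIT, C_OTHER };
      extern unsigned char charClass[256]; /* one lookup per character */

      const char *scanIdentifier(const char *p)
      {
          while (charClass[(unsigned char)*p] == C_ALPHA ||
                 charClass[(unsigned char)*p] == C_DIGIT)
              ++p;                  /* touches every character exactly once */
          return p;
      }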


Of course, when you add in optimization phases, the time for compilation
can quickly grow excessive.


And data structures are really important. In the IBM VisualAge C++
compiler, if I have a static table of 13,030 elements (which was derived
from a listing of Unicode characters), a compilation of the .cpp file
can take several hours. But if I split that up into 10 tables of 1,303
elements, the compile time drops to a minute or so. They are obviously
using a REALLY BAD data structure or algorithm somewhere.
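
For what it's worth, the workaround looks like this (Entry and the
table contents are stand-ins; only the shape of the split matters):

      struct Entry { unsigned short code; unsigned short value; };

      /* One 13,030-element initializer provoked the multi-hour
         compile; ten 1,303-element pieces compile in about a minute: */
      static const Entry table0[1303] = { /* first 1303 entries */ };
      static const Entry table1[1303] = { /* next 1303 entries */ };
      /* ... table2 through table9 ... */

      static const Entry *const part[10] = { table0, table1 /* ... */ };

      const Entry *lookup(int i)    /* index into the logical table */
      {
          return &part[i / 1303][i % 1303];
      }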




Michael Lee Finney