Re: Prefix, infix and function-call and their implications in embedded language readability

"bartc" <bartc@freeuk.com>
Thu, 21 Jan 2010 21:13:04 GMT

From comp.compilers

Related articles
Prefix, infix and function-call and their implications in embedded lan pengyu.ut@gmail.com (Peng Yu) (2010-01-20)
Re: Prefix, infix and function-call and their implications in embedded gah@ugcs.caltech.edu (glen herrmannsfeldt) (2010-01-21)
Re: Prefix, infix and function-call and their implications in embedded herron.philip@googlemail.com (Philip Herron) (2010-01-21)
Re: Prefix, infix and function-call and their implications in embedded kkylheku@gmail.com (Kaz Kylheku) (2010-01-21)
Re: Prefix, infix and function-call and their implications in embedded bartc@freeuk.com (bartc) (2010-01-21)
Re: Prefix, infix and function-call and their implications in embedded monnier@iro.umontreal.ca (Stefan Monnier) (2010-01-25)


From:	"bartc" <bartc@freeuk.com>
Newsgroups:	comp.compilers
Followup-To:	comp.lang.misc
Date:	Thu, 21 Jan 2010 21:13:04 GMT
Organization:	Compilers Central
References:	10-01-069 10-01-076
Keywords:	design, syntax
Posted-Date:	21 Jan 2010 20:48:41 EST

[I'm ending the thread here, since we're drifting into punctuation theology. Feel
free to continue the argument in comp.lang.misc if you're so inclined. -John]

"Kaz Kylheku" <kkylheku@gmail.com> wrote in message
> On 2010-01-21, Peng Yu <pengyu.ut@gmail.com> wrote:
>> Consider the following three expressions, which are valid C,
>> mit-scheme and Mathematica expressions. There are of course many other
>> expressions that express the same thing in other languages, or in the
>> same language in different ways.
>>
>> 3+2*5>7
>> (> (+ 3 (* 2 5)) 7)
>> Greater[Plus[3,Times[2,5]],7]
>>
>> Apparently, at least to me, the first expression is the most readable.
>
> Really? What if we replace 2 3 5 7 by a b c d, and then change

OK, so we have b+a*c>d; that still looks readable to me.

> the meaning of the operators,

Perhaps one can do that in Scheme too: change the meaning of the operators.

> or give them a precedence you aren't
> accustomed to?

That would be ill-advised, a bit like mixing up the digits so that "3" means
five, "5" means two, and so on. These things are ingrained into most of us.

> What if 3+2*5>7 is actually a Smalltalk expression, such that it just
> means ((3+2)*5)>7?

Eh? What if <any arbitrary program code> is a Smalltalk expression?
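
To make the two readings concrete, here is a minimal C sketch. It uses the
a, b, c, d substitution from above (2, 3, 5, 7 become a, b, c, d), and the
function names are made up for illustration. The second function simply
writes out the grouping ((3+2)*5)>7 quoted above; it is not a Smalltalk
implementation.

```c
/* C's usual precedence: * binds tighter than +, which binds tighter than >. */
int c_reading(int a, int b, int c, int d)
{
    return b + a * c > d;          /* parsed as (b + (a * c)) > d */
}

/* The same tokens grouped strictly left to right, i.e. the
   ((3+2)*5)>7 reading from the quoted Smalltalk example. */
int left_to_right_reading(int a, int b, int c, int d)
{
    return ((b + a) * c) > d;
}
```

With a=2, b=3, c=5, d=7 both readings are true (13 > 7 and 25 > 7), so the
original example does not distinguish them; with d=20 the C reading is false
and the left-to-right reading is true.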

>> One possible reason is that we learn this algebraic notation much
>> earlier than the other two, in the same way that we can respond
>> to our native language (say, English) much faster than to a second
>> language (say, French).
>
> Another possible reason is that the algebraic notation has only a few
> operators, whose precedence you have memorized (and are assuming to
> hold true of the expression above).

> Would it still be readable if the grammar had 500 operators,
> arranged into 200 precedence levels?

If you didn't know them, then no. But the standard operators, the ones using
symbols, are few. No reason not to make use of them. And nothing in C stops
you defining alternate names for the operators and using function syntax:

  gt(add(3,mul(2,5)),7)
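
A minimal sketch of what that could look like, assuming plain int operands.
The names add, mul and gt are the ones used in the expression above; C does
not provide them, so they are defined here as ordinary functions.

```c
/* Named wrappers around the built-in operators (int operands assumed). */
int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }
int gt(int a, int b)  { return a > b; }   /* 1 if a > b, otherwise 0 */
```

With these definitions, gt(add(3,mul(2,5)),7) gives the same result as
3+2*5>7, with the grouping spelled out by the nested calls instead of by
precedence.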

> Another reason is that because you have a few operators, you can use
> special glyphs for them, which are distinct from numbers and variables.

Yes, that's why it is useful to use the special glyphs; it gives 'shape' to
an expression making it easier to grasp.

> That second Lisp notation is unambiguous. So we can replace all of the
> non-punctuation symbols, and still recognize the tree shape as being the
> same, provided we keep the parentheses in the printed notation as they are:
>
> (> (+ 3 (* 2 5)) 7)

Parentheses can optionally be used in the C version too.

> If we substitute the non-punctuation symbols of the infix expression, we
> are lost; there is no explicit grouping there to retain:
>
> G0001 G0002 G0003 G0004 G0005 G0006 G0007
>
> Can you remember that G0002 and G0004 are binary operators,
> and that G0004 has a higher precedence than G0002?

Again, nothing stops anyone putting parentheses here. Scheme (or whatever
the other syntax is) just happens to require them all the time.

(Actually it is this very monotony of Lisp-like syntaxes which makes them
difficult to grasp for some (most?) people. It is the variety of syntax and
use of special glyphs that makes other languages easier on the eye.)

> When prefix notations get long, we can easily break them into multiple
> lines using a few simple guidelines, e.g.:
>
> (G0001 (G0002 G0003
>               (G0004 G0005 G0006))
>        G0007)
>
> This way we can easily visualize the structure as a tree printed sideways.

        3
    +
            2
        *
            5
>
    7

It doesn't do much for me however... (and I think it might be reversed left
and right).
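
For anyone who wants to generate that kind of layout, here is a minimal C
sketch of a sideways tree printer. The struct and function names are made up
for illustration, and it assumes the convention used in the drawing above:
the left operand goes on the lines above its operator, the right operand
below, with four spaces of indentation per level.

```c
#include <stdio.h>

/* A binary expression node; leaves have NULL children. */
struct node {
    const char *label;
    const struct node *left;
    const struct node *right;
};

/* In-order walk: left subtree first, then the node itself indented
   by 4 spaces per level, then the right subtree. */
void print_sideways(const struct node *n, int depth, FILE *out)
{
    if (n == NULL)
        return;
    print_sideways(n->left, depth + 1, out);
    fprintf(out, "%*s%s\n", depth * 4, "", n->label);
    print_sideways(n->right, depth + 1, out);
}
```

Calling print_sideways(root, 0, stdout) on the tree for 3+2*5>7 should
reproduce the drawing above. Swapping the two recursive calls puts the right
operand on top instead, which is the other orientation.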

> Suppose that small subexpressions found in a 500,000 line program
> are all beautifully micro-readable. Suppose you need to make a
> small change to one of them. What if it turns out that the program
> has 10,000 other expressions similar to that one (but not exactly
> the same), and they /all/ have to be found and changed in an
> analogous way in order for your proposed change to work properly?
> Oops.

In what way is that different to changing 10000 expressions in the Scheme
code?

--
bartc
