Re: Why context-free?

nmm1@cus.cam.ac.uk (Nick Maclaren)
8 Oct 2005 17:26:27 -0400

          From comp.compilers

Related articles
Why context-free? nmm1@cus.cam.ac.uk (2005-10-06)
Re: Why context-free? cfc@shell01.TheWorld.com (Chris F Clark) (2005-10-07)
Re: Why context-free? torbenm@app-4.diku.dk (2005-10-07)
Re: Why context-free? rsc@swtch.com (Russ Cox) (2005-10-07)
Re: Why context-free? bobduff@shell01.TheWorld.com (Robert A Duff) (2005-10-07)
Re: Why context-free? nmm1@cus.cam.ac.uk (2005-10-08)
Re: Why context-free? vidyut.vidyut@gmail.com (2005-10-08)
Re: Why context-free? gah@ugcs.caltech.edu (glen herrmannsfeldt) (2005-10-09)
Re: Why context-free? mpah@thegreen.co.uk (2005-10-09)
Re: Why context-free? nmm1@cus.cam.ac.uk (2005-10-09)
Re: Why context-free? rfigura@erbse.azagtoth.de (Robert Figura) (2005-10-10)
Re: Why context-free? boldyrev@cgitftp.uiggm.nsc.ru (Ivan Boldyrev) (2005-10-10)
[28 later articles]
| List of all articles for this month |

From: nmm1@cus.cam.ac.uk (Nick Maclaren)
Newsgroups: comp.compilers
Date: 8 Oct 2005 17:26:27 -0400
Organization: University of Cambridge, England
References: 05-10-053 05-10-061
Keywords: design, parse, comment
Posted-Date: 08 Oct 2005 17:26:26 EDT

Robert A Duff <bobduff@shell01.TheWorld.com> wrote:
>
>> I have been thinking about a programming language, and have good
>> reasons to abandon context-free grammars completely.
>
>This portends a delicious language-design discussion, if our esteemed
>moderator allows such (as opposed to pure compiler-specific
>discussions). ;-)


Thanks for all the responses. I shall reply here, as spawning threads
is merely confusing :-)


>I favor context-free grammars because I understand them, both as a user
>of a language, and as a compiler writer.


Yes. One thing that I should have made clear, and clearly didn't, is
that I fully agree that one should design a mathematically clean
grammar and work from that. While there ARE people who can code a
parser and end up with an unambiguous language, I am not one. Nor
were Backus, Kernighan and Ritchie :-)


I am specifically referring to using a grammar that includes context.
One obvious solution is to go in for van Wijngaarden grammars and
related technologies, but I have never found a comprehensible
description. In fact, I am 90% certain that they are out of fashion
for precisely that reason!


>Now, please tell us, why you "have good reasons to abandon" them
>"completely". Tell us the good reasons. And the alternative.


Right. Firstly, I want to bring types into the grammar, in ways that
are similar to Algol 68 but more extensive. For example:


<conditional expression> ::=
        '(' <integer expression> ? <type_1 expression> ':'
                <type_1 expression> ':' <type_1 expression> |
        '(' <boolean expression> ? <type_2 expression> ':'
                <type_2 expression> ')'


might allow three values for integers (negative, zero and positive)
and constrain the value types to be the same. Now, this can clearly
be done using two-level grammars, but my problem with them is as
above. That example can be, and was done, in Algol 68.


Secondly, I can see no good reason to stop programs from extending
the syntax by adding such rules. With tolerable constraints, it is
easy for the compiler to detect the introduction of an ambiguity
(regular expression overlap is a solved problem - or, at least, I
invented an algorithm and could publish if it has not already been
done).


Thirdly, I want to bring some of the scoping rules into the grammar,
for things like exception handling, breaking out of loops etc.
Even Algol 68 tended to use English for some of this, so I clearly
cannot do it completely - but it isn't hard to add extra predicates
to a basic BNF-style grammar.


Note that I am tending to assume a recursive descent parser, but one
where certain, clearly-defined aspects of the context are visible
to the parsing rules. Provided that you ensure that all such context
is visible before it is needed, there is no difficulty - but I remain
amazed by how many languages (INCLUDING Algol 68, Fortran and C) have
missed that obvious point.


>> [The best argument I've heard is that to a first approximation, CFGs
>> match languages that people can understand. Of course, since I write
>> everything in perl these days, I suppose I don't believe that, either.]
>
>Sorry, but I'm not impressed with Perl.


Oh, it's clearly a CFG - Compiles Files of Gibberish. I use Python
because, when I make a mistake, it will usually tell me I am an idiot.


Perl is more forgiving, and will do what I might have meant in another
universe - I checked every single branch and path in my first 24-line
Perl program on artificial data, and it then failed when I first used
it, because I had misread the description and made a syntax error.
I have never used it since when anything else will do, and regard
it as far too dangerous ever to use as superuser.


[ Oh, I was also the person who did a draft port of Perl 4 to MVS
(think K&R,ASCII,Unix => ISO,EBCDIC,MVS), which coloured my view
of it as a piece of software engineering. ]


My intent is to be as far from Perl as possible :-)




Regards,
Nick Maclaren.


[Odd, I never have that problem writing perl applications, and mine
are somewhat longer than 24 lines. Maybe I'm better at reading the
manual, although I admit to staying away from the internal
implementation.


With respect to the prior discussion, I would worry about error
diagnostics and general readability. When you start building types
into the syntax, that means that many type errors now are likely to
produce "syntax error" rather than "string found where boolean
expected", and I have to say your 2 vs. 3 way branch is grosser than
anything I've done in perl. As far as extending the syntax on the
fly, that avenue was extensively investigated in the 1970s in
languages like IMP-72 and EL/1, all of which died. We discovered that
if you can write every program in a slightly different language, six
months later you won't remember what each language was and you won't
be able to read any of them. OOP style type extensions seem to be
reasonably easy to understand, but new and improved ternary operators
and mobius loops are not. -John]


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.