Is the dangling else a syntax bug?

vbdis@aol.com (VBDis)
28 Jun 2001 23:49:56 -0400

          From comp.compilers


From: vbdis@aol.com (VBDis)
Newsgroups: comp.compilers
Date: 28 Jun 2001 23:49:56 -0400
Organization: AOL Bertelsmann Online GmbH & Co. KG http://www.germany.aol.com
Keywords: parse, question
Posted-Date: 28 Jun 2001 23:49:56 EDT

In a discussion about statement separators vs. terminators, I found
the C syntax to be a strange mixture of terminated and unterminated
statements. E.g. the IF statement has no terminator of its own, so we
find a "dangling else" in:


    if (a) if (b) stmt1; else stmt2;


Here I intentionally added the semicolons to the statements, to make the next
examples more obvious. If the IF statement had its own terminator, the above
example would produce a syntax error, and the only allowed forms would be:


    if (a) if (b) stmt1; ; else stmt2; ;
or
    if (a) if (b) stmt1; else stmt2; ; ;


where the added semicolons properly terminate the two IF statements.
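
For comparison, here is a minimal sketch of what standard C does with the original, unterminated form: the else is attached to the nearest preceding IF that lacks one. The function name dangling_else and the variable taken are only illustrative, they are not part of the discussion above.

```c
/* Reports which branch standard C takes for the unterminated form.
   Returns 1 if stmt1 would run, 2 if stmt2 would run, 0 if neither.
   (dangling_else and taken are hypothetical names for this sketch.) */
int dangling_else(int a, int b) {
    int taken = 0;
    /* Same shape as the one-line example; a compiler may warn about it. */
    if (a) if (b) taken = 1; else taken = 2;
    return taken;
}
```

Calling dangling_else(0, 0) yields 0, so stmt2 is not reached when a is false: the else belongs to the inner IF. In the hypothetical language with an IF terminator, the programmer would instead have to pick one of the two terminated forms above explicitly.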


Of course such a language is unreadable, but the example immediately makes
sense when we rewrite it as:


    if a then if b then stmt1 endif else stmt2 endif
or
    if a then if b then stmt1 else stmt2 endif endif
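
To illustrate why the endif removes the ambiguity, here is a rough sketch that decides which IF owns the else by plain counting, with no disambiguation rule needed. It is only a token scan, not a full parser, and it assumes blank-separated tokens; the function name else_owner_depth is made up for this example.

```c
#include <string.h>

/* Returns the nesting depth (1 = outermost) of the IF that owns the first
   ELSE in src, or 0 if there is no ELSE. Assumes tokens are separated by
   single or multiple blanks. (Hypothetical helper for illustration.) */
int else_owner_depth(const char *src) {
    int depth = 0;
    const char *p = src;
    while (*p != '\0') {
        while (*p == ' ') p++;              /* skip blanks before a token */
        size_t len = strcspn(p, " ");       /* length of the next token   */
        if (len == 0) break;                /* only trailing blanks left  */
        if (len == 2 && strncmp(p, "if", 2) == 0)
            depth++;                        /* an IF opens a new level    */
        else if (len == 5 && strncmp(p, "endif", 5) == 0)
            depth--;                        /* its terminator closes it   */
        else if (len == 4 && strncmp(p, "else", 4) == 0)
            return depth;                   /* ELSE belongs to open level */
        p += len;
    }
    return 0;
}
```

For the first form the function reports depth 1 (the outer IF), for the second form depth 2 (the inner IF), so with endif the owner of the else is fixed by the text itself.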


In a shorter form, which doesn't require so much typing, we could rewrite this
as:


    if a { if b { stmt1 } else stmt2 }
or
    if a { if b { stmt1 else stmt2 } }


Here two abbreviations have been used. First, the superfluous
parentheses morphed into the braces around the statements, so that
"{" stands for "then" and "}" for "endif". Second, the semicolons
have been removed from the statements.


In a third step, the statements can now become statement lists as
well, where the statements are not /terminated/ by semicolons, but
instead are /separated/ from each other. This means that most blocks
become obsolete, because in this new syntax no difference exists
between single statements and lists of statements.


It was quite surprising to me that lists with separated items are the
more natural and economical approach, in contrast to the commonly used
statement terminators!


Lists are used in almost every grammar, e.g. as parameter lists, so
why not reuse this construct in all other places, as far as
applicable?


IMO it's also easier to parse lists, where every occurrence of the
unique separator results in another iteration, and every other token
terminates the list?
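
A minimal sketch of such a list parser, assuming ';' as the separator and simple identifiers as items; the names skip_blanks and parse_separated_list are invented for this example:

```c
#include <ctype.h>

/* Skips blanks and tabs, returns a pointer to the next other character. */
static const char *skip_blanks(const char *p) {
    while (*p == ' ' || *p == '\t') p++;
    return p;
}

/* Parses  item { ';' item }  where an item is a letter followed by letters
   or digits. Returns the number of items, or -1 if an item is missing.
   *rest is set to the first character that was not consumed.
   (Hypothetical helper for illustration.) */
int parse_separated_list(const char *src, const char **rest) {
    const char *p = skip_blanks(src);
    int count = 0;
    for (;;) {
        if (!isalpha((unsigned char)*p)) {  /* an item must start here    */
            *rest = p;
            return -1;
        }
        while (isalnum((unsigned char)*p)) p++;
        count++;
        p = skip_blanks(p);
        if (*p != ';') break;               /* any other token ends list  */
        p = skip_blanks(p + 1);             /* separator: one more round  */
    }
    *rest = p;
    return count;
}
```

Given "stmt1; stmt2; stmt3 endif" the function counts three items and leaves rest at "endif", without knowing anything about that token. A trailing separator, as in "a; }", is reported as a missing item.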


If different terminators are used, as far as any are required at all, then
backtracking after syntax errors should also be easier?


BTW, I know that here I didn't invent anything new ;-)


Except perhaps for the initial question: is mixing terminated and
unterminated items (statements) in the same grammar a violation of
basic design principles? Does anybody see a relationship between such
a mix and the classification of grammars?


DoDi

