From: frank@g-n-u.de
Newsgroups: comp.compilers
Date: 12 Jun 2005 21:19:36 -0400
Organization: G-N-U GmbH
References: 05-05-230 05-06-037 05-06-042
Keywords: Pascal, parse
Posted-Date: 12 Jun 2005 21:19:36 EDT
Hans-Peter Diettrich <DrDiettrich@compuserve.de> wrote:
> George Neuner wrote:
>
>> The Borland Pascal's used to contain the language BNF in the manual
>> appendixes. Did they discontinue this practice with Delphi?
>
> The practice is continued, but the grammar is neither complete nor valid
> nor up-to-date since the introduction of case-else.
>
> I found railroad diagrams supplied with Delphi 2, that already differed
> from what the compiler accepts. Later versions come with some kind of
> extended BNF syntax, with the same problems and a missing description
> for the grammar syntax or semantics.
>
> Some people have tried to construct EBNF grammars for various Delphi
> versions in the past, but the results are questionable, for several
> reasons:
>
> - "directives" are reserved words in specific context.
> - the semantics are too far away from the syntax, in detail
> - semicolons can have unexpected (context sensitive) effects.
>
> My favorite example:
>
>   case i of
>     0: if a then b
>     ; //<----- illegal, optional or required?
>     else c
>   end;
>
> Normally a semicolon is illegal in this place, because it's intended to
> only separate multiple case-labels. In this special case the marked
> semicolon is not optional, in fact it indicates whether the following
> dangling "else" is part of the "if" or of the "case" statement.
A nasty feature indeed, though mostly for the programmer, who can get
confused: it's similar to the dangling-else problem, yet a bit
different, so one has to pay attention. BTW, this feature already
existed in Turbo Pascal, while standard Pascal (Extended Pascal,
ISO 10206) uses `otherwise', not `else', in case-statements, which
avoids this confusion.
I just checked the manual (German version of Borland Pascal 7.0,
though I doubt it makes a difference). The diagrams are clearly
wrong indeed as they don't even allow that semicolon, though the
compiler seems to behave correctly (i.e., as you describe) in this
regard.
> Now try to construct an according context-free grammar :-(
Not very difficult actually. The following is an extract from the
corresponding rules in GNU Pascal. We use Bison precedence to avoid
the dangling-else S/R conflict in if-statements, and this (standard)
trick also resolves this problem in case-statements: If there's no
semicolon before the `else', it's bound to the if-statement due to
precedence; if there's one, the `optional_semicolon' before the
case-statement's `else' is the only way to parse it, so that's
what's done.
BTW, precedence declarations of course don't stop the grammar from
being context-free; they only tell Bison how to resolve the S/R
conflict (if you like, you could expand the grammar to roughly twice
as many rules without an S/R conflict). So while it may appear
context-sensitive to you intuitively, it actually isn't, as the effect
is still localized such that CFG rules can deal with it.
Note, the grammar handles Borland's `else' and standard Pascal's
`otherwise' together. Keywords are denoted `p_foo' in this grammar.
%nonassoc prec_if
%nonassoc p_else

%%

statement:
    if_then %prec prec_if
  | if_then p_else optional_statement
  | p_case expression p_of optional_case_element_list optional_case_completer p_end
  ;

if_then:
    p_if expression p_then optional_statement
  ;

optional_case_element_list:
    /* empty */
  | case_element_list optional_semicolon
  ;

case_element_list:
    case_element
  | case_element_list ';' case_element
  ;

case_element:
    expression ':' optional_statement
  ;

optional_semicolon:
    /* empty */
  | ';'
  ;

optional_case_completer:
    /* empty */
  | otherwise statement_sequence
  ;

otherwise:
    p_else
  | p_otherwise
  ;

statement_sequence:
    optional_statement
  | statement_sequence ';' optional_statement
  ;

optional_statement:
    /* empty */
  | statement
  ;
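To see the effect of the rules above without running Bison, here is a minimal hand-written sketch in Python. It is not GPC's parser; the names `Parser`, `parse` and the tuple-shaped trees are made up for illustration, expressions and statements are reduced to single names, and the greedy `else` in `if_statement` merely stands in for the shift that the precedence declarations select.

```python
import re

KEYWORDS = {"if", "then", "else", "case", "of", "end", "otherwise"}


def tokenize(text):
    # Names, integer labels, and the two punctuation marks we need.
    return re.findall(r"[A-Za-z_]\w*|\d+|[:;]", text)


class Parser:
    def __init__(self, text):
        self.toks = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.toks[self.pos] if self.pos < len(self.toks) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def optional_statement(self):
        tok = self.peek()
        if tok == "if":
            return self.if_statement()
        if tok == "case":
            return self.case_statement()
        if tok is not None and tok not in KEYWORDS and tok not in (";", ":"):
            return self.take()  # simple statement, reduced to a bare name
        return None  # empty statement

    def if_statement(self):
        self.take("if")
        cond = self.take()
        self.take("then")
        then_part = self.optional_statement()
        else_part = None
        # An `else' directly after the then-part is bound to this `if'.
        if self.peek() == "else":
            self.take()
            else_part = self.optional_statement()
        return ("if", cond, then_part, else_part)

    def case_statement(self):
        self.take("case")
        selector = self.take()
        self.take("of")
        elements = []
        while self.peek() not in ("else", "otherwise", "end"):
            label = self.take()
            self.take(":")
            elements.append((label, self.optional_statement()))
            if self.peek() != ";":
                break
            self.take()  # separator, or the optional trailing semicolon
        completer = None
        if self.peek() in ("else", "otherwise"):
            self.take()
            completer = [self.optional_statement()]
            while self.peek() == ";":
                self.take()
                completer.append(self.optional_statement())
        self.take("end")
        return ("case", selector, elements, completer)


def parse(text):
    parser = Parser(text)
    tree = parser.optional_statement()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected {parser.peek()!r}")
    return tree


without_semicolon = parse("case i of 0: if a then b else c end")
with_semicolon = parse("case i of 0: if a then b ; else c end")
```

In `without_semicolon` the `else' ends up inside the if-node; in `with_semicolon` the if-node has no else-part and `c' becomes the case completer, which is the behaviour described above.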
So that's actually a rather harmless problem from the grammar
perspective. Which doesn't mean that Borland's syntax extensions
aren't evil, of course. The most obnoxious extension is character
constants of the form `^c' (meaning Ctrl-C, i.e. Chr (3)). They
conflict heavily with the normal use of `^' in Pascal (pointer types
and dereferencing), and even Borland's own compilers can't handle
them in many situations. (I think GPC does quite a bit better than BP
by now, though we still don't handle all cases, and some are just too
silly to consider seriously.)
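For readers who haven't met these constants, a small Python sketch of the value such a constant denotes. It assumes the usual ASCII control-character convention (the character is upcased and 64 is subtracted), restricted here to `@' through `_'; the function name `ctrl_char_value` is made up, and the exact set of characters Borland accepts after `^' is not something this sketch claims to reproduce.

```python
def ctrl_char_value(ch):
    """Ordinal denoted by a constant of the form ^ch (assumed ASCII convention)."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    code = ord(ch.upper())
    # Assumption: only '@'..'_' (64..95) are handled in this sketch.
    if not 64 <= code <= 95:
        raise ValueError(f"no control character for {ch!r} in this sketch")
    return code - 64


ctrl_c = ctrl_char_value("c")  # the `^c' example from the text
escape = ctrl_char_value("[")
```

The arithmetic is trivial; the trouble is purely syntactic, since the parser sees the same `^' that introduces pointer types and follows dereferenced variables.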
This reminds me of the recent thread about hand-written vs.
generated parsers. Apparently, Borland employed a hand-written RD
parser in Turbo Pascal, so they didn't even notice the conflicts
they were creating. They handled these beasts in some common
situations they had in mind, apparently, and missed a lot of other
situations where expressions are also valid (and thus, such
constants should have been valid), where there would be serious
conflicts. If they had used a parser generator, I suppose they would
have noticed in time, before producing such a rubbish feature. Back
to the original question: AFAICS, their diagrams (for BP; I don't know
about Delphi) don't even mention these beasts, let alone tell
where they are accepted and where not. The same may be true for
other obscure features and/or restrictions (though there's nothing
that ever gets close to this).
Frank
--
Frank Heckenbach, frank@g-n-u.de, http://fjf.gnu.de/
GnuPG and PGP keys: http://fjf.gnu.de/plan (7977168E)
Pascal code, BP CRT bugfix: http://fjf.gnu.de/programs.html
Free GNU Pascal Compiler: http://www.gnu-pascal.de/