Re: Grammars with semantics

Rob Arthan <rda@lemma-one.com>
27 Apr 2003 02:29:17 -0400

From comp.compilers

Related articles
Grammars with semantics haberg@matematik.su.se (2003-04-20)
*Re: Grammars with semantics rda@lemma-one.com (Rob Arthan)* (2003-04-27)**
Re: Grammars with semantics haberg@matematik.su.se (2003-04-27)
Re: Grammars with semantics haberg@matematik.su.se (2003-05-06)

| List of all articles for this month |

From:	Rob Arthan <rda@lemma-one.com>
Newsgroups:	comp.compilers
Date:	27 Apr 2003 02:29:17 -0400
Organization:	Lemma 1 Ltd.
References:	03-04-067
Keywords:	parse
Posted-Date:	27 Apr 2003 02:29:16 EDT

Hans Aberg wrote:

> Has somebody studied grammars G together with a semantics S in the sense
> below? -- Some common grammars are ambiguous, but even with a universal
> semantics, they may become unambiguous (meaning, different valid parses
> produce the same semantic value).
>
> Take the "calculator" grammar G = (T, N, P, E), terminals T = {i, `+',
> `*', `(', `)'}, nonterminals N = {E, T, F}, sentence symbol E, and the
> set of productions P containing:
> P_1: E -> T
> P_2: E -> E `+' T
> P_3: T -> F
> P_4: T -> T `*' F
> P_5: F -> i
> P_6: F -> `(' F `)'
>
> The expression i + i*i has several parses, for example the leftmost (as in
> LL, top-down, parsing) and rightmost (as in LR, bottom-up parsing):
>...
> So this grammar us ambiguous,

No, it's not. Top-down and bottom-up parsing produce the same parse-trees,
but they construct the nodes in different orders. An ambiguous grammar for
this language is one like:

E -> i | '(' E ')' | | E '*' E | 'E' '+' 'E'

where there are several different ways of parsing say 'x + y * z * w'
corresponding to different ways of bracketing the operators.

[ I think you meant the F in the P_6 to be an E, but it's not important in
your example. ]

Very few grammars are "unambiguous" in the non-standard sense in which you
are using the term (essentially, only those in which the parse-trees
degenerate into lists).

> even though when imposing the condition that
> the rightmost or leftmost derivation should be used, the ambiguity
> disappears: So LL (if removing left recursion) and LR parsers would not
> complain.
>
>...
> Now, when I do that for the grammar G above, I get the same semantic value
> for the expression i + i * i for the leftmost and the rightmost parses,
> namely
> (f_2(f_1(f_3(f_5(x_1))), x_2, f_4(f_3(f_5(x_3)), x_4, f_5(x_5)))) in
> V^1
> if the value of the original string is (x_1, x_2, x_3, x_4, x_5) in V^5.
>
> Thus the funny thing happens that even though there are different parses
> of i + i*i, the "universal" semantic values agree.

The pattern of the applications of the f_i in your example is determined by
the structure of the parse-tree - as this is the same regardless of the
order in which you construct the nodes of the tree, the resulting
synthesized semantic value will be the same. What you're describing as the
"universal semantic value" _is_ the parse-tree.

In an implementation, if the evaluation of the f_i had side-effects, then
those side-effects would occur in a different sequence and that would
change the resulting semantic value.

>
> Now, has somebody written this phenomenon down, with conditions
> identifying semantically unambiguous grammars? References?
>

What you've described amounts to the observations that an unambiguous
grammar (in the usual sense of the term) assigns a unique parse-tree to
every sentence in its language and that evaluation of a side-effect free
function by recursion over the structure of a parse-tree is independent of
the order in which you calculate the values on the subtrees. This doesn't
sound deep enough for anyone to have written it up in those words - perhaps
you have a deeper question in mind that doesn't come over in your example.

Rob Arthan (rda@lemma-one.com)

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Grammars with semantics

Rob Arthan <rda@lemma-one.com>27 Apr 2003 02:29:17 -0400

Rob Arthan <rda@lemma-one.com>
27 Apr 2003 02:29:17 -0400