Grammars with semantics (Hans Aberg)
20 Apr 2003 17:37:19 -0400

          From comp.compilers

Related articles
Grammars with semantics (2003-04-20)
Re: Grammars with semantics (Rob Arthan) (2003-04-27)
Re: Grammars with semantics (2003-04-27)
Re: Grammars with semantics (2003-05-06)
| List of all articles for this month |

From: (Hans Aberg)
Newsgroups: comp.compilers
Date: 20 Apr 2003 17:37:19 -0400
Organization: Mathematics
Keywords: parse, theory, question
Posted-Date: 20 Apr 2003 17:37:19 EDT

Has somebody studied grammars G together with a semantics S in the sense
below? -- Some common grammars are ambiguous, but even with a universal
semantics, they may become unambiguous (meaning, different valid parses
produce the same semantic value).

Take the "calculator" grammar G = (T, N, P, E), terminals T = {i, `+',
`*', `(', `)'}, nonterminals N = {E, T, F}, sentence symbol E, and the
set of productions P containing:
        P_1: E -> T
        P_2: E -> E `+' T
        P_3: T -> F
        P_4: T -> T `*' F
        P_5: F -> i
        P_6: F -> `(' F `)'

The expression i + i*i has several parses, for example the leftmost (as in
LL, top-down, parsing) and rightmost (as in LR, bottom-up parsing):
        E E
        -> E + T -> E + T
        -> T + T -> E + T * F
        -> F + T -> E + T * i
        -> i + T -> E + F * i
        -> i + T * F -> E + i * i
        -> i + F * F -> T + i * i
        -> i + i * F -> F + i * i
        -> i + i * i -> i + i * i
So this grammar us ambiguous, even though when imposing the condition that
the rightmost or leftmost derivation should be used, the ambiguity
disappears: So LL (if removing left recursion) and LR parsers would not

Now attach a semantics: There is a set of semantic values V, and each production
        P_j: a_1 ... a_m -> b_1 ... b_n
is assumed to have an associated function
        f_j: V^n -> V^m.
Each string s = s_1 ... s_n, where s_i in T union N, is successively
assigned values in V^n by using these functions f_j: Each terminal is
assumed to have a value. Then, in a derivation Z => s, when a production
P_j is applied, one can get a new value by the use of the function f_j on
the changed components to successively work towards the sentence symbol Z.
It then gets a value in V^1 = V, which is defined to be the semantic value
of the derivation Z => s then. If different derivations Z => s yield the
same semantic values, then this can be used as a semantic value of s.

Now, when I do that for the grammar G above, I get the same semantic value
for the expression i + i * i for the leftmost and the rightmost parses,
        (f_2(f_1(f_3(f_5(x_1))), x_2, f_4(f_3(f_5(x_3)), x_4, f_5(x_5)))) in V^1
if the value of the original string is (x_1, x_2, x_3, x_4, x_5) in V^5.

Thus the funny thing happens that even though there are different parses
of i + i*i, the "universal" semantic values agree.

Now, has somebody written this phenomenon down, with conditions
identifying semantically unambiguous grammars? References?

    Hans Aberg * Anti-spam: remove "remove." from email address.
                                    * Email: Hans Aberg <>
                                    * Home Page: <>
                                    * AMS member listing: <>

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.