Related articles |
---|
Grammars with semantics haberg@matematik.su.se (2003-04-20) |
Re: Grammars with semantics rda@lemma-one.com (Rob Arthan) (2003-04-27) |
Re: Grammars with semantics haberg@matematik.su.se (2003-04-27) |
Re: Grammars with semantics haberg@matematik.su.se (2003-05-06) |
From: | haberg@matematik.su.se (Hans Aberg) |
Newsgroups: | comp.compilers |
Date: | 20 Apr 2003 17:37:19 -0400 |
Organization: | Mathematics |
Keywords: | parse, theory, question |
Posted-Date: | 20 Apr 2003 17:37:19 EDT |
Has somebody studied grammars G together with a semantics S in the sense
below? -- Some common grammars are ambiguous, but even with a universal
semantics, they may become unambiguous (meaning, different valid parses
produce the same semantic value).
Take the "calculator" grammar G = (T, N, P, E), terminals T = {i, `+',
`*', `(', `)'}, nonterminals N = {E, T, F}, sentence symbol E, and the
set of productions P containing:
P_1: E -> T
P_2: E -> E `+' T
P_3: T -> F
P_4: T -> T `*' F
P_5: F -> i
P_6: F -> `(' F `)'
The expression i + i*i has several parses, for example the leftmost (as in
LL, top-down, parsing) and rightmost (as in LR, bottom-up parsing):
E E
-> E + T -> E + T
-> T + T -> E + T * F
-> F + T -> E + T * i
-> i + T -> E + F * i
-> i + T * F -> E + i * i
-> i + F * F -> T + i * i
-> i + i * F -> F + i * i
-> i + i * i -> i + i * i
So this grammar us ambiguous, even though when imposing the condition that
the rightmost or leftmost derivation should be used, the ambiguity
disappears: So LL (if removing left recursion) and LR parsers would not
complain.
Now attach a semantics: There is a set of semantic values V, and each production
P_j: a_1 ... a_m -> b_1 ... b_n
is assumed to have an associated function
f_j: V^n -> V^m.
Each string s = s_1 ... s_n, where s_i in T union N, is successively
assigned values in V^n by using these functions f_j: Each terminal is
assumed to have a value. Then, in a derivation Z => s, when a production
P_j is applied, one can get a new value by the use of the function f_j on
the changed components to successively work towards the sentence symbol Z.
It then gets a value in V^1 = V, which is defined to be the semantic value
of the derivation Z => s then. If different derivations Z => s yield the
same semantic values, then this can be used as a semantic value of s.
Now, when I do that for the grammar G above, I get the same semantic value
for the expression i + i * i for the leftmost and the rightmost parses,
namely
(f_2(f_1(f_3(f_5(x_1))), x_2, f_4(f_3(f_5(x_3)), x_4, f_5(x_5)))) in V^1
if the value of the original string is (x_1, x_2, x_3, x_4, x_5) in V^5.
Thus the funny thing happens that even though there are different parses
of i + i*i, the "universal" semantic values agree.
Now, has somebody written this phenomenon down, with conditions
identifying semantically unambiguous grammars? References?
Hans Aberg * Anti-spam: remove "remove." from email address.
* Email: Hans Aberg <remove.haberg@member.ams.org>
* Home Page: <http://www.math.su.se/~haberg/>
* AMS member listing: <http://www.ams.org/cml/>
Return to the
comp.compilers page.
Search the
comp.compilers archives again.