Technical question about yacc and LALR parsers: my answer

jar@florida.HQ.Ileaf.COM (Jim Roskind x5570)
Mon, 22 Apr 91 16:13:24 EDT

          From comp.compilers

Related articles
Technical question about yacc and LALR parsers: my answer jar@florida.HQ.Ileaf.COM (1991-04-22)
| List of all articles for this month |
Newsgroups: comp.compilers
From: jar@florida.HQ.Ileaf.COM (Jim Roskind x5570)
Keywords: yacc, design
Organization: Compilers Central
References: <9104210204.AA07587@florida.HQ.Ileaf.COM>
Date: Mon, 22 Apr 91 16:13:24 EDT

I asked why (if indeed it is true) the token (terminal or non-terminal) to
the left of the carrot was fixed for each state in an LR parser. After
thinking about it some more, I realized the answer, and saw some other
interesting things.


Yes, the token is fixed. The following paragraph is the proof (by
contradiction):


Assume some state had two distinct tokens to the left of the carrot. We
would never be able to isolate a single specific rule to reduce the stack
through this state to a lower point on the stack! Basically, the stack
could never get smaller than the point at which this evil state was
entered, *unless* we were willing to do a reduction that had two
possibilities for a token at the corresponding evil point in the rule.
When two rules differ in their right hand side, they are distinct rules.
If the LR parser can't tell us which one it used, then it is not actually
an LR parser, it is more of an "LR almost parser" (i.e., it almost parses
a sentence, but it doesn't uniquely label all of the productions).


...anyway, it seemed interesting to me, ... and I hope the people who
pondered it also found it interesting.


>From a practical standpoint, an "LR almost parser" is actually useful when
the action code provided for some set of rules is identical. One common
practical example of rules with "identical" code, that can be isolated by
a parser generator, are rules with *no* action code. As a result, a state
that reduced one of several rules (all of which had either no code, or
identical code) could be used to reduce below an evil state. A second
example would would be (again verifiable by the parser generator) when a
"grammar preprocessor" back-substituted some rules and in doing so,
replicated the code actions.


For a long time I have been wondering about parser optimization that can
be based on "common action code", as well as "missing action code." An
interesting question (yeah, I know, I think a lot of questions are
interesting) is how much improvement could be seen my merging states and
generating "almost parsers."


Jim Roskind- Author of a YACCable C++ grammar
Independent Consultant
(407)729-4348 or (617)290-4990 x5570
jar@hq.ileaf.com or ...!uunet!leafusa!jar
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.