Semantic actions in LR parser

xorian@solomon.technet.sg (Xorian Technologies)
Thu, 15 Apr 1993 09:35:59 GMT

          From comp.compilers

Related articles
Semantic actions in LR parser cosc19py@jane.uh.edu (1993-04-02)
Re: Semantic actions in LR parser roy@prism.gatech.edu (1993-04-02)
Re: Semantic actions in LR parser karsten@tfl.dk (1993-04-03)
Re: Semantic actions in LR parser mauney@csljon.csl.ncsu.edu (1993-04-05)
Semantic actions in LR parser clark@zk3.dec.com (1993-04-06)
Re: Semantic actions in LR parser wand@ccs.northeastern.edu (1993-04-07)
Semantic actions in LR parser clark@zk3.dec.com (1993-04-14)
Semantic actions in LR parser xorian@solomon.technet.sg (1993-04-15)
| List of all articles for this month |
Newsgroups: comp.compilers
From: xorian@solomon.technet.sg (Xorian Technologies)
Keywords: parse, LR(1)
Organization: Compilers Central
References: 93-04-008 93-04-010
Date: Thu, 15 Apr 1993 09:35:59 GMT

roy@prism.gatech.edu (Roy Mongiovi) writes:
>LR parsers can only perform semantic actions when the recognize a handle
>(right-hand side). You can either split up the right-hand sides into
>pieces so that the pieces end where you need the semantic actions, or you
>can stick in epsilon productions whose only purpose is to cause semantic
>actions.


karsten@tfl.dk (Karsten Nyblad) writes:
>Even that can be generalized. All items of the kernel of a state has the
>same symbol before the . of the item, where the . denotes the point until
>which the parser has accepted the symbols of the production of the item.
>If the same action has been specified on all productions of the items of
>the kernel of a state on pushing the symbol before the ., then that action
>can executed by the parser.


The principal problem with the above analysis is that it does not address
the grammar from the point of view of the language developer. In
particular, language developers would prefer not to think in terms of
states and items much less their manipulation.


Fortunately, in practice, the problem is usually not nearly so arbitrary
as the above discussion would imply (though, certianly in theory, it can
be). In fact, the problems associated with intermediate parser actions
(namely reduce-reduce conflicts) are more often than not *introduced* by
the traditional grammar formalism.


As I am certain Chris Clark is aware, there is a more natural approach to
grammar specification which better allows for intermediate actions in
productions: regular expressions. By regular expressions, though, I don't
just mean a preprocessor which converts regular expression grammars into
fixed expression grammars. That is what causes the problems in the first
place. Instead, to work, the regular expression must be taken right down
into the parser engine.


The result of using regular expressions to specify grammars is that
seemingly different productions are coallessed into a single production.
The implication is obvious: fewer items and, therefore, less chance of
conflicts associated with intermediate actions. Of course, this requires
that a single action be specified for the new single production but that
was already a requirement as previously discussed. In practice, even naive
users tend to write regular expression grammars that avoid the
redundencies (and intermediate reductions) of fixed expression grammars
and are therefore more suitable for intermediate actions and the
availability of regular expressions provide a more natural mechanism for
the resolution of conflicts that might arise.


The essential problem is that an intermediate action, no matter how it is
implemented, introduces a requirement for the parser to decide what
production it is working on. This is inherently contrary to the nature of
LR parsing which seeks to defer the decision until reduction. Regular
expression grammars avoid this delimna by eliminating many shift-reduce
choices by converting them into paths in the regular expression.


To see just how far this can be taken, consider your favorite grammar.
Rewrite (top to bottom or vice versa) by making macro substitutions by
hand to bring productions together. The only circumstance where
productions cannot be so merged is when a nonterminal is used more than
once in a nontrivial construct. For example, most of the productions of a
ANSI C expression can be brought together into a single regular expression
of nested repetitions. Similarly, much of the complexity of declarations
can be removed.


None of this, of course, is new in theory but there are few production
quality compiler-compilers which support regular expressions. One such
product is LADE which not only allows regular expressions but provides
considerable semantic support for them so that you don't have to pick
through the parse stack trying to figure out, for example, whether or not
an optional construct occured.


I strongly recommend that anyone interested in the issue of attaching
intermediate actions to productions look into regular expression grammars
and their facility for supporting, at the user level, what is needed at
the machine level.


(For more information on LADE, email: xorian@solomon.technet.sg or fax:
(65)253-7709.)
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.