Re: JavaCUP and actions

Chris F Clark <cfc@shell01.TheWorld.com>
28 Apr 2005 14:56:26 -0400

          From comp.compilers

Related articles
JavaCUP and actions stofte@gmail.com (Svend Tofte) (2005-04-26)
Re: JavaCUP and actions cdodd@acm.org (Chris Dodd) (2005-04-28)
Re: JavaCUP and actions cfc@shell01.TheWorld.com (Chris F Clark) (2005-04-28)
| List of all articles for this month |
From: Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups: comp.compilers
Date: 28 Apr 2005 14:56:26 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 05-04-069
Keywords: parse, LALR
Posted-Date: 28 Apr 2005 14:56:26 EDT

Svend Tofte asked an excellent question about delaying actions, with
the following grammar fragment:


E ::= {: System.out.print("<E>"); :}
                  E
                  PLUS:p
            {: System.out.print(p); :}
                  T
            {: System.out.print("</E>"); :}


It is possible to delay the actions, but it is actually quite tricky
to do so, and your grammar fragment illustrates why. When parsing an
E, the grammar wants to immediately print out a message "<E>".
However, note that it is going to immediately make a left-recursive
call to the E rule. So, it may need to print out another "<E>" also.
In fact, it may need to print out an unbounded number of "<E>"s at the
start of an E rule, as many as it needs to dive into nested E rules to
satisfy the left recursion.


BTW, this is why the grammar isn't LL(k). An LL parser generator,
needs to know by looking at the first k tokens of a rule whether it
applies or not. It cannot left recurse, because if it did, you get
this unbounded guessing problem.


In contrast LR parser generators, push the necessary information onto
the stack, and ignore the left recursion issue, knowing that some k
number of tokens after the recursive rule has completed, the parser
can figure out whether it recursed or not. However, any actions that
occur before the end of the rule, force the parser generator to decide
early and may cause conflicts and generally will if the rule is left
recursive. (See the moderators note about the typical implementation
strategy. It isn't the only one, but all strategies have roughly
equivalent issues.)


Hence, the moderator's suggestion, put any actions at the end of the
rule. This allows the LR parser generator to avoid the conflicts.
and an even better strategy is to build an AST and do the actions as a
walk of the AST. It moves your actions from being online to offline,
but it makes many things much simpler.


If you really need an online parse, you actually might be happier with
a top-down (LL) tool. The strictures of LL require you to write your
grammar such that it can be processed online. If you use an LR tool,
the advantage in left recursion (and thus more natural to my mind
grammars) is inherently due to its offline nature. Any place an LR
grammar is not also LL, is a place where there is potentially the
ability to look past an unbounded number of tokens before knowing what
to parse, which is essentially a requirement to do the parse offline.


Hope this helps,
-Chris


*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
23 Bailey Rd voice : (508) 435-5016
Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)
------------------------------------------------------------------------------


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.