Re: Trouble understanding LL(k) vs. LALR(k) etc.

Chris F Clark <cfc@shell01.TheWorld.com>
21 Apr 2004 00:45:25 -0400

From comp.compilers

Related articles
[4 earlier articles]
Re: Trouble understanding LL(k) vs. LALR(k) etc. derkgwen@HotPOP.com (Derk Gwen) (2004-03-26)
Re: Trouble understanding LL(k) vs. LALR(k) etc. cfc@shell01.TheWorld.com (Chris F Clark) (2004-03-26)
Re: Trouble understanding LL(k) vs. LALR(k) etc. f40mczf02@sneakemail.com (Karsten Nyblad) (2004-04-03)
Re: Trouble understanding LL(k) vs. LALR(k) etc. clint@0lsen.net (Clint Olsen) (2004-04-14)
Re: Trouble understanding LL(k) vs. LALR(k) etc. cfc@shell01.TheWorld.com (Chris F Clark) (2004-04-15)
Re: Trouble understanding LL(k) vs. LALR(k) etc. clint@0lsen.net (Clint Olsen) (2004-04-15)
*Re: Trouble understanding LL(k) vs. LALR(k) etc. cfc@shell01.TheWorld.com (Chris F Clark)* (2004-04-21)**

| List of all articles for this month |

From:	Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups:	comp.compilers
Date:	21 Apr 2004 00:45:25 -0400
Organization:	The World Public Access UNIX, Brookline, MA
References:	04-03-042 04-03-057 04-03-072 04-03-091 04-04-029 04-04-049 04-04-054
Keywords:	parse
Posted-Date:	21 Apr 2004 00:45:25 EDT

Another follow-up question from Clint Olsen:
> Thanks for the detailed explanation. I did have a question about the
> stack though. It seems in this case your stack would grow to
> accomodate the entire match - similar to using right recursion in
> traditional yacc. Do you have issues with extremely deep stack growth
> using this technique? Imagine a case where you could have hundreds of
> identifiers separated by commas. You'd have to snarf all of that
> before calling the rule.

Yes, the stack does grow to accomodate the entire match, just as if
one used right recursion. In today's 32 bit (or wider) world, deep
stack growth is not an issue. Even in the 16 bit world, it wasn't a
serious issue, because the stack will grow if it ever fills up and 65k
items is still relatively generous.

In early versions of Yacc++ (especially the ones servicing DOS), it
was somewhat more of an issue. The default stack size was 8k and some
users used to select even a smaller size (and in the early versions
the upper bound was fixed and did not grow once set).

However, we put features in to mitigate that problem (and they are
useful even when stack size isn't a problem). The most important
feature was that tokens (and non-terminals) could be declared "remove",
in which case shifting them did not add them to the parser stack, but
instead dropped the semantic value on the floor (or deleted it).

So, in the case of a comma separated list, one might choose to remove
the commas and then only the items being separated would appear on the
stack. That will cut the number of entries in half. More
importantly, those entries also will be eliminated from when you are
iterating, which means one can save time as well as space.

Another strategy we recommended was removing the top-most items,
assuming that one would handle their semantics at the rule where they
were defined and not after swallowing them into a list. While not
universally true, this model is particularly good for "servers" when
each request is a top-level item that is serviced immediately after
being parsed. In that model, the top level list is potentially
infinite and one doesn't ever want or care about the list as a whole.

But back to the point, our experience has been that snarfing all the
objects into an array is sufficiently efficient that it compensates
for the fact that you have to have all the objects to do so. In
general, one needs all the objects around (in virtual memory) anyway.

The overhead of the array is 1 pointer per entry, where as a linked
list requires 2 pointers per entry, unless one puts the next field
into the object (and then one has extra next fields for singleton
lists, which is it's own problem). However, "your mileage may vary"
and for some projects our choices are not optimal. The time to
determine that is in the tuning phase. Most of the time, our choice
will be good enough (and in fact, so would have been a linked list
with either left or right recursion). Those times where it isn't good
enough, one needs to measure, experiment, and re-measure, perhaps
several times, to find the right choice. And, it is in those cases
where we feel our product really shines, because you can choose how
you want things implemented--an array, a linked list, using a regular
expression, or using recursion, and several other options. The casual
user might not appreciate (or even notice) all the flexibility we
tried to build into our tool, but it is there for the person who needs
it.

Hope this helps,
-Chris

*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
23 Bailey Rd voice : (508) 435-5016
Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Trouble understanding LL(k) vs. LALR(k) etc.

Chris F Clark <cfc@shell01.TheWorld.com>21 Apr 2004 00:45:25 -0400

Chris F Clark <cfc@shell01.TheWorld.com>
21 Apr 2004 00:45:25 -0400