Re: Tomita parsing

Chris F Clark <cfc@world.std.com>
14 Dec 2003 22:09:57 -0500

From comp.compilers

Related articles
Tomita parsing nospam@nospam.nospam.nospam (David Fisher) (2003-12-13)
*Re: Tomita parsing cfc@world.std.com (Chris F Clark)* (2003-12-14)**

| List of all articles for this month |

From:	Chris F Clark <cfc@world.std.com>
Newsgroups:	comp.compilers
Date:	14 Dec 2003 22:09:57 -0500
Organization:	The World Public Access UNIX, Brookline, MA
References:	03-12-089
Keywords:	parse
Posted-Date:	14 Dec 2003 22:09:56 EST

David F asked:
> I read a comment somewhere about Tomita parsing being ineffective
> (ie. slow) for large languages, but Andi Kleen mentioned in passing
> (in message "Re: Precedence based parsing") that new versions of
> Bison use this algorithm ... can someone please clarify this ?

It isn't large languages per se that is the problem. The problem is
the amount of ambiguity and the size of the inputs that the ambiguity
can be expected to persist over. Highly ambiguous grammars are
problematic, because the parse stops being linear in the size of the
input, approaching either n**2 or n**3 and for non-trivial values of n
that grows too fast for practical parsing. However, most grammars
aren't ambiguous in that manner (over the enitre input, but merely
ambiguous over limited fragments), so that the growth is not an issue.

As far as I know, new versions of Bison have the *option* of using
this algorithm. I believe you can still use Bison to generate
traditional LALR(1) parsers. I would be surprised to hear otherwise.

Hope this helps,
-Chris

*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
19 Bronte Way #33M voice : (508) 435-5016
Marlboro, MA 01752 USA fax : (508) 251-2347 (24 hours)

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Tomita parsing

Chris F Clark <cfc@world.std.com>14 Dec 2003 22:09:57 -0500

Chris F Clark <cfc@world.std.com>
14 Dec 2003 22:09:57 -0500