Re: LALR parsing

Torben AEgidius Mogensen <torbenm@diku.dk>Mon, 07 Dec 2009 11:43:20 +0100

From comp.compilers

Related articles
LALR parsing alinsoar@voila.fr (A. Soare) (2009-12-04)
Re: LALR parsing torbenm@diku.dk (Torben AEgidius Mogensen) (2009-12-07)
Re: LALR parsing alinsoar@voila.fr (A. Soare) (2009-12-09)
Re: LALR parsing Danny.Dube@ift.ulaval.ca (2009-12-09)
Re: LALR parsing torbenm@diku.dk (2009-12-10)
Re: LALR parsing rsc@swtch.com (Russ Cox) (2009-12-11)
Re: LALR parsing torbenm@diku.dk (2009-12-14)
Re: LALR parsing ott@mirix.org (Matthias-Christian ott) (2009-12-14)
[1 later articles]
| List of all articles for this month |

 From: Torben AEgidius Mogensen Newsgroups: comp.compilers Date: Mon, 07 Dec 2009 11:43:20 +0100 Organization: Department of Computer Science, University of Copenhagen References: 09-12-007 Keywords: LALR Posted-Date: 08 Dec 2009 21:36:09 EST

"A. Soare" <alinsoar@voila.fr> writes:

> I want to implement a LALR parser for an arbitrary grammar (many
> common point to yacc, however different).
>
> However, I did read that these methods are not the most fast. The
> fastest methods use matrix multiplications, etc. In the book of Grune
> is it explained something about the equalence between parsing and
> matrix multiplication, but I see nothing useful apart of some vagues
> ideas.
>
> My question for you is: what are the most efficient methods of lalr parsing?

LALR(1) defined a class of grammars that can be parsed with a
deterministic LALR(1) parser. These can be parsed in something
approaching linear time. But LALR(1) can not deterministically parse
arbitrary grammars. You can add backtracking to allow parsing of
arbitrary grammars, but that can in the worst case give exponential
runtimes. GLR parsers (see http://en.wikipedia.org/wiki/GLR_parser)
handle the multiple parses in a breath-first manner instead of a
depth-first backtracking, so they can merge identical states. This
reduces the worst case to O(n^3).

While the matrix-multiplication methods you refer to have better
worst-case performance, GLR will outperform them in most cases. Matrix
multiplication can be done in O(n^2.376), but the algorithm for this has
faster except for extremely large matrices. So I doubt the O(n^2.376)
algorithm is of any practical use for parsing.

Note, however, that while it has been proven that the complexity of
parsing is bounded by the complexity of matrix multiplication, the
converse is not true: There may well be general parsers that are faster
than matrix multiplication. IIRC, the lower bound for CF parsing is
still O(n).

Torben

Post a followup to this message