28 Jun 2002 18:39:19 -0400

Related articles |
---|

[4 earlier articles] |

Re: regular expression operators in CF grammars rboland@unb.ca (Ralph Boland) (2002-06-17) |

Re: regular expression operators in CF grammars cfc@world.std.com (Chris F Clark) (2002-06-17) |

Re: regular expression operators in CF grammars kgw-news@stiscan.com (2002-06-20) |

Re: regular expression operators in CF grammars kgw-news@stiscan.com (2002-06-28) |

Re: regular expression operators in CF grammars parsersinc@yahoo.com (SLK Parsers) (2002-06-28) |

Re: regular expression operators in CF grammars parsersinc@yahoo.com (SLK Parsers) (2002-06-28) |

Re: regular expression operators in CF grammars soenke.kannapinn@wincor-nixdorf.com (=?Windows-1252?Q?S=F6nke_Kannapinn?=) (2002-06-28) |

Re: regular expression operators in CF grammars cfc@world.std.com (Chris F Clark) (2002-07-02) |

Re: regular expression operators in CF grammars cfc@world.std.com (Chris F Clark) (2002-07-02) |

Re: regular expression operators in CF grammars joachim_d@gmx.de (Joachim Durchholz) (2002-07-02) |

Re: regular expression operators in CF grammars soenke.kannapinn@wincor-nixdorf.com (=?Windows-1252?Q?S=F6nke_Kannapinn?=) (2002-07-04) |

Re: regular expression operators in CF grammars parsersinc@yahoo.com (SLK Parsers) (2002-07-04) |

Re: regular expression operators in CF grammars rboland@unb.ca (Ralph Boland) (2002-07-04) |

[9 later articles] |

From: | "=?Windows-1252?Q?S=F6nke_Kannapinn?=" <soenke.kannapinn@wincor-nixdorf.com> |

Newsgroups: | comp.compilers |

Date: | 28 Jun 2002 18:39:19 -0400 |

Organization: | Siemens Inc. |

References: | 02-05-142 02-05-151 02-06-024 02-06-056 |

Keywords: | parse, bibliography |

Posted-Date: | 28 Jun 2002 18:39:19 EDT |

Chris F Clark writes:

*> That being said, there are some fairly straight-forward algorithms for*

*> implementing EBNF translations directly as part of the parser*

*> generation.*

*> (...)*

*> It "is" more difficult to do so correctly in the LR case, than the LL*

*> case because of the way reductions are handled in LR. However, there*

*> is even a complete set of published algorithms that cover implementing*

*> EBNF in an LR parser written by Nakata and Sassa--not that those*

*> articles are easy to track down nor easy to read.*

Unfortunately, closer analysis of the publications that deal with

extended context-free grammars (ecfgs) or regular right part grammars

(rrpgs) reveals some severe errors, and in my humble opinion, most of

these publications are more or less disappointing. (The only solid

contributions are from Madsen and Kristensen (1976) and Heilbrunner

(1981).) A common problem is a sloppy treatment of theory; and without

a solid theoretical foundation, parser correctness cannot be proved.

Nearly all contributions to ELR parser construction -- namely LaLonde

(1977, 1979), Purdom and Brown (1971), Chapman (1987), Nakata and

Sassa (1984), Shin and Choe (1993), Fortes Gálvez (1994), and Lee and

Kim (1997) -- have a concept of ELR(k)-derivation step in which a lhs

nonterminal is replaced by some expansion of the corresponding rhs

(i.e., a sentence described by the rhs finite automaton or regular

expression) IGNORING how that rhs expansion had been achieved. In

other words: all these authors abstract from the structure of rhs

expansions.

Not only do I personally believe that this abstraction is inadequate

for the purpose of compiler construction. The above mentioned authors

also do not treat that abstraction correctly in their parser

constructions in the light of the results of Heilbrunner (1979). Let

me explain this point briefly.

Instead of using regular expressions or finite automata, Heilbrunner

(1979) uses unambiguous right-linear subgrammars to describe the rhs

expansions so that classical LR theory for cfgs can be applied. He

proves that whenever SOME ecfg with unambiguous right-linear

subgrammars is LR(k) ANY equivalent such ecfg will also be LR(k).

Because of that, construction of provably correct (LA)LR(k) parsers

can be done by simply replacing rhs regular expressions or finite

automata by equivalent unambigous right-linear subgrammars. For

ambiguous rhs regular expressions or finite automata this requires

some sort of subset construction to remove ambiguity. In contrast, the

above mentioned authors all rely on LR(k) item set definitions that

use the unmodified original grammar rules -- and they run into

problems (except Purdom and Brown (1981) and Nakata and Sassa (1986)

who -- unnecessarily -- restrict their approach to deterministic rhs

finite automata; and they have other problems).

Note that subset constructions to remove unambiguity imply, of course,

that a derivation using the transformed grammar cannot be uniquely

interpreted in terms of the structures of the original grammar,

especially in terms of their (ambiguous) rhss.

I personally prefer to call an rrpg ELR(k) iff its straightforward

cfg-transcription using (unambiguous) right-linear subgrammars is

LR(k). And I call an ecfg ELR(k) iff the cfg with (unambiguous)

right-linear subgrammars achieved by transforming the rhs regular

expressions into finite automata with epsilon-moves and then into

right-linear subgrammars is LR(k). (This requires rhs regular

expressions to be "strongly unambiguous" (Brüggemann-Klein 1993).) An

ELR(k) parser theory (such as that of Madsen and Kristensen (1976))

that uses that transformational approach does NOT abstract from rhs

structures and is able to reconstruct the structure of a complex

handle in terms of the corresponding structural rhs ingredients. I

strongly feel that THIS is what ELR parsers should accomplish.

You especially mentioned the approach of Nakata and Sassa (1986) to

ELALR(k) parsing. Their article in Acta Informatica does not even

define a derivation relation; it does not explain where look- aheads

come from; their characterization of kernel items is not

well-defined. And no attempt is made there to prove missing important

basic results they rely on (e.g.: correctness of the item set

construction for the new type of "item"!).

Regards,

Sönke Kannapinn.

Brüggemann-Klein, A. (1993): Regular expressions into finite

automata. TCS 120, pp. 197--213.

Chapman, N.P. (1987): LALR(1,1) parser generation for regular right

part grammars. Acta Inf. 21, pp. 29--45.

Fortes Gálvez, J. (1994): A note on a proposed LALR parser for

extended context-free grammars. Inf. Proc. Letters 50,

pp. 303--305.

Heilbrunner S. (1979): On the definition of ELR(k) and ELL(k)

grammars. Acta Inf. 11, pp. 169--176.

LaLonde, W.R. (1977): Regular right part grammars and their parsers.

CACM 20(10), pp. 731--741.

LaLonde, W.R. (1979): Constructing LR parsers for regular right part

grammars. Acta Inf. 11, pp. 173--193.

Lee, G.-O. and D.-H. Kim (1997): Characterization of extended LR(k)

grammars. Inf. Proc. Letters 64, pp. 75--82.

Madsen, O.L. and B.B. Kristensen (1976): LR-parsing of extended

context free grammars. Acta Inf. 7, pp. 61--73.

Nakata, I. and M. Sassa (1986): Generation of efficient LALR(1)

parsers for regular right part grammars. Acta Inf. 23,

pp. 149--162.

Purdom, P.W. and C.A. Brown (1981): Parsing extended LR(k) grammars.

Acta Inf. 15, pp. 115--127.

Shin, H.-C. and K.-M. Choe (1993): An improved LALR(k) parser

generation for regular right part grammars. Inf. Proc. Letters

47, pp. 123--129.

For a thorough analysis of ELR parsing see my doctoral dissertation

http://edocs.tu-berlin.de/diss/2001/kannapinn_soenke.htm

(in German)

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.