Re: regular expression operators in CF grammars

"Chris F Clark" <cfc@world.std.com>
2 Jul 2002 01:13:01 -0400

From comp.compilers

Related articles
[5 earlier articles]
Re: regular expression operators in CF grammars cfc@world.std.com (Chris F Clark) (2002-06-17)
Re: regular expression operators in CF grammars kgw-news@stiscan.com (2002-06-20)
Re: regular expression operators in CF grammars kgw-news@stiscan.com (2002-06-28)
Re: regular expression operators in CF grammars parsersinc@yahoo.com (SLK Parsers) (2002-06-28)
Re: regular expression operators in CF grammars parsersinc@yahoo.com (SLK Parsers) (2002-06-28)
Re: regular expression operators in CF grammars soenke.kannapinn@wincor-nixdorf.com (=?Windows-1252?Q?S=F6nke_Kannapinn?=) (2002-06-28)
*Re: regular expression operators in CF grammars cfc@world.std.com (Chris F Clark)* (2002-07-02)**
Re: regular expression operators in CF grammars cfc@world.std.com (Chris F Clark) (2002-07-02)
Re: regular expression operators in CF grammars joachim_d@gmx.de (Joachim Durchholz) (2002-07-02)
Re: regular expression operators in CF grammars soenke.kannapinn@wincor-nixdorf.com (=?Windows-1252?Q?S=F6nke_Kannapinn?=) (2002-07-04)
Re: regular expression operators in CF grammars parsersinc@yahoo.com (SLK Parsers) (2002-07-04)
Re: regular expression operators in CF grammars rboland@unb.ca (Ralph Boland) (2002-07-04)
Re: regular expression operators in CF grammars cfc@world.std.com (Chris F Clark) (2002-07-15)
[8 later articles]

| List of all articles for this month |

From:	"Chris F Clark" <cfc@world.std.com>
Newsgroups:	comp.compilers
Date:	2 Jul 2002 01:13:01 -0400
Organization:	Compilers Central
References:	02-05-142 02-05-151 02-06-024 02-06-056 02-06-090
Keywords:	parse
Posted-Date:	02 Jul 2002 01:13:01 EDT

Sönke Kannapinn writes:
> Nearly all contributions to ELR parser construction ... have a
> concept of ELR(k)-derivation step in which a lhs nonterminal is
> replaced by some expansion of the corresponding rhs (i.e., a
> sentence described by the rhs finite automaton or regular
> expression) IGNORING how that rhs expansion had been achieved. In
> other words: all these authors abstract from the structure of rhs
> expansions.

Yes, in other words, a (a*) = a+ = (a*) a = (a a?) a* = ....
Unfortunately, this is consistent with the standard treatment of
regular expressions. In fact, one might consider it to be an
essential part of the theory of regular expressions.

> For ambiguous rhs regular expressions or finite automata this
> requires some sort of subset construction to remove ambiguity.

And, I believe one could argue that the LR item construction algorithm
does this subset construction if one *properly* ignores the ambiguity.
I will agree that there are some subtle points in doing this
correctly.

> Note that subset constructions to remove unambiguity imply, of course,
> that a derivation using the transformed grammar cannot be uniquely
> interpreted in terms of the structures of the original grammar,
> especially in terms of their (ambiguous) rhss.

If you mean one cannot trivially reconstruct where the different
regular expression operators apply in an expression that uses the
ambiguously (e.g. a* a*, which a's go with which *), I will concur.

There are several possibilities for dealing with that problem. One of
the simplest, is not attempting to distinguish where the constituents
come from, after all the constituents do *not* represent
non-terminals, but mearly fragments of the rhs definition of a
non-terminal. As I said, it is possible to consider the erasure of
such boundaries a fundamental part of regular expressions.

More importantly, the erasure of the boundaries is important for
regular expressions to have the ability to decrease look-ahead
requirements. If the parser generator must preserve the boundaries
where the various regular expression operators apply (preserving the
structure of the rhs), then the parser generator needs to be able to
distunguish those boundary points, which means the parser generator
must do some form of "reduction" or similar demarcation at those
boundary points. Thus, a structure preserving ELR parser generator is
no more powerful than a LR parser generator where the regular
expressions have been turned into hidden non-terminals, because it
must do reductions at the same points.

> An ELR(k) parser theory (such as that of Madsen and Kristensen (1976))
> that uses that transformational approach does NOT abstract from rhs
> structures and is able to reconstruct the structure of a complex
> handle in terms of the corresponding structural rhs ingredients. I
> strongly feel that THIS is what ELR parsers should accomplish.

You are certainly entitled to that opinion and it is not without
foundation. Provable correctness is an important criterion, as is
having a well-defined and well-understood model. I am happy to see
that you have pursued your work. I will attempt to read your
dissertation. However, I will admit that my German is not that strong
and the last dissertation I attempted to read in German was too subtle
for my meager translation abilities, so I grapsed only a few of its
points. (Shame too, the author answered some questions that I truly
wanted to understand.)

Best regards,
-Chris

*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
3 Proctor Street voice : (508) 435-5016
Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)
------------------------------------------------------------------------------

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: regular expression operators in CF grammars

"Chris F Clark" <cfc@world.std.com>2 Jul 2002 01:13:01 -0400

"Chris F Clark" <cfc@world.std.com>
2 Jul 2002 01:13:01 -0400