Re: LR-regular parsers for dummies ?

Chris F Clark <world!>
31 Oct 1999 01:19:14 -0400

          From comp.compilers

Related articles
LR-regular parsers for dummies ? (1999-10-28)
Re: LR-regular parsers for dummies ? (Ziemowit Laski) (1999-10-29)
Re: LR-regular parsers for dummies ? (Chris Dodd) (1999-10-29)
Re: LR-regular parsers for dummies ? (1999-10-31)
Re: LR-regular parsers for dummies ? world! (Chris F Clark) (1999-10-31)
Re: LR-regular parsers for dummies ? world! (Chris F Clark) (1999-10-31)
| List of all articles for this month |

From: Chris F Clark <world!>
Newsgroups: comp.compilers,comp.theory
Date: 31 Oct 1999 01:19:14 -0400
Organization: The World Public Access UNIX, Brookline, MA
Distribution: inet
References: 99-10-142 99-10-160 99-10-171
Keywords: parse, LR(1)

Dirk van Deun wrote:
D> I have started reading a 1971 paper ``LR-regular grammars -- an
D> Extension of LR(k) Grammars'', not once but several times.
D> Essentially, I drown in it . . .
D> Does anyone remember this concept ?

The consensus is as Ziemowit Laski <> wrote:

Z> You start out with the typical LR(k) grammars amenable to
Z> directional parsing; then, for lookahead, instead of allowing only k
Z> tokens, you allow arbitrary *regular expressions* of tokens, and
Z> hence unbounded lookahead. These regular expressions are handled by
Z> a separate finite state automaton.

Chris Dodd <> added:

C> The method works by constructing an RE for each of the conflicting
C> actions, and then building a DFA that recognizes them
C> simultanueously; whichever RE matches is the action to take.

In a follow-up email, the question was further continued:

D> One obvious question is: I assume that a parser generator using this
D> technique would have to generate these regular expressions itself.
D> If so, how would it do that ? (If the answer is in the paper, I do
D> not recognize it.) If not, isn't that quite impractical for the
D> programmer ?

I just re-skimmed the paper as printed in "Journal of Computer and
System Sciences 7 (1971)", it's on pages 66-96.

Section 1 introduces the regular expressions (and describes a right
partition), which is a set of disjoint regular expressions that can be
used for the look-ahead.

Section 5 attempts to explain one method to construct the regular
expressions. It works by examining each rule (production p = A:
gamma;) and defines languages Lyes(p) and Lno(p), where Lyes(p) is the
language where p is used in the derivation and Lno(p) is the language
where p is not used in the derivation. The languages (which are CFLs)
induce regular expressions (Xyes(p) and Xno(p)) which are the tokens
at which the lookahead must be distinguished. That in turn creates
two new CFLs Kyes(p) and Kno(p), which are the languages where the
distinguishing tokens have appeared or not. The question then becomes
finding regular lookahead sets for the Kyes(p) and Kno(p) languages (which
tell you whether to reduce with p or not).

Section 6 says: "whether or not there exists a regular set Mp
separating then (Kyes(p) and Kno(p)) . . . appears to be undecidable
[and was proven so by Ogden as mentioned in a footnote]. . . .
However, we can develop some techinques for obtaining [them]."

I take this to mean that either the user has to supply the separating
regular expressions or that the generator will apply some hueristics
to find the separating regular expressions, but if it does not find
them, that does not mean they don't exist.

I would be curious if anyone who has access to Visual Parse++ (which
at one point claimed to be an LR-regular parser generator) can
determine what it does? [Note, this is a separate question from the
"Natural Language" parsing capability of that tool, which appears to
be GLR (Tomita/Lang) parsing.]

Hope this helps,

Chris Clark Internet :
Compiler Resources, Inc. CompuServe : 74252,1375
3 Proctor Street voice : (508) 435-5016
Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)
Web Site in Progress: Web Site :

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.