Re: coupling LALR with a scanner?

Paul B Mann <paul@paulbmann.com>
Tue, 13 Sep 2011 13:38:48 -0700 (PDT)

          From comp.compilers

Related articles
coupling LALR with a scanner? armelasselin@hotmail.com (Armel) (2011-07-05)
Re: coupling LALR with a scanner? armelasselin@hotmail.com (Armel) (2011-07-07)
coupling LALR with a scanner? uu3kw29sb7@snkmail.com (Karsten Nyblad) (2011-07-07)
Re: coupling LALR with a scanner? uu3kw29sb7@snkmail.com (Karsten Nyblad) (2011-07-08)
Re: coupling LALR with a scanner? armelasselin@hotmail.com (Armel) (2011-08-04)
Re: coupling LALR with a scanner? paul@paulbmann.com (Paul B Mann) (2011-09-13)
Re: coupling LALR with a scanner? armelasselin@hotmail.com (Armel) (2011-09-16)
Re: coupling LALR with a scanner? paul@paulbmann.com (Paul B Mann) (2011-09-17)
Re: coupling LALR with a scanner? armelasselin@hotmail.com (Armel) (2011-09-19)
Re: coupling LALR with a scanner? paul@paulbmann.com (Paul B Mann) (2011-09-19)
Re: coupling LALR with a scanner? armelasselin@hotmail.com (Armel) (2011-09-20)
Re: coupling LALR with a scanner? cdodd@acm.org (Chris Dodd) (2011-09-23)
[4 later articles]
| List of all articles for this month |

From: Paul B Mann <paul@paulbmann.com>
Newsgroups: comp.compilers
Date: Tue, 13 Sep 2011 13:38:48 -0700 (PDT)
Organization: Compilers Central
References: 11-07-013 11-07-015 11-07-018 11-08-004
Keywords: LALR, parse
Posted-Date: 16 Sep 2011 00:51:21 EDT

/* Test1 grammar (like the one above). */


      G -> A <eof>
      A -> a B c
            -> a B d
      B -> b


Test1 grammar is LR(0) and requires NO lookahead symbols.
By default, b always reduces to B.


/* Test2 grammar. */


      G -> A <eof>
      A -> a B1 c
            -> a B2 d
      B1 -> b
      B2 -> b


Test2 grammar is LALR(1) and probably SLR(1) also. The parser
requires one symbol of lookahead to know which reduction to make (B1
or B2).


/* Test3 grammar. */


      G -> A <eof>
      A -> a B1 c
            -> a B2 d
            -> a B1 d
            -> a B2 c
      B1 -> b
      B2 -> b


Test3 grammar is NOT LALR(1), but it is LR(1). The parser requires
one symbol of lookahead to know which reduction to make (B1 or B2).


Test3 is a highly contrived situation, not usually seen in real-world
computer languages (as far as I know). Test3 could be written as
Test2.


It seems that LALR(1) parsers are more powerful than some people
realize. LR(1) parser tables can be huge. A COBOL-85 grammar created
an LR(1) parser with over 2,000,000 states, whereas a IELR(1) parser
for that grammar created about 1668 states. Bison can create IELR(1)
parsers. HYACC can also create IELR(1) parsers.


I don't understand why you want to have a different scanner for each
state. The parser can easily make the decision, whether a token is
valid. In fact, an LALR parser already has this information in the
parser tables, so why make a simple situation complicated?


Also in case of a syntax error, the parser can tell the user what
tokens were expected at the error point. So there is no need for
different lexers, unless the syntax of the tokens changes from one
state to the next and this could be confusing to the the user of your
language.


Paul B Mann


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.