Related articles |
---|
Parsing implicit operators with recursive descent? johann@myrkraverk.com (Johann 'Myrkraverk' Oskarsson) (2009-02-06) |
Re: Parsing implicit operators with recursive descent? mailbox@dmitry-kazakov.de (Dmitry A. Kazakov) (2009-02-07) |
Re: Parsing implicit operators with recursive descent? armelasselin@hotmail.com (Armel) (2009-02-07) |
Re: Parsing implicit operators with recursive descent? torbenm@pc-003.diku.dk (2009-02-09) |
Re: Parsing implicit operators with recursive descent? barry.j.kelly@gmail.com (Barry Kelly) (2009-02-12) |
Re: Parsing implicit operators with recursive descent? johann@myrkraverk.com (Johann 'Myrkraverk' Oskarsson) (2010-01-30) |
Re: Parsing implicit operators with recursive descent? kkylheku@gmail.com (Kaz Kylheku) (2010-02-01) |
From: | "Johann 'Myrkraverk' Oskarsson" <johann@myrkraverk.com> |
Newsgroups: | comp.compilers |
Date: | Sat, 30 Jan 2010 17:19:27 +0000 |
Organization: | Compilers Central |
References: | 09-02-013 09-02-022 |
Keywords: | parse |
Posted-Date: | 31 Jan 2010 01:13:29 EST |
A year ago, torbenm@pc-003.diku.dk (Torben Fgidius Mogensen) wrote:
> "Johann 'Myrkraverk' Oskarsson" <johann@myrkraverk.com> writes:
>
>> Is it possible to parse implicit operators, like the regular
>> expression concatenation operator, with a recursive descent parser?
>
> Yes, you can do it mostly by standard grammar-rewriting techniques.
>
> Let us take regular expressions as an example. We start with the
> grammar:
>
>
> R -> R | R
> R -> R R
> R -> R *
> R -> a
>
> Note that the | above is a terminal. a stands for any alphabet
> symbol.
>
> We start by eliminating ambiguity. We choose (as normal) | to bind
> weaker than concatenation which binds weaker than Kleene-star. We
> also choose | and concatenation to be right-associative. They are
> semantically fully associative, so we could choose any. But we get
> less left-recursion to eliminate later by choosing them to be
> right-assciative.
>
> R -> C | R
> R -> C
> C -> S C
> C -> S
> S -> S *
> S -> a
>
> We now do left factorisation and left-recursion elimination:
>
> R -> C R'
> R' -> | R
> R' ->
> C -> S C'
> C' -> C
> C' ->
> S -> a S'
> S' -> * S'
> S' ->
>
> To see if this is LL(1), we need to calculate FIRST and FOLLOW.
> Below, I have for each production added the FIRST set:
>
>
> R -> C R' {a}
> R' -> | R {|}
> R' -> {}
> C -> S C' {a}
> C' -> C {a}
> C' -> {}
> S -> a S' {a}
> S' -> * S' {*}
> S' -> {}
>
> The FOLLOW sets are:
>
> FOLLOW(R) = {$}
> FOLLOW(R') = {$}
> FOLLOW(C) = {$,|}
> FOLLOW(C') = {$,|}
> FOLLOW(S) = {$,|,a}
> FOLLOW(S') = {$,|,a}
>
> This leads to the following parse table:
>
> | a * | $
> --+------------------------------
> R | R->CR'
> R'| R'->|R R'->
> C | C->SC'
> C'| C'->C C'-> C'->
> S | S->aS'
> S'| S'-> S'->*S' S'-> S'->
>
> which has no conflicts, so the grammar is LL(1).
>
> The parse table can easily be rewritten to a recursive descent
> parser.
It has been a year since I was working on this toy project, and now
that I'm digging it up again, I realize I don't understand the parse
table above.
In particular, what does "S'-> " mean? What does it mean when the
parse table entry is empty? Like when R' hits "a".
Perhaps it's been too long since I was reading the dragon book. I
have the 1988 version, according to the copyright section.
Johann -- in hope for clarifications.
P.S. And thanks everyone who responded to my original post, though you
may not remember.
Return to the
comp.compilers page.
Search the
comp.compilers archives again.