Related articles |
---|
Parsing implicit operators with recursive descent? johann@myrkraverk.com (Johann 'Myrkraverk' Oskarsson) (2009-02-06) |
Re: Parsing implicit operators with recursive descent? mailbox@dmitry-kazakov.de (Dmitry A. Kazakov) (2009-02-07) |
Re: Parsing implicit operators with recursive descent? armelasselin@hotmail.com (Armel) (2009-02-07) |
Re: Parsing implicit operators with recursive descent? torbenm@pc-003.diku.dk (2009-02-09) |
Re: Parsing implicit operators with recursive descent? barry.j.kelly@gmail.com (Barry Kelly) (2009-02-12) |
Re: Parsing implicit operators with recursive descent? johann@myrkraverk.com (Johann 'Myrkraverk' Oskarsson) (2010-01-30) |
Re: Parsing implicit operators with recursive descent? kkylheku@gmail.com (Kaz Kylheku) (2010-02-01) |
From: | torbenm@pc-003.diku.dk (Torben =?iso-8859-1?Q?=C6gidius?= Mogensen) |
Newsgroups: | comp.compilers |
Date: | Mon, 09 Feb 2009 12:26:02 +0100 |
Organization: | Department of Computer Science, University of Copenhagen |
References: | 09-02-013 |
Keywords: | parse, LL(1) |
Posted-Date: | 10 Feb 2009 09:46:58 EST |
"Johann 'Myrkraverk' Oskarsson" <johann@myrkraverk.com> writes:
> Is it possible to parse implicit operators, like the regular
> expression concatenation operator, with a recursive descent parser?
Yes, you can do it mostly by standard grammar-rewriting techniques.
Let us take regular expressions as an example. We start with the
grammar:
R -> R | R
R -> R R
R -> R *
R -> a
Note that the | above is a terminal. a stands for any alphabet symbol.
We start by eliminating ambiguity. We choose (as normal) | to bind
weaker than concatenation which binds weaker than Kleene-star. We
also choose | and concatenation to be right-associative. They are
semantically fully associative, so we could choose any. But we get
less left-recursion to eliminate later by choosing them to be
right-assciative.
R -> C | R
R -> C
C -> S C
C -> S
S -> S *
S -> a
We now do left factorisation and left-recursion elimination:
R -> C R'
R' -> | R
R' ->
C -> S C'
C' -> C
C' ->
S -> a S'
S' -> * S'
S' ->
To see if this is LL(1), we need to calculate FIRST and FOLLOW.
Below, I have for each production added the FIRST set:
R -> C R' {a}
R' -> | R {|}
R' -> {}
C -> S C' {a}
C' -> C {a}
C' -> {}
S -> a S' {a}
S' -> * S' {*}
S' -> {}
The FOLLOW sets are:
FOLLOW(R) = {$}
FOLLOW(R') = {$}
FOLLOW(C) = {$,|}
FOLLOW(C') = {$,|}
FOLLOW(S) = {$,|,a}
FOLLOW(S') = {$,|,a}
This leads to the following parse table:
| a * | $
--+------------------------------
R | R->CR'
R'| R'->|R R'->
C | C->SC'
C'| C'->C C'-> C'->
S | S->aS'
S'| S'-> S'->*S' S'-> S'->
which has no conflicts, so the grammar is LL(1).
The parse table can easily be rewritten to a recursive descent parser.
Torben
Return to the
comp.compilers page.
Search the
comp.compilers archives again.