Re: Parsing implicit operators with recursive descent?

torbenm@pc-003.diku.dk (Torben =?iso-8859-1?Q?=C6gidius?= Mogensen)
Mon, 09 Feb 2009 12:26:02 +0100

From comp.compilers

Related articles
Parsing implicit operators with recursive descent? johann@myrkraverk.com (Johann 'Myrkraverk' Oskarsson) (2009-02-06)
Re: Parsing implicit operators with recursive descent? mailbox@dmitry-kazakov.de (Dmitry A. Kazakov) (2009-02-07)
Re: Parsing implicit operators with recursive descent? armelasselin@hotmail.com (Armel) (2009-02-07)
*Re: Parsing implicit operators with recursive descent? torbenm@pc-003.diku.dk* (2009-02-09)**
Re: Parsing implicit operators with recursive descent? barry.j.kelly@gmail.com (Barry Kelly) (2009-02-12)
Re: Parsing implicit operators with recursive descent? johann@myrkraverk.com (Johann 'Myrkraverk' Oskarsson) (2010-01-30)
Re: Parsing implicit operators with recursive descent? kkylheku@gmail.com (Kaz Kylheku) (2010-02-01)

| List of all articles for this month |

From:	torbenm@pc-003.diku.dk (Torben =?iso-8859-1?Q?=C6gidius?= Mogensen)
Newsgroups:	comp.compilers
Date:	Mon, 09 Feb 2009 12:26:02 +0100
Organization:	Department of Computer Science, University of Copenhagen
References:	09-02-013
Keywords:	parse, LL(1)
Posted-Date:	10 Feb 2009 09:46:58 EST

"Johann 'Myrkraverk' Oskarsson" <johann@myrkraverk.com> writes:

> Is it possible to parse implicit operators, like the regular
> expression concatenation operator, with a recursive descent parser?

Yes, you can do it mostly by standard grammar-rewriting techniques.

Let us take regular expressions as an example. We start with the
grammar:

    R -> R | R
    R -> R R
    R -> R *
    R -> a

Note that the | above is a terminal. a stands for any alphabet symbol.

We start by eliminating ambiguity. We choose (as normal) | to bind
weaker than concatenation which binds weaker than Kleene-star. We
also choose | and concatenation to be right-associative. They are
semantically fully associative, so we could choose any. But we get
less left-recursion to eliminate later by choosing them to be
right-assciative.

    R -> C | R
    R -> C
    C -> S C
    C -> S
    S -> S *
    S -> a

We now do left factorisation and left-recursion elimination:

    R -> C R'
    R' -> | R
    R' ->
    C -> S C'
    C' -> C
    C' ->
    S -> a S'
    S' -> * S'
    S' ->

To see if this is LL(1), we need to calculate FIRST and FOLLOW.
Below, I have for each production added the FIRST set:

    R -> C R' {a}
    R' -> | R {|}
    R' -> {}
    C -> S C' {a}
    C' -> C {a}
    C' -> {}
    S -> a S' {a}
    S' -> * S' {*}
    S' -> {}

The FOLLOW sets are:

  FOLLOW(R) = {$}
  FOLLOW(R') = {$}
  FOLLOW(C) = {$,|}
  FOLLOW(C') = {$,|}
  FOLLOW(S) = {$,|,a}
  FOLLOW(S') = {$,|,a}

This leads to the following parse table:

      | a * | $
  --+------------------------------
  R | R->CR'
  R'| R'->|R R'->
  C | C->SC'
  C'| C'->C C'-> C'->
  S | S->aS'
  S'| S'-> S'->*S' S'-> S'->

which has no conflicts, so the grammar is LL(1).

The parse table can easily be rewritten to a recursive descent parser.

Torben

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Parsing implicit operators with recursive descent?

torbenm@pc-003.diku.dk (Torben =?iso-8859-1?Q?=C6gidius?= Mogensen)Mon, 09 Feb 2009 12:26:02 +0100

torbenm@pc-003.diku.dk (Torben =?iso-8859-1?Q?=C6gidius?= Mogensen)
Mon, 09 Feb 2009 12:26:02 +0100