Extensible grammars

"John Max Skaller" <skaller@nospam.com.au>
13 Dec 2004 02:03:25 -0500

          From comp.compilers

Related articles
Extensible grammars skaller@nospam.com.au (John Max Skaller) (2004-12-13)
Re: Extensible grammars claus.reinke@talk21.com (Claus Reinke) (2004-12-16)
Re: Extensible grammars skaller@nospam.com.au (John Max Skaller) (2004-12-17)
Re: Extensible grammars skaller@nospam.com.au (John Max Skaller) (2004-12-17)
Re: Extensible grammars Ralf.Laemmel@cwi.nl (Ralf Laemmel) (2004-12-23)
| List of all articles for this month |

From: "John Max Skaller" <skaller@nospam.com.au>
Newsgroups: comp.compilers
Date: 13 Dec 2004 02:03:25 -0500
Organization: Compilers Central
Keywords: parse, design
Posted-Date: 13 Dec 2004 02:03:25 EST

I'm interested in some preliminary reading on extensible
grammars. I have just integrated Scott McPeak's Elkhound


http://www.cs.berkeley.edu/~smcpeak/elkhound/


into my Felix programming language


http://felix.sf.net


Elkhound is a GLR parser generator. A trivial but complete example is
given below.


In order to make *extensible* grammars, a common technique is to 'cut
and paste', or use some notation which allows you to 'add productions'
for a nonterminal.


Camlp4, for example, allows you to insert new productions into a base
grammar in a fairly flexible way.


Still this is basically not very nice: it's basically just cut and
paste.


I'm considering a different idea. To extend data types and functions,
there is a technique known as open recursion. With this technique,
recursion is eliminated from your specification, and replaced by a
parameter. The recursion is fixated by binding the parameterized
whole to the parameter which completes the recursion. EG in Ocaml
using polymorphic variants:


type 'a e' = [ `X of 'a] (* open *)
type e = 'b e' as 'b (* fixate *)


type 'a e2' = ['a e' | `Y of 'a] (* open *)
type e2 = e2 = 'b e as 'b (* fixate *)


This allows extension of the open form, followed by closure to make a
recursive type. We don't normally want to extend a closed recursion,
because our new term wouldn't participate.


I'm thinking of using this technique to extend grammars, since these
are typically heavily recursive, and we almost always want the
extensions to participate in recursion.


My thinking is that this is a more 'well principled' way to extend
grammars than textual modification. So the basic idea is to add a
concept of 'nonterminal variable' and allow grammars to be
parameterised, then provide a method of extension, and one for
fixation, as in the Ocaml polymorphic variant example above.


It's hard to believe I invented this idea, so:


Are there any papers on this? Any hints or comments?


-------------------------------------------------------
include "std";


// the input string
data := "1+2+3$";


// a type for tokens
union token_t =
    | TOK_EOF
    | TOK_PLUS
    | TOK_INT of int
;


// a token stream generator
var i = 0;
fun get_token():token_t =
{
    ch := data.[i to i+1];
    ++i;
    tok :=
        match ch with
        | "$" => TOK_EOF
        | "+" => TOK_PLUS
        | "1" => TOK_INT 1
        | "2" => TOK_INT 2
        | "3" => TOK_INT 3
        endmatch
    ;
    return tok;
}


// a type for expression terms
union expr_t =
    | Integer of int
;


// a grammar for expressions
nonterm expr : expr_t =
| xx:expr TOK_PLUS y:TOK_INT =>
    match xx with
    | Integer ?i => Integer (i+y)
    endmatch


| y:TOK_INT => Integer y
;


// a parser for our example
var z : 1 + int =
    parse (the get_token) with
    | e: expr => match e with | Integer ?i => i endmatch
    endmatch
;


// print the result
match z with
| case 1 => { print "Error"; }
| case 2 (?i) => { print i; }
endmatch;
endl;


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.