Re: left- or right-recursion in LALR-grammars?

Markus Mottl <mottl@miss.wu-wien.ac.at>
4 Mar 1999 12:16:30 -0500

          From comp.compilers

Related articles
left- or right-recursion in LALR-grammars? mottl@miss.wu-wien.ac.at (Markus Mottl) (1999-02-26)
Re: left- or right-recursion in LALR-grammars? torbenm@diku.dk (Torben Mogensen) (1999-03-02)
Re: left- or right-recursion in LALR-grammars? mottl@miss.wu-wien.ac.at (Markus Mottl) (1999-03-04)
| List of all articles for this month |
From: Markus Mottl <mottl@miss.wu-wien.ac.at>
Newsgroups: comp.compilers
Date: 4 Mar 1999 12:16:30 -0500
Organization: University of Economics and Business Administration, Vienna, Austria
References: 99-02-127 99-03-004
Keywords: LALR, parse

Torben Mogensen <torbenm@diku.dk> wrote:
> Markus Mottl wrote:
> My choice of left or right recursion depends on two factors:


> 1) (like John) whatever makes it easier to do attribution,
> e.g. building a list of statements (right recursion).
> If you work in a functional language, it is, however, fairly easy
> to get around this using higher-order functions for attributes.


Since I use OCAML, I apply a similar technique for attribution in a
functional framework:


In order to have reentrant compilers I leave all impure stuff outside
the scanner/parser-specification (ocamllex & ocamlyacc). For maintaining
the semantics (which might include impure properties - very often the
case) I use objects. To manipulate the state of the objects I generate
streams of curried functions that take as argument an object. I have
found out that this is *very* efficient, *very* convenient (readable)
and very easy to maintain/extend!


E.g. ((incomplete) parser spec. I leave out the lexer - unimportant for
            demonstration).


    main
        : A_TOKEN { [< 'add_token $1 >] }
        | main A_LAST_TOKEN { [< $1; 'clean_up >] }


and the semantics could look something like this:


    class transformer = object (self)
        method apply_strm = Stream.iter (fun f -> f self)
    end


    class semantic_object = object
        inherit transformer
        method add_token x = (* does something with token *)
        method clean_up = (* does some cleaning-up *)
    end


    let add_token x obj = obj#add_token x
    let clean_up obj = obj#add_token x


Finally, processing is started with something like:


    let main () =
        let lexbuf = Lexing.from_channel stdin in
        let result = Parser.main Scanner.start lexbuf in
        let sem_obj = new semantic_object in sem_obj#apply_strm result


    let _ = Printexc.catch main ()


As one can see, the use of streams of curried functions is quite
elegant (syntax and semantics are cleanly parted) and readable - and
even of astonishing efficiency: in my tests much faster than algebraic
data types, because there is no need for explicit pattern matching.


The extra effort for the programmer (writing a function for each method
in the class that controls semantics) is neglectible. In the case of
algebraic data types he would have to write "match"-cases, anyway,
which is definitely not much less work (more?).


The "transformer"-class can easily be extended with methods that use other
data types for input (e.g. lists instead of streams (which are actually
lazy lists)). But streams are without doubt the best means here - you
can put them together (e.g. via concatenation) without losing efficiency,
whereas lists are not so flexible in this respect.


I have come across this trick only a short time ago, but writing parsers
is really fun now!


Best regards,
Markus
--
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.