Re: LL(1) vs LALR(1) parsers

mparks@oz.net (Michael Parkes)
12 Dec 1995 14:35:04 -0500

From comp.compilers

Related articles
[17 earlier articles]
Re: LL(1) vs LALR(1) parsers CSPT@giraffe.ru.ac.za (Pat Terry) (1995-11-30)
Re: LL(1) vs LALR(1) parsers gvcormac@plg.uwaterloo.ca (Gord Cormack) (1995-12-01)
Re: LL(1) vs LALR(1) parsers bridges@cs.arizona.edu (1995-12-01)
Re: LL(1) vs LALR(1) parsers mparks@oz.net (1995-12-09)
Re: LL(1) vs LALR(1) parsers maatwerk@euronet.nl (1995-12-09)
Re: LL(1) vs LALR(1) parsers sperber@informatik.uni-tuebingen.de (Michael Sperber [Mr. Preprocessor]) (1995-12-09)
*Re: LL(1) vs LALR(1) parsers mparks@oz.net* (1995-12-12)**
Re: LL(1) vs LALR(1) parsers solution@gate.net (1995-12-16)
Better tools than lex & yacc(was Re: LL(1) vs LALR(1) parsers) dkulkarn@aristotle.helios.nd.edu (1995-12-17)
Re: LL(1) vs LALR(1) parsers sb@metis.no (1995-12-17)
Re: LL(1) vs LALR(1) parsers scooter@mccabe.com (Scott Stanchfield) (1995-12-18)
Re: LL(1) vs LALR(1) parsers G.A.Tijssen@eco.RUG.NL (Gert A. Tijssen) (1995-12-19)

| List of all articles for this month |

From:	mparks@oz.net (Michael Parkes)
Newsgroups:	comp.compilers
Date:	12 Dec 1995 14:35:04 -0500
Organization:	Sense Networking Seattle (www.oz.net)
References:	95-11-051 95-11-234 95-12-062
Keywords:	parse, LALR, LL(1), errors

  maatwerk@euronet.nl says...
>Most errors are NOT detected in the parser, but afterwards. Errors
>detected by the parser are of the kind 'missing something" and in
>those cases there is no need for clever recovery.

This is the case if you just want to skip to the next statement. However, a
better compiler will try to skip the faulty part of the clause and continue.
This is very tricky. Why bother - because if the error is in a declaration
you will prevent lots of follow on errors.

> Therefore, in my
>opinion error recovery is no argument for selection of the
>right type of grammar.

If you had tried some of the latest tools you would understand why they are
related and why a well thought-out tool often requires one type of grammar or
another.

>For those in favor of handwritten parsers: I also like them
>because they are so easy to understand and fast. But first
>checking a grammer with yacc and then retyping it as code is
>double work. Wouldn't everyone be happy with a simple generator
>that checks the unattributed grammar and then generates a
>recursive-descent parser that you can fill in by hand? Does
>this exist?

Handwritten parsers a nice if you have lots of manpower and you enjoy messing
around. However, if you don't then some of the better tools are much better
and give a very nice result. This is because the tools can be bothered to
calculate what most people will not. Such as the path to the nearest recovery
point. The minimum change to the semantics to recover from an error. If you
add new clauses they will automatically rework the error recovery and make it
correct.

Additionally, lots of people here refer to 'lex' and 'yacc'. These tools were
developed around 1976 ! and are very old. Most modern tools support
intergrated lexical, syntax and semantic analysis. Some even support code
generation. Hence, even mentioning 'lex' and 'yacc' implies that better tools
have not been tried.

Example: Some time ago I wrote a small (but complete) Pascal front end in
about 3 man months using a tool. The code did not include a single line of
'C' or 'C++' but rather was a specification of the language. If I hadn't
messed around so much I could had probably done it in 2/3 of the time. I was
very happy with the result and best of all the maintainance was considerably
easier than an equivalent hand coded model. In fact while I am here I will
post a small fragment.

              /************************************************************/
              /************************************************************/
              /* */
              /* The declaration of Pascal labels. */
              /* */
              /* The declaration of labels in Pascal must be done at */
              /* the head of a block. */
              /* */
              /************************************************************/
              /************************************************************/

      Labels( INOUT env ) =
            [
                  [[ ERROR,None,500,MESSAGE(2000) ] /* Deals with errors. */
                        IN Label, /* Accepts the token 'LABEL'. */
                              (
                                    LabelList( INOUT env )
                              ),
                        IN SemiColon /* Accepts the token ';'. */
                  ]
            ];

      LabelList( INOUT env ) =
            [[ ERROR,None,50,MESSAGE(2010) ] /* If error skip current label. */
                    LabelValidate( INOUT env ),
                        {
                              IN Comma, /* Accepts the token ','. */
                                    (
                                          LabelValidate( INOUT env )
                                    )
                        }
            ];

      LabelValidate( INOUT env ) =
            [[ ERROR,1,None,MESSAGE(2020) ] /* Put new label in symbol */
                  LabelDuplicate /* table if can't complain. */
                        (
                              OUT [],
                              IN env.locallabels[UNION],
                              IN env.nonlocallabels[OVERRIDE]
                        )
            ];

      LabelDuplicate( OUT gotolabels,IN gotolabels,IN gotolabels ) =
            [[ ERROR,None,20,MESSAGE(2030) ] /* Moan about syntax of */
                  LabelName /* the label if needed. */
                        (
                              IN gotolabels
                        )
            ];

            /* Must allow jumping out of functions. */
            /* Hence must add level numbers to labels */
            /* to allow the stack to be reset. */

      LabelName( IN [ label -> { NoFlags,UNIQUENUMBER,label } ] ) =
            ( IN Integer( IN label ) ); /* Accept label token. */

The above code deals with Pascal labels in all respects except lexically.
However, the lexical stuff is simple such as:

Label = NOCASE "LABEL";
SemiColon = ";";

The tool that was used would report any error that was dectected and
automatically recover. Some examples being.

'file.pas', line 4, character 27: Error - Badly formed label.
Expected 'Integer' but found 'Identifier' ('v1').

or

'file.pas', line 4, character 27: Error - Duplicate label '6' (ignored).

All this is automatic no extra coding required. The new label symbols are
added to the symbol table for later processing. So the example is truely
complete.

This particular tool supports all the usual symbol table features such as
scoping and so on. Hence, what I am really trying to say is that it is not
quite 'lex' and 'yacc' is it. This is just one of the really great tools out
there. Go try them !.

Regards,

Mike
--

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: LL(1) vs LALR(1) parsers

mparks@oz.net (Michael Parkes)12 Dec 1995 14:35:04 -0500

mparks@oz.net (Michael Parkes)
12 Dec 1995 14:35:04 -0500