Re: RFC - VBS Grammar

"Joachim Durchholz" <joachim_d@gmx.de>
31 Dec 2000 03:01:25 -0500

          From comp.compilers

Related articles
RFC - VBS Grammar larlew@home.com (Larry Lewis) (2000-12-23)
Re: RFC - VBS Grammar joachim_d@gmx.de (Joachim Durchholz) (2000-12-31)
Re: RFC - VBS Grammar snicol@apk.net (Scott Nicol) (2000-12-31)
Re: RFC - VBS Grammar snicol@apk.net (Scott Nicol) (2001-01-04)
Re: RFC - VBS Grammar Arargh@Enteract.com (Arargh!) (2001-01-04)

From: "Joachim Durchholz" <joachim_d@gmx.de>
Newsgroups: comp.compilers
Date: 31 Dec 2000 03:01:25 -0500
Organization: Compilers Central
References: 00-12-104
Keywords: Basic, parse
Posted-Date: 31 Dec 2000 03:01:25 EST

Larry Lewis <larlew@home.com> wrote:
> I have attached a simple grammar (lex/yacc) for VB script. There are
> still a few things missing but I am wondering if I'm on the right
> track here.
>
> Does anyone care to review and comment?


Well, I once started to do something with an MS Basic (QuickBasic, if
anybody remembers), did a full analysis of the IF statement, and decided
to scrap the project ;)
Here is what little I can say on the topic from that experience:


1. Expression parsing
You have a specific rule for each precedence level. This is a valid
approach, but it will generate a much larger tree than you might expect:
an assignment "set a = 42" will look like this:
          =
         / \
      set   assignment_expression
                  |
            logical_imp_expression
                  |
            logical_eqv_expression
                  |
            logical_xor_expression
                  |
            logical_or_expression
                  |
            logical_and_expression
                  |
            is_expression
                  |
            equality_expression
                  |
            relational_expression
                  |
            concatenation_expression
                  |
            additive_expression
                  |
            modulo_expression
                  |
            intdiv_expression
                  |
            multiplicative_expression
                  |
            exponent_expression
                  |
            uminus_expression
                  |
            postfix_expression
                  |
            primary_expression
                  |
            CONSTANT (value = 42)


meaning an overhead of 19 tree nodes for every primary_expression in
the program (less one for each operator actually applied, because in
that case the tree node is justified).
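
To make the count concrete, here is a minimal Python sketch (not yacc; the
Node class and the helper names are made up for illustration) that builds
the chain shown above for a bare constant and counts its nodes. The level
names are copied from the tree; the count includes the CONSTANT leaf.

```python
# Precedence levels from the tree above, outermost first.
LEVELS = [
    "assignment_expression", "logical_imp_expression",
    "logical_eqv_expression", "logical_xor_expression",
    "logical_or_expression", "logical_and_expression",
    "is_expression", "equality_expression", "relational_expression",
    "concatenation_expression", "additive_expression",
    "modulo_expression", "intdiv_expression",
    "multiplicative_expression", "exponent_expression",
    "uminus_expression", "postfix_expression", "primary_expression",
]


class Node:
    """Hypothetical parse-tree node: a kind, child nodes, optional value."""

    def __init__(self, kind, children=(), value=None):
        self.kind = kind
        self.children = list(children)
        self.value = value


def naive_tree(value):
    """Build the chain a one-rule-per-level grammar yields for a constant."""
    node = Node("CONSTANT", value=value)
    for kind in reversed(LEVELS):
        node = Node(kind, [node])  # one wrapper node per precedence level
    return node


def count_nodes(node):
    """Count a node plus everything below it."""
    return 1 + sum(count_nodes(child) for child in node.children)


print(count_nodes(naive_tree(42)))  # prints 19
```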


Ways around this:
1) Don't generate tree nodes for productions like "postfix_expression:
primary_expression". (Simple but tedious. Well, generating trees in
yacc/bison is tedious anyway.)
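2) Use operator precedence. Write a single, deliberately ambiguous
production with one alternative per operator token, such as
    expression: primary_expression
    | expression '+' expression
    | expression '*' expression
(and so on for the other operators), and add disambiguating precedence
and associativity declarations (%left, %right, %nonassoc) for the
operator tokens. The tokens have to appear directly in the rules,
because yacc/bison takes a rule's precedence from the last token in it.
The precedence/associativity mechanism of yacc/bison is built
specifically for this case, so it's OK to use it even though it may
obscure errors in the grammar.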
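
The idea behind the second approach can be sketched outside yacc as well.
The following is a hand-written precedence-climbing parser in Python, not
what yacc/bison generates internally, but it is driven the same way: one
expression routine plus a table of precedence and associativity per
operator token. The table covers only a few operators, ordered as in the
tree above, and all names (PRECEDENCE, tokenize, parse) are assumptions
made for this example. A bare primary comes back with no wrapper nodes at
all.

```python
import re

# Higher number binds tighter. A small illustrative subset, ordered as in
# the tree above: concatenation below additive below multiplicative.
PRECEDENCE = {
    "&": (1, "left"),
    "+": (2, "left"), "-": (2, "left"),
    "*": (3, "left"), "/": (3, "left"),
}


def tokenize(text):
    """Split into numbers, identifiers and single-character operators."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/&^]", text)


def parse(tokens, table=PRECEDENCE):
    """Return an int/str for a primary, or a tuple (op, left, right)."""
    pos = 0

    def primary():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in table:
            raise SyntaxError("operand expected")
        tok = tokens[pos]
        pos += 1
        return int(tok) if tok.isdigit() else tok

    def expression(min_prec):
        nonlocal pos
        left = primary()
        while pos < len(tokens) and tokens[pos] in table:
            op = tokens[pos]
            prec, assoc = table[op]
            if prec < min_prec:
                break  # operator belongs to an enclosing, looser level
            pos += 1
            # Left-associative: the right operand may only contain
            # tighter operators. Right-associative: same level allowed.
            right = expression(prec + 1 if assoc == "left" else prec)
            left = (op, left, right)
        return left

    tree = expression(1)
    if pos != len(tokens):
        raise SyntaxError("unexpected token: " + tokens[pos])
    return tree


print(parse(tokenize("42")))
print(parse(tokenize("a + 2 * 3")))
```

With this shape, the only nodes in the tree are operators and operands, so
the per-level chain never appears in the first place.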


2. IF statement
QuickBasic had a THEN-less form of IF, à la
    IF expression statement [ELSE statement]
without an END IF. This is probably deprecated in VB and thus
undocumented, but if VB still supports this, you may have to support it
as well (this depends on the VB programs you want to parse).
I don't remember the full rule anymore, but it was somewhat complicated
and not fully documented even in QB. It took me about an afternoon to
enumerate, experiment with, and document all the special cases.
I decided to scrap my project when I was done with IF and realized that
I'd have to analyze all the other statements (WHILE, FOR, SELECT CASE,
etc.) in the same way. An alternative might have been to parse just the
documented syntax and simply declare an error on undocumented stuff.
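
As a rough illustration of that alternative, here is a Python sketch that
accepts only IF lines containing THEN and reports everything else as an
error. It is a line-level check, not a parser, and it rests on loudly
stated assumptions: keywords are separated by whitespace, trailing
comments are not handled, and the function name check_if_line is invented
for this example.

```python
import re

# A double-quoted string literal ("" is an escaped quote) or a run of other
# non-space characters. Deliberately crude; an assumption of this sketch.
TOKEN = re.compile(r'"(?:[^"]|"")*"|[^\s"]+')


def check_if_line(line):
    """Classify an IF line; raise SyntaxError if it has no THEN keyword."""
    words = [tok.upper() for tok in TOKEN.findall(line)]
    if not words or words[0] != "IF":
        return "not an IF statement"
    # String literals keep their quotes, so "then" inside a string
    # is not mistaken for the keyword.
    if "THEN" not in words[1:]:
        raise SyntaxError("IF without THEN is not supported: " + line.strip())
    # THEN as the last word opens a block that needs END IF later.
    return "block IF" if words[-1] == "THEN" else "single-line IF"


print(check_if_line("If x > 1 Then y = 2"))
print(check_if_line("If x > 1 Then"))
```

A line such as "IF x GOTO 10" is rejected with a SyntaxError instead of
being guessed at, which is the whole point of the approach.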


Regards,
Joachim

