Visual Parse++ and Semantic-Reliant Grammars

qjackson@direct.ca
7 Dec 1996 23:14:32 -0500

          From comp.compilers

From: qjackson@direct.ca
Newsgroups: comp.compilers
Date: 7 Dec 1996 23:14:32 -0500
Organization: Compilers Central
Keywords: parse, C++

Hello!


An index search at the comp.compilers homepage on "Visual Parse++"
turned up some interesting comments.


After purchasing the tool for some $700.00 CDN, I developed an HTML
parser for a prototype at work, was asked to consult on a proprietary
grammar being developed at a client site with VP++, and have just
completed the design of a full experimental language grammar (the full
language spec took about 10 hours of work, compared to weeks using lex
and yacc).


I would therefore like to both respond to some of the criticism I've
seen against VP++, and offer some insight into its actual usefulness
as a compiler-compiler tool.


First, I will comment on form....


VP++ accepts all LALR(k) grammars, which means that it is possible to
design grammars that are not portable to what SandStone calls "legacy"
tools. If one's goal is to write grammars that are understood by
traditionalists and are acceptable to traditional tools, one must be
careful to develop sufficiently traditional grammars.


In my own use, I have striven to keep my grammars LALR(1) not so much
for the back portability, but for the conceptual portability. After
all, grammars must be maintained by human beings, and human beings may
or may not be used to an LALR(k) environment. In a productivity
pinch, however, the ability to have SandStone's "InfiLook" technology
deal with shift-reduce conflicts by k>1 lookaheads may very well save
hundreds or even thousands of dollars of skilled labor time.


Also in the "form" category is VP++'s use of an expression stack.
Nested "C" comments become as easy to parse as:


        %expression Main


        '/\*' %ignore, %push InMultiLineComment;


        %expression InMultiLineComment


        '.' %ignore;
        '\n' %ignore;
        '/\*' %push InMultiLineComment;
        '\*/' %ignore, %pop;


Indeed, using the stack expression system, it becomes entirely
possible to parse inline assembler directly within the context of a
host language.
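

The stack idea can be sketched outside VP++ as well. The following is
a minimal Python illustration, not VP++ output: the RULES table, the
action names, and strip_nested_comments are hypothetical names made up
for this sketch. It keeps a stack of lexer modes, pushes on every
comment opener, and pops on every closer. Unlike a longest-match
lexer, this sketch tries rules in order, so the specific patterns are
listed before the catch-all.

```python
import re

# Hypothetical lexer modes mirroring the two %expression blocks above.
# Rules are tried in order; the last rule in each mode is a catch-all.
RULES = {
    "Main": [
        (re.compile(r"/\*"), "push"),       # comment opener: enter comment mode
        (re.compile(r"(?s)."), "emit"),     # stand-in for the real token rules
    ],
    "InMultiLineComment": [
        (re.compile(r"/\*"), "push"),       # nested opener: push the mode again
        (re.compile(r"\*/"), "pop"),        # closer: return to the enclosing mode
        (re.compile(r"(?s)."), "ignore"),   # any other character, newline included
    ],
}


def strip_nested_comments(text):
    """Return text with (possibly nested) /* ... */ comments removed."""
    stack = ["Main"]
    kept = []
    pos = 0
    while pos < len(text):
        # The catch-all rule guarantees that some rule matches here.
        for pattern, action in RULES[stack[-1]]:
            match = pattern.match(text, pos)
            if match:
                break
        if action == "push":
            stack.append("InMultiLineComment")
        elif action == "pop":
            stack.pop()
        elif action == "emit":
            kept.append(match.group())
        pos = match.end()
    if len(stack) != 1:
        raise ValueError("unterminated comment")
    return "".join(kept)
```

The depth of the stack is the nesting depth of the comment, which is
why no counter or hand-written recursion is needed.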


Another note about form... VP++ grammars are entirely independent of
the host language. I notice in comp.compilers several comments that
the C/C++ grammar derived from Roskind's grammar does not allow for
sufficient handling of id/typedef resolution in the grammar itself.


This is a point of contention -- possible fodder for the guns of a
Religious War -- but I will dare to comment.


Any implementation-language-independent grammar that describes typing
will necessarily lack the expressiveness needed to catch typing errors
at parse time. Or at least current language-independent tools will
fail to do so. This does _not_ mean, however, that these tools cannot
be used to express typing expectations and provide a hook for the
host language to do type checking in reduction code.


Consider the following simple situation:


        structure -> /* a whole bunch of possible stuff */;


        for_block -> for_head structure for_tail;


        for_head -> 'FOR' index_id '=' expression 'TO' expression;
        for_tail -> 'NEXT' index_id;


        index_id -> 'ID';


The above grammar describes, for instance:


        FOR a = 1 TO 10
                ' blah blah
        NEXT a


In BASIC, the above structure requires several checks beyond simple
syntax:


1. for_head.index_id must be non-const and numeric


2. for_block.for_head.index_id and for_block.for_tail.index_id must
   refer to the same variable


It would have been possible to express for_head thus:


      for_head -> 'FOR' 'ID' '=' expression 'TO' expression;


Note, however, that by introducing another reduction, "index_id", we
have given the reduction function of the host language a 'hook' where
it can step in with table checks that verify the nature of index_id.
The reduction for_block can do a similar check to assure that the
index_id of the 'FOR' is the same variable referred to by 'NEXT'.
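

As a rough sketch of what such hooks might look like, here are two
reduction functions in Python. None of this is the VP++ API: the
SYMBOLS table, SemanticError, and both function names are assumptions
made up for illustration, and a real host language would receive the
lexemes from the parser's own reduction callback.

```python
class SemanticError(Exception):
    """Raised when a reduction hook rejects a syntactically valid phrase."""


# Hypothetical symbol table: variable name -> (type name, is_const).
SYMBOLS = {
    "a": ("integer", False),
    "s": ("string", False),
    "k": ("integer", True),
}


def reduce_index_id(lexeme):
    """Hook for: index_id -> 'ID'. Check 1: non-const and numeric."""
    if lexeme not in SYMBOLS:
        raise SemanticError(f"undeclared variable {lexeme!r}")
    type_name, is_const = SYMBOLS[lexeme]
    if type_name not in ("integer", "float"):
        raise SemanticError(f"loop index {lexeme!r} is not numeric")
    if is_const:
        raise SemanticError(f"loop index {lexeme!r} is const")
    return lexeme


def reduce_for_block(head_index, tail_index):
    """Hook for: for_block -> for_head structure for_tail. Check 2."""
    if head_index != tail_index:
        raise SemanticError(
            f"NEXT {tail_index} does not match FOR {head_index}"
        )
    return ("for_block", head_index)
```

For the FOR a ... NEXT a example above, reduce_index_id("a") runs once
for each index_id reduction, and reduce_for_block("a", "a") accepts
the block; NEXT b would be rejected at the for_block reduction even
though it is syntactically valid.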


It is also noteworthy that


        'FOR' index_id '=' expression 'TO' expression


is more self-documenting than


        'FOR' 'ID' '=' ....


because it expresses, at the level of the grammar, that not just any
old 'ID' will do (even if VP++ itself doesn't perform any type
checking).


I suppose the following would be a nice feature in any compiler-
compiler grammar:


        index_id_decl -> 'DECLARE' 'ID' : index_id 'AS' 'INTEGER';


        for_head -> 'FOR' 'ID' ? index_id '=' expr 'TO' expr;


In the above example, the attribute "index_id" is bound to whatever
lexeme 'ID' happens to represent. In the for_head production, the
'ID' lexeme must have previously been bound to the attribute
'index_id' for the production to succeed.
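

The intended behaviour of the two hypothetical operators can be
sketched in a few lines of Python. This illustrates only the
semantics described above, not LPM or any existing tool; the class
and method names are invented for the sketch.

```python
class BindingError(Exception):
    """Raised when a lexeme lacks the attribute a production requires."""


class AttributeTable:
    """Tracks which lexemes have been bound to which attributes."""

    def __init__(self):
        self._bound = {}  # attribute name -> set of lexemes bound to it

    def bind(self, lexeme, attribute):
        """Models  'ID' : attribute  (record the binding)."""
        self._bound.setdefault(attribute, set()).add(lexeme)

    def require(self, lexeme, attribute):
        """Models  'ID' ? attribute  (fail unless previously bound)."""
        if lexeme not in self._bound.get(attribute, set()):
            raise BindingError(f"{lexeme!r} is not bound to {attribute!r}")
        return lexeme


table = AttributeTable()
table.bind("a", "index_id")      # DECLARE a AS INTEGER
table.require("a", "index_id")   # FOR a = ... succeeds
```

A FOR over an identifier that was never declared this way, for
example table.require("b", "index_id"), would raise BindingError, so
the production fails at parse time rather than in later checking.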


An extension such as the above keeps a grammar host-language-
independent, while still allowing many of the types of checks that are
currently done in code to be done at parse time. [My own pattern
matching tool, LPM, which I will be releasing as source near the end
of December of this year, has just such an attribute binding
capability, but is not otherwise based on BNF-style specifications.]


Such a hypothetical extension aside, however, the best place for
semantics is perhaps still the host-language reduction function, where
table-lookup algorithms and mangling are left to the implementation,
rather than the specification.


The long and the short of my opinion on this matter is that the power
of SandStone's Visual Parse++ tool rests not so much in its design
(though it is very well designed, indeed!) as in the design paradigms
used by those who employ VP++ as their parsing tool. Any sufficiently
expressive grammar tool can be used to express insufficiently
comprehensive grammars.


Visual Parse++ is just such a sufficiently expressive tool, which also
means that it can be (and has been) used to express even
non-syntactical relationships such as object typing.


Cheers,


Quinn Tyler Jackson,
Software Developer and
Happy VP++ Customer











