Re: source conversion

Terence Parr <parrt@magelang.com>
23 Sep 1997 23:41:59 -0400

          From comp.compilers

From: Terence Parr <parrt@magelang.com>
Newsgroups: comp.compilers,comp.compilers.tools.pccts
Date: 23 Sep 1997 23:41:59 -0400
Organization: MageLang Institute
References: 97-09-034 97-09-063
Keywords: tools, translator

Derek M Jones wrote:
>
> Waverly@DigitSW.com "Waverly Edwards" writes:
> >
> > Has anyone done any source conversion using SORCERER. The product
> > sounds interesting but I'm wondering if I need to be a compiler wiz in
> > order to use it.
>
> I used it many years ago. Back then it was a good product. Presumably
> it is even better now.


I am not actively supporting the C++/C version of SORCERER anymore
(although Tom Moog is doing what he can for PCCTS 1.xx), but as
someone has pointed out, ANTLR 2.xx incorporates SORCERER. 40k
lines of C/C++ dropped to 14k lines of Java (ANTLR, DLG, SORCERER ->
ANTLR). ANTLR generalizes the notion of recognition in that it builds
the same type of recognizer for parsing char, token, or AST node
streams (albeit AST node streams are two-dimensional). This means that
you can use full predicated-LL(k) grammars to describe your char stream
(really great for two-level parses like HTML--inside and outside
tags). ANTLR 2.xx currently generates only Java, but a C++ code
generator is planned.


A note regarding source-to-source translation. A parser generator
does not a translator generator make. All "modern" parser generators
will build trees for you, but what do you do with the trees once you
get them? Well, some systems try to hide these trees and do pure
attribute grammar translations and such, but I'll stick to stuff
commonly used by commercial programmers (no flames please).


Anyway, most tools do not provide support for the traversal or
transformation of trees (unless you generate LISP ;)). If you're
lucky, your tree generator will build objects of different types
(heterogeneous tree nodes). That way you can give each node type an
action() method, or whatever, that knows what to do for that node and
how to traverse its children.
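
As a rough sketch of the heterogeneous-node approach (the class names
IntNode and PlusNode are hypothetical, and this is Python rather than
the Java that ANTLR 2.xx generates), each node type carries its own
action() method that knows how to visit its children:

```python
class IntNode:
    """Leaf node holding an integer literal."""

    def __init__(self, value):
        self.value = value

    def action(self):
        # A leaf has no children; it just emits its own text.
        return str(self.value)


class PlusNode:
    """Subtree root with exactly two operand children."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def action(self):
        # The node itself decides how and in what order to visit its children.
        return "(" + self.left.action() + " + " + self.right.action() + ")"


# Build the tree for 1 + (2 + 3) and translate it to parenthesized text.
tree = PlusNode(IntNode(1), PlusNode(IntNode(2), IntNode(3)))
translated = tree.action()
```

Note that in this style the legal shape of the tree is written down
nowhere; it is only implied by the code spread across the node classes.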


The best solution, of course, is to have a grammar that specifies the
structure of your trees. Asking the question "why do I need a grammar
to build my tree walkers?" is analogous to asking "why do I need a
parser generator when I can build a parser by hand?" For example, the
following grammar describes simple expression trees with INT nodes as
leaves and PLUS nodes as subtree roots.


expr : #( PLUS expr expr )
     | INT
     ;


where #(a b c) is a tree of the form:


                        a
                       / \
                      b   c
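

To make the notation concrete, here is a minimal hand-written sketch of
the kind of recursive function the expr rule corresponds to. The Node
class is a hypothetical homogeneous node type invented for this
example, not SORCERER's or ANTLR's actual AST type, and the evaluation
action is only an illustration:

```python
class Node:
    """Homogeneous tree node: a token type, optional text, and children."""

    def __init__(self, type_, text=None, children=()):
        self.type = type_
        self.text = text
        self.children = list(children)


def expr(t):
    """Match  expr : #( PLUS expr expr ) | INT ;  and return the tree's value."""
    if t.type == "PLUS":
        # First alternative: a PLUS root with exactly two expr children.
        if len(t.children) != 2:
            raise ValueError("PLUS needs exactly two children")
        left, right = t.children
        return expr(left) + expr(right)
    if t.type == "INT":
        # Second alternative: an INT leaf.
        if t.children:
            raise ValueError("INT must be a leaf")
        return int(t.text)
    raise ValueError("no alternative of expr matches node type %r" % t.type)


# The tree #( PLUS 3 #( PLUS 4 5 ) )
tree = Node("PLUS", children=[
    Node("INT", "3"),
    Node("PLUS", children=[Node("INT", "4"), Node("INT", "5")]),
])
result = expr(tree)
```

Each alternative of the rule becomes one branch, and a tree that fits
neither alternative is rejected rather than silently walked.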


Why would you write code to walk trees of this form when you can use a
grammar to formalize it, allowing for detection of nondeterministic
tree walks, etc.? Furthermore, a tree parser generator can also help
you do transformations, which are difficult with hand-coded tree
walkers.
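

As an illustration of what a transformation over such trees might look
like, the sketch below rewrites #( PLUS e 0 ) and #( PLUS 0 e ) to e.
It is a hand-written stand-in, not output from SORCERER or ANTLR; the
nested-tuple tree representation and the name simplify are assumptions
made for this example:

```python
def simplify(t):
    """Rewrite #( PLUS e 0 ) and #( PLUS 0 e ) to e, working bottom-up.

    Trees are nested tuples: ("INT", "3") or ("PLUS", left, right).
    """
    if t[0] == "INT":
        return t
    if t[0] == "PLUS":
        _, left, right = t
        # Transform the children first, then look at the rewritten subtree.
        left, right = simplify(left), simplify(right)
        if right == ("INT", "0"):
            return left
        if left == ("INT", "0"):
            return right
        return ("PLUS", left, right)
    raise ValueError("unexpected node type %r" % (t[0],))


# #( PLUS #( PLUS 0 7 ) 0 )  collapses to the single leaf 7
before = ("PLUS", ("PLUS", ("INT", "0"), ("INT", "7")), ("INT", "0"))
after = simplify(before)
```

Even for this tiny rule the walking code and the rewriting code are
tangled together, which is the part a tree grammar lets you state
separately.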


I speak from experience building rather nasty translators for the Army
(as a postdoc) and for commercial clients as a consultant. SORCERER
was born after I had written my second Fortran 77 translator by
hand...I said "Gawd! There has to be a better way to build tree
walkers...I'll bet I could build a tool to generate these damn
things!" Slowly but surely the idea of a tree grammar is permeating
the industry.


Oops, I'm off the subject ;)


Best regards,
Terence
--

