Re: C Compiler in C++ (Alex Colvin)
17 May 2002 00:21:33 -0400

          From comp.compilers


From: (Alex Colvin)
Newsgroups: comp.compilers
Date: 17 May 2002 00:21:33 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 02-05-039
Keywords: C, OOP, design
Posted-Date: 17 May 2002 00:21:33 EDT

>Dear colleagues -
>I am writing a complete C compiler in C++ as an academic exercise and
>I need advice related to the design of the parse tree data structure
>for C declarations.

>I organized the parse tree so that class PTN (Parse tree node) is a
>super-class for all node classes. Then, I derive nodes such as
>PTNBinaryOperation, PTNAssignment, PTNIf, PTNSequence (sequence of
>statements), etc. Please note that in this implementation a node can
>have 0..n children.

For an alternative approach, see

That compiler was built around treewalk classes that implemented
passes and subpasses. The base class just called Visit() at each node,
and provided a Walk() method to traverse subnodes. Derived classes
collected data and rewrote the program tree.
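
A minimal sketch of that treewalk arrangement, assuming a hypothetical
Node type with an operator name, a constant value, and 0..n children
(the post does not show the real node layout). CountNodes and FoldAdd
are invented here only to illustrate one pass that collects data and
one that rewrites the tree; MakeNode and MakeBinary are hypothetical
helpers.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node layout, not taken from the original compiler.
struct Node {
    std::string op;                           // e.g. "+", "const", "var"
    int value = 0;                            // payload for "const" nodes
    std::vector<std::unique_ptr<Node>> kids;  // 0..n children
};

// Base treewalk: Visit() is called at each node, Walk() traverses subnodes.
class TreeWalk {
public:
    virtual ~TreeWalk() = default;
    void Walk(Node& n) {
        for (auto& k : n.kids) Visit(*k);
    }
    virtual void Visit(Node& n) { Walk(n); }  // default: just descend
};

// A pass that collects data: counts the nodes it visits.
class CountNodes : public TreeWalk {
public:
    int count = 0;
    void Visit(Node& n) override { ++count; Walk(n); }
};

// A pass that rewrites the tree: folds "+" applied to two constants.
class FoldAdd : public TreeWalk {
public:
    void Visit(Node& n) override {
        Walk(n);  // handle children first, so folding works bottom-up
        if (n.op == "+" && n.kids.size() == 2 &&
            n.kids[0]->op == "const" && n.kids[1]->op == "const") {
            int sum = n.kids[0]->value + n.kids[1]->value;
            n.op = "const";
            n.value = sum;
            n.kids.clear();
        }
    }
};

// Hypothetical helpers for building small trees.
std::unique_ptr<Node> MakeNode(std::string op, int value = 0) {
    auto n = std::make_unique<Node>();
    n->op = std::move(op);
    n->value = value;
    return n;
}

std::unique_ptr<Node> MakeBinary(std::string op, std::unique_ptr<Node> l,
                                 std::unique_ptr<Node> r) {
    auto n = MakeNode(std::move(op));
    n->kids.push_back(std::move(l));
    n->kids.push_back(std::move(r));
    return n;
}
```

Each new pass is one more subclass that overrides Visit(); the node
classes themselves stay untouched.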

>However, I have no idea how to go about representing C variable
>declarations in the parse tree. The problem is that C grammar itself
>is very loose on rules for variable declarations. For example: extern
>int const unsigned volatile static foo; is syntactically correct, but
>semantically doesn't make sense. Also, int const const const const
>unsigned bar; is both syntactically and semantically correct, but
>should emit a warning about multiple const qualifiers.

Since I wanted a simple, regular intermediate language, I just
represented types by more tree links threaded through the abstract
syntax tree. The list ran from the name in a declaration back
through the declarators (*, [], (), etc) to the base type (int
long). For good measure I linked variable references to their
declarations. There's no new data structure, but there is an extra
link in every tree node.
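
As a rough illustration of such a threaded type link, the sketch below
assumes a hypothetical Node with a single extra pointer named type.
Reusing that same pointer to link a variable reference to its
declaration is an assumption; the post only says there is one extra
link per node. Kind and DescribeType are invented for the example.

```cpp
#include <string>
#include <utility>

// Hypothetical node kinds; the real compiler's set is not given in the post.
enum class Kind { Name, Pointer, Array, Function, BaseType, VarRef };

struct Node {
    Kind kind;
    std::string text;  // identifier, declarator token, or base type spelling
    Node* type;        // the extra link: next step toward the base type
                       // (for a VarRef, assumed to point at the declared name)
    Node(Kind k, std::string t, Node* link = nullptr)
        : kind(k), text(std::move(t)), type(link) {}
};

// Follow the type links from a name (or a reference to it) to the base type.
std::string DescribeType(const Node* n) {
    std::string out;
    for (const Node* t = n->type; t != nullptr; t = t->type) {
        switch (t->kind) {
            case Kind::Pointer:  out += "pointer to ";         break;
            case Kind::Array:    out += "array of ";           break;
            case Kind::Function: out += "function returning "; break;
            case Kind::BaseType: out += t->text;               break;
            default:             break;  // Name, VarRef: nothing to print
        }
    }
    return out;
}
```

For a declaration like int *a[3], the chain would run
a -> [] -> * -> int, and a reference to a would reach the same chain
through the declared name.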

Of course, type operations become messy, but I had this base class
that walks types...

My theory was that method lookup is powerful, and I wanted to use it
for the messy work, which wasn't the program representation but the
program translation. Of course that left me with lots of node type
switches, but they were often accomplished with tables
(e.g. describing whether an operator reads/writes/addresses its
operands).
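
One way such a table might look, as a sketch only: the operator set,
the flag names, and the OpInfo layout below are all hypothetical, since
the post does not show the actual tables.

```cpp
// Hypothetical operator codes; OpCount is only the table size.
enum Op { OpAdd, OpAssign, OpAddrOf, OpDeref, OpPreInc, OpCount };

// Hypothetical access flags, combinable with bitwise OR.
constexpr unsigned kNone      = 0u;
constexpr unsigned kReads     = 1u;
constexpr unsigned kWrites    = 2u;
constexpr unsigned kAddresses = 4u;

struct OpInfo {
    const char* name;
    unsigned first;   // how the operator uses its first (or only) operand
    unsigned second;  // how it uses its second operand, if any
};

// One row per operator, indexed by Op.
const OpInfo kOpTable[OpCount] = {
    {"+",  kReads,           kReads},  // OpAdd
    {"=",  kWrites,          kReads},  // OpAssign
    {"&",  kAddresses,       kNone},   // OpAddrOf
    {"*",  kReads,           kNone},   // OpDeref: reads the pointer operand
    {"++", kReads | kWrites, kNone},   // OpPreInc
};

// Look up the access flags for operand 0 (first) or 1 (second).
unsigned AccessOf(Op op, int operand) {
    return operand == 0 ? kOpTable[op].first : kOpTable[op].second;
}

bool WritesOperand(Op op, int operand) {
    return (AccessOf(op, operand) & kWrites) != 0;
}
```

A pass can then ask WritesOperand(node_op, 0) instead of switching on
the node type, so adding an operator means adding a table row.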

I'd probably do it this way again unless I had a language that had
pattern-matching method calls.

mac the naf
