Re: Symbol-Table and Parse-trees (RKRayhawk)
13 Aug 1999 01:14:29 -0400

          From comp.compilers


From: (RKRayhawk)
Newsgroups: comp.compilers
Date: 13 Aug 1999 01:14:29 -0400
Organization: AOL
References: 99-08-048
Keywords: symbols, parse

  Peter Palotas,

<< I'm wondering what the best way is to handle declarations in a C
compiler? During the initial parse (syntactic analysis), should one put
the declarations in the symbol-table immediately, and not in the parse
tree, or should one wait and put them in the sym-tbl only during
semantic analysis?

Also, how should one cope with initialization (e.g. int i = 3 * 2 + k;)
and scopes?

Some thoughts on this would be great! >>

Well, your questions are great.

First, I would lay out at least three phases to address your first question:
    lexical analysis
    syntactic analysis
    semantic analysis
(You could be even more detailed in your phasing.)

The symbol table can be owned by lexical analysis! At the very least,
access to the symbol table lets the lexer inform the parser that the
current token IS a symbol.

The fact that a lexical item does not match any other known pattern
(keyword, literal, operator) makes it a candidate for characterization
as a symbol when it is handed to the parser. This usually is a fact
free_of_the_context_of_the_parse. That is, a chunk of text looks a lot
like a symbol even before we try to fit it into the syntax.

So sometimes, you will see a strategy where the lexer detects,
instantiates, and recognizes references to symbols. Sometimes, the
lexer owns the symbol table, letting the parser only view it.
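
A minimal sketch of that arrangement, under some loud assumptions: the
names (classify_word, symbol_find, the token kinds) are hypothetical, the
table is a flat array with linear search, and scopes are ignored entirely.
It only shows the division of labor, where the lexer creates entries and
the parser gets a read-only lookup.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token kinds the lexer hands to the parser. */
enum token_kind { TOK_KEYWORD, TOK_NEW_SYMBOL, TOK_KNOWN_SYMBOL };

#define MAX_SYMBOLS 256

/* Lexer-owned table. Entries point at caller-owned strings. */
static const char *symbols[MAX_SYMBOLS];
static int symbol_count = 0;

/* A deliberately tiny keyword list, just for the illustration. */
static const char *const keywords[] = { "int", "char", "return", "if", "while" };

static int is_keyword(const char *text)
{
    size_t n = sizeof keywords / sizeof keywords[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(keywords[i], text) == 0)
            return 1;
    return 0;
}

/* Read-only view for the parser: index of the symbol, or -1 if absent. */
int symbol_find(const char *text)
{
    for (int i = 0; i < symbol_count; i++)
        if (strcmp(symbols[i], text) == 0)
            return i;
    return -1;
}

/* Called by the lexer for each identifier-shaped word. A word that is
   not a keyword is a symbol: either a reference to a known one, or a
   new one that the lexer instantiates on the spot. */
enum token_kind classify_word(const char *text)
{
    if (is_keyword(text))
        return TOK_KEYWORD;
    if (symbol_find(text) >= 0)
        return TOK_KNOWN_SYMBOL;
    assert(symbol_count < MAX_SYMBOLS);
    symbols[symbol_count++] = text;
    return TOK_NEW_SYMBOL;
}
```

With this shape, the first occurrence of k comes back as TOK_NEW_SYMBOL
and later occurrences as TOK_KNOWN_SYMBOL, so the parser never has to
modify the table itself.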

Each symbol has a scope, meaning it is not visible forever. So you
need to generate scope identifiers of some sort, and associate each
occurrence of a symbol with that scope id. When the scope ends (a
syntactic issue), the relevant symbols need to be torn out of the
symbol table.

In C, global variables can be declared outside of functions; their
scope ends when the current compilation unit ends. Otherwise, every
closing curly bracket ends a scope (either ending a function or a
bracketed subsection therein).
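
The scope bookkeeping described above can be sketched roughly as follows.
Everything here is an assumption made for illustration: the function
names are hypothetical, the scope id is simply the nesting depth (0 for
file scope), and the table is a stack, so the symbols of the innermost
scope always sit on top and can be torn out when the bracket closes.

```c
#include <assert.h>
#include <string.h>

#define MAX_ENTRIES 256

struct entry {
    const char *name;
    int scope_id;     /* nesting depth at the point of declaration */
};

static struct entry table[MAX_ENTRIES];
static int entry_count = 0;
static int current_scope = 0;   /* 0 = file scope */

/* Call on every '{'. */
void scope_open(void)
{
    current_scope++;
}

/* Call on every '}': tear out each symbol declared in the closing scope. */
void scope_close(void)
{
    assert(current_scope > 0);
    while (entry_count > 0 && table[entry_count - 1].scope_id == current_scope)
        entry_count--;
    current_scope--;
}

/* Record a declaration in the current scope. */
void declare(const char *name)
{
    assert(entry_count < MAX_ENTRIES);
    table[entry_count].name = name;
    table[entry_count].scope_id = current_scope;
    entry_count++;
}

/* Scope id of the innermost visible declaration, or -1 if not visible.
   Searching from the top makes inner declarations shadow outer ones. */
int lookup_scope(const char *name)
{
    for (int i = entry_count - 1; i >= 0; i--)
        if (strcmp(table[i].name, name) == 0)
            return table[i].scope_id;
    return -1;
}
```

A real compiler would likely want unique scope ids rather than depths,
and might keep closed scopes around for later phases instead of
discarding them; this sketch only shows the open/close discipline.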

I am probably not good enough to take on the question about initializing
    int i = 3 * 2 + k;
in a compact format like a newsgroup. But I will offer you the hint that this
excellent example could represent two VERY distinct challenges for you.

If k is a constant, then you have an initialization of a variable to a
constant value. If k is another variable, then you have initialization
to a memory-based dynamic value, or maybe a syntax error :-). Both
cases must be handled differently for stack-based auto data items than
for global data declared outside of a function. And noting the
flexibility of scoping with curly brackets in C, you may distinguish
several stack-based declaration situations:
  1) declared in a function before any executable code
  2) declared in a function after some executable code but still essentially
     just at level 0 within the function
  3) declared within brackets within a function
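
To make the constant-versus-variable distinction for k concrete, here is
a small sketch of how a compiler might decide which case it is facing.
The node layout and the fold function are hypothetical, invented for this
illustration; the sketch handles only + and * on int and ignores overflow.

```c
#include <stddef.h>

/* Hypothetical expression-tree node for an initializer. */
enum node_kind { NODE_CONST, NODE_VAR, NODE_ADD, NODE_MUL };

struct node {
    enum node_kind kind;
    int value;                       /* used by NODE_CONST only */
    const struct node *lhs, *rhs;    /* used by NODE_ADD and NODE_MUL only */
};

/* Returns 1 and stores the result in *out when the whole tree is a
   compile-time constant. Returns 0 as soon as a variable is found,
   meaning the initializer needs code emitted to compute it at run time. */
int fold(const struct node *n, int *out)
{
    int l, r;

    switch (n->kind) {
    case NODE_CONST:
        *out = n->value;
        return 1;
    case NODE_VAR:
        return 0;
    case NODE_ADD:
    case NODE_MUL:
        if (!fold(n->lhs, &l) || !fold(n->rhs, &r))
            return 0;
        *out = (n->kind == NODE_ADD) ? l + r : l * r;
        return 1;
    }
    return 0;
}
```

For 3 * 2 + k with k a constant 4, fold reports a constant 10; with k a
variable it reports failure, although the 3 * 2 subtree could still be
folded on its own.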

The main difference among the stack-based situations is whether you
track the need for stack resources across the whole function, or
push/pop the bracketed items as you go along in the code emission
within the function. Execution-time trade-offs aside, your grammar
productions will probably have to be detailed enough to distinguish
these situations.
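
For reference, the global case and the three stack-based situations look
like this in source form. The names K, g and demo are made up for the
example. Situation (2), a declaration after executable code, is accepted
by C99 and later; older C required declarations at the top of a block.

```c
/* File scope: the initializer must be a constant expression, so
   3 * 2 + K can be evaluated entirely at compile time. */
enum { K = 4 };
int g = 3 * 2 + K;

int demo(int k)
{
    int a = 3 * 2 + k;      /* (1) before any executable code */

    a += g;                 /* executable code */

    int b = 3 * 2 + k;      /* (2) after executable code, still level 0 */

    {
        int c = 3 * 2 + k;  /* (3) inside a nested bracketed block */
        b += c;
    }                       /* c goes out of scope here */

    return a + b;
}
```

In all three auto cases the initializer depends on the run-time value of
k, so the compiler has to emit code for it, whereas g can be laid down as
initialized data.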

Best Wishes,

Robert Rayhawk
