Related articles |
---|
C symbol table structures mac@coos.dartmouth.edu (1999-01-02) |
Re: C symbol table structures kadhim@lucent.com (Basim Kadhim) (1999-01-04) |
Re: C symbol table structures heinrich@idirect.com (1999-01-06) |
Re: C symbol table structures drh@microsoft.com (Dave Hanson) (1999-01-11) |
Re: C symbol table structures resslere@erols.com (Gene Ressler) (1999-01-17) |
From: | "Gene Ressler" <resslere@erols.com> |
Newsgroups: | comp.compilers |
Date: | 17 Jan 1999 20:52:28 -0500 |
Organization: | Compilers Central |
References: | 99-01-005 |
Keywords: | symbols |
I have tried this kind of "shallow binding," a "stack of symbol
tables," and a third approach that led to the cleanest and fastest
code, at least in an environment where inherited attributes were
suppored for the parser (sorry yacc'ers).
I suppose you'd say it's "Lispish" or "functional". All the info
normally kept in the symbol table except the id string goes into IR
nodes, in my case AST "declaration" nodes. A symbol table node is
just an id and a pointer to such a node. Each declaration node points
to its parent "block" node (procedure, function, scope block, etc.);
each block to its symbol table. A symbol table is a dictionary of
symbol table nodes, a pointer to the table of the enclosing scope
(null for outermost), and a pointer to its block. These pointers seem
adequate to do all the usual lookups quickly. If you are pressed for
space, you can allocate from arrays and use 2 byte node numbers rather
than 4 byte pointers.
I know all this is pretty standard. The thing I've not seen before is to
use the symbol table as an explicit inherited attribute param in the parser
rather than fooling around with a stack. So in a top-down parser you get
procedures like
parse_function_declaration(ST st1) {
ST st2 = parse_type(st1); // Parse function return type (may add new struct!)
match(IDENTIFIER, st2); // Get the function name into the symtab
ST st3 = new ST(st2); // Make a new table and chain it onto the old one.
parse_params(st3); // New block so get params into the new table!
parse_body(st3); // Still in new block; use new table
parse_rest_of_decls(st2); // Parse the rest with the old table
}
I.e. the code for managing symbols tends to read like the language
spec. You avoid an explicit stack, so you don't have to remember to
pop it. Notwithstanding intuition, a good compiler optimizes away
much of the st param passing.
The kicker as with all "one st per scope" solutions is coming up with
a st structure that is very small when storing small numbers of IDs
(like struct scopes).
Gene
Alex Colvin wrote in message 99-01-005...
>I've been working on a compiler for C* (a dialect of C).
>
>I'm using a symbol table structure that I think I heard about here
>(comp.compilers) long ago. Can anyone tell me if this has a name, or if
>i got it right?
>
>Each scope (block) is numbered, in increasing order. Each identifier
>has a list of definitions and scope numbers.
Return to the
comp.compilers page.
Search the
comp.compilers archives again.