Should scanners see symbol table ? (was Re: Lookahead vs. Scanner Feedback)

blakemor@software.org (Alex Blakemore)
Wed, 8 Jan 1992 17:20:02 GMT

          From comp.compilers

Newsgroups: comp.compilers
From: blakemor@software.org (Alex Blakemore)
Summary: it is simpler and cleaner if scanner does not lookup ids in symbol table
Keywords: parse, C
Organization: Software Productivity Consortium, Herndon, Virginia
References: 92-01-032
Date: Wed, 8 Jan 1992 17:20:02 GMT

bliss@sp64.csrd.uiuc.edu (Brian Bliss) writes:
> One place where every yacc/lex based C compiler I know of is
> broken is on a typedef name redefined in an inner scope:
>
> typedef int foo;
>
> main ()
> {
> char foo;
> }


/* start politically incorrect statement */
I may misunderstand, but it seems to me the problem is aggravated by
having the scanner read the symbol table to decide what kind of token foo
is. If so, that seems to me to violate software engineering principles,
and in several cases it leads to problems. The scanner should know only
lexical information, and lexically foo is just an identifier. The parser
knows about scope levels, context, etc., and should look up identifiers
in the symbol table(s) to determine what they denote. I know the dragon
book suggests a design where the scanner reads the symbol table, but that
seems to me to be fraught with difficulty in block-structured languages
with nested scopes, and even more so in languages like Ada with multiple
imported symbol tables (or ML, with its different name spaces that depend
on the context). For a simple language like C, with an almost flat name
space, it probably works fine to follow the dragon book's design.
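
To make the parser-side lookup concrete, here is a minimal sketch of a
block-scoped symbol table that only the parser touches. All names
(enter_scope, leave_scope, declare, lookup, the sym_kind values) and the
fixed table sizes are hypothetical, chosen for illustration and not taken
from any real compiler. It shows only how an inner declaration can hide an
outer typedef; it does not by itself resolve the ambiguity in the C
grammar.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names throughout; an illustration, not a real compiler. */
enum sym_kind { SYM_UNDECLARED, SYM_TYPEDEF, SYM_VARIABLE };

#define MAX_SYMS   64
#define MAX_SCOPES 16

struct symbol {
    const char *name;
    enum sym_kind kind;
};

static struct symbol syms[MAX_SYMS];  /* live declarations, innermost last */
static int nsyms;
static int scope_start[MAX_SCOPES];   /* index in syms[] where each open scope begins */
static int depth;

/* Called by the parser when it sees '{' (or starts a file scope). */
void enter_scope(void)
{
    assert(depth < MAX_SCOPES);
    scope_start[depth++] = nsyms;
}

/* Called by the parser at '}': forget everything declared in that scope. */
void leave_scope(void)
{
    assert(depth > 0);
    nsyms = scope_start[--depth];
}

/* Record a declaration in the innermost open scope. */
void declare(const char *name, enum sym_kind kind)
{
    assert(nsyms < MAX_SYMS);
    syms[nsyms].name = name;
    syms[nsyms].kind = kind;
    nsyms++;
}

/* Search from the innermost declaration outward, so inner names hide outer ones. */
enum sym_kind lookup(const char *name)
{
    for (int i = nsyms - 1; i >= 0; i--)
        if (strcmp(syms[i].name, name) == 0)
            return syms[i].kind;
    return SYM_UNDECLARED;
}
```

For the quoted example, the parser would call declare("foo", SYM_TYPEDEF)
at file scope, enter_scope() at the opening brace of main, and
declare("foo", SYM_VARIABLE) for the inner declaration. Inside the block
lookup("foo") then yields SYM_VARIABLE, and after leave_scope() it yields
SYM_TYPEDEF again.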


So foo should be sent to the parser as an identifier token (with a way of
recovering the string, or a code for the string, from the token), and not
as a type_identifier token or a label_identifier token.
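
As a rough sketch of what that could look like, the scanner below
classifies input on lexical grounds only and never consults a symbol
table. The token kinds, the struct layout, the 31-character limit on the
stored text, and the function name next_token are all assumptions made
for the example. Reserved words are deliberately left out; a fuller
scanner would recognize them lexically as well.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token representation; note there is no TYPE_IDENTIFIER kind. */
enum token_kind { TOK_EOF, TOK_IDENTIFIER, TOK_OTHER };

struct token {
    enum token_kind kind;
    char text[32];          /* the lexeme, truncated to 31 characters */
};

/* Scan one token starting at *src and advance *src past it. */
struct token next_token(const char **src)
{
    struct token t = { TOK_EOF, "" };
    const char *p;
    size_t n = 0;

    assert(src != NULL && *src != NULL);
    p = *src;

    while (isspace((unsigned char)*p))
        p++;

    if (*p == '\0') {
        /* end of input: t is already TOK_EOF with empty text */
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        /* Any name is just an identifier here; what it denotes is the parser's job. */
        t.kind = TOK_IDENTIFIER;
        while (isalnum((unsigned char)*p) || *p == '_') {
            if (n < sizeof t.text - 1)
                t.text[n++] = *p;
            p++;
        }
        t.text[n] = '\0';
    } else {
        /* Everything else is returned as a one-character token. */
        t.kind = TOK_OTHER;
        t.text[0] = *p++;
        t.text[1] = '\0';
    }

    *src = p;
    return t;
}
```

With this arrangement foo comes back as TOK_IDENTIFIER with text "foo"
whether it currently names a type, a variable, or a label, and the parser
recovers the string from the token when it needs to look the name up.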


I have used this approach several times since it was taught to me at
school (we used a different text).


What do real compilers do, especially for languages other than C?


P.S. The dragon book is a fine text in many ways, I just wonder about this
one detail.
--
Alex Blakemore blakemore@software.org (703) 742-7125
Software Productivity Consortium 2214 Rock Hill Rd, Herndon VA 22070
--

