Re: Why put type information into syntax?

Keith Thompson <kst@cts.com>
1 Apr 2000 14:08:40 -0500

          From comp.compilers

From: Keith Thompson <kst@cts.com>
Newsgroups: comp.compilers
Date: 1 Apr 2000 14:08:40 -0500
Organization: CTS Network Services
References: 00-03-133 00-03-146
Keywords: types, parse

Michael Spencer <michael.prqa@indigo.ie> writes:
> Allister Cross wrote:
> >
> > Does anyone know of any reasons why built-in type names should be
> > incorporated in the syntax of a language. I have been looking at the
>
> I can think of two reasons, first, in C and C++ I know, built-in type
> names are keywords. By enforcing this rule in the grammar your parser
> becomes a lot cleaner.


It does? No offense, but are you speaking from experience?


In C (and C++), a typedef name effectively becomes a keyword once it's
been declared -- unless it's redeclared as a new identifier in an
inner scope.


        void example1(void)
        {
                typedef int FOO; /* FOO is now a typedef name */
                char *s;


                s = 42; /* semantic error (type mismatch) */
                BAR = 42; /* semantic error (undeclared identifier) */
                FOO = 42; /* syntax error (!) */
        }


        void example2(void)
        {
                typedef int FOO; /* FOO is now a typedef name */


                FOO x; /* FOO is a typedef name */
                { /* inner scope */
                        FOO y; /* FOO is still a typedef name */
                        int FOO; /* Now FOO is an identifier */
                }
        }


Some of the problems with this:


1. Not only the parser but also the lexical analyzer typically requires
      feedback from the symbol table to determine whether a given
      identifier is a typedef name.


2. In example2(), when the compiler encounters "int FOO;", it has to
      infer from the context that FOO is to be treated as a newly
      declared identifier rather than as a typedef name.
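
The feedback described in point 1 can be sketched roughly as follows.
This is only an illustration, not how any particular compiler does it:
the names symbols, declare(), classify(), enter_scope() and
leave_scope() are hypothetical, and the fixed-size table is an
assumption made to keep the example short. The scanner calls
classify() on every identifier, and the parser calls declare() as it
reduces declarations.

```c
#include <string.h>

enum token_kind { TOK_IDENTIFIER, TOK_TYPEDEF_NAME };

struct symbol {
    const char *name;
    int is_typedef;     /* 1 = typedef name, 0 = ordinary identifier */
    int depth;          /* scope nesting level of the declaration */
};

#define MAX_SYMBOLS 64

static struct symbol symbols[MAX_SYMBOLS];  /* used as a stack */
static int symbol_count = 0;
static int scope_depth = 0;

void enter_scope(void)
{
    scope_depth++;
}

/* Discard everything declared in the scope being closed. */
void leave_scope(void)
{
    if (scope_depth == 0)
        return;
    while (symbol_count > 0 && symbols[symbol_count - 1].depth == scope_depth)
        symbol_count--;
    scope_depth--;
}

/* Called by the parser; returns 0 on success, -1 if the table is full. */
int declare(const char *name, int is_typedef)
{
    if (symbol_count == MAX_SYMBOLS)
        return -1;
    symbols[symbol_count].name = name;
    symbols[symbol_count].is_typedef = is_typedef;
    symbols[symbol_count].depth = scope_depth;
    symbol_count++;
    return 0;
}

/* Called by the scanner; the innermost declaration of the name wins. */
enum token_kind classify(const char *name)
{
    int i;

    for (i = symbol_count - 1; i >= 0; i--)
        if (strcmp(symbols[i].name, name) == 0)
            return symbols[i].is_typedef ? TOK_TYPEDEF_NAME : TOK_IDENTIFIER;
    return TOK_IDENTIFIER;      /* undeclared names are plain identifiers */
}
```

Note that this does nothing about point 2: when the scanner reaches
the FOO in "int FOO;", the table still says FOO is a typedef name, so
the parser has to accept a typedef-name token in declarator position
and only then call declare("FOO", 0).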


Misuse or misspelling of typedef names typically causes (what the
compiler perceives as) syntax errors. In my experience, C compilers
tend not to be very good at recovering from syntax errors; no doubt
some are better at this than others. Recovery from semantic errors is
relatively easy and usually results in much clearer error messages.


I worked on a special-purpose C parser at a previous job. I never
quite got the handling of typedef names right; I'm sorry to say that I
just kludged it to work with the particular source code I was trying
to parse (which was just header files).


I believe the treatment of typedef names as keywords was necessitated
by C's rather odd declaration syntax. (Typedefs weren't in the
original, pre-K&R1, version of the language.) With a simpler, and
perhaps slightly more verbose, declaration syntax, type names could
have been treated as ordinary identifiers, and "int", "char",
etc. could have been predefined identifiers rather than keywords.
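
A small illustration of the kind of construct I have in mind (the
names T, cast_reading and subtract_reading are made up for the
example). The same tokens parse differently depending on whether T
currently names a type. The better-known case is the statement
"T * p;", which is either a declaration of p or a multiplication; the
cast form is used here because it yields a value that can be checked.

```c
typedef int T;

/* T names a type here, so (T) - 1 is a cast applied to -1. */
int cast_reading(void)
{
    return (T) - 1;
}

/* The local object T hides the typedef, so (T) - 1 is a subtraction. */
int subtract_reading(void)
{
    int T = 10;

    return (T) - 1;
}
```

A parser cannot choose between the two readings from the tokens
alone; it has to know what T is at that point.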


Now, I can imagine a language that makes built-in type names keywords,
but makes user-defined type names ordinary identifiers. An advantage
of this is that you can't confusingly redefine the built-in names in
an inner scope, but IMHO a better way to handle this would be to add a
special-case rule that you can't redefine predefined identifiers.


There are numerous languages outside the C lineage in which the names
of the predefined types are ordinary identifiers, typically implicitly
declared in a (real or imaginary) scope that surrounds the user's
code. See Ada's package Standard for an example of this.
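
As a rough sketch of that arrangement combined with the special-case
rule suggested above (this is not Ada's actual rule; Ada does let you
hide the names declared in Standard), the predefined type names can
simply be entered into an outermost scope before the user's code is
read, with user declarations of those names rejected. All the names
here (table, predeclare_builtin_types, declare_user and so on) are
hypothetical.

```c
#include <string.h>

enum { SCOPE_PREDEFINED, SCOPE_USER };

struct entry {
    const char *name;
    int scope;          /* SCOPE_PREDEFINED or SCOPE_USER */
};

#define MAX_ENTRIES 32

static struct entry table[MAX_ENTRIES];
static int entry_count = 0;

/* Returns 0 on success, -1 if the table is full. */
static int add_entry(const char *name, int scope)
{
    if (entry_count == MAX_ENTRIES)
        return -1;
    table[entry_count].name = name;
    table[entry_count].scope = scope;
    entry_count++;
    return 0;
}

/* Fill the scope that surrounds the user's code. */
void predeclare_builtin_types(void)
{
    entry_count = 0;
    add_entry("int", SCOPE_PREDEFINED);
    add_entry("char", SCOPE_PREDEFINED);
}

int is_predefined(const char *name)
{
    int i;

    for (i = 0; i < entry_count; i++)
        if (table[i].scope == SCOPE_PREDEFINED
            && strcmp(table[i].name, name) == 0)
            return 1;
    return 0;
}

/* The special-case rule: predefined identifiers cannot be redefined. */
int declare_user(const char *name)
{
    if (is_predefined(name))
        return -1;      /* reported as a semantic error, not a syntax error */
    return add_entry(name, SCOPE_USER);
}
```

With this scheme the scanner never needs to distinguish "int" from
any other identifier; the check is a semantic one.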


--
Keith Thompson (The_Other_Keith) kst@cts.com <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>

