Re: How to parse keywords that can be used as identifiers?

Jerry Leichter <leichter@smarts.com>
21 Aug 1996 18:56:56 -0400

          From comp.compilers

Newsgroups: comp.compilers
Organization: System Management ARTS
References: 96-08-058 96-08-067
Keywords: parse, design

> Personally, I wouldn't design such a feature [ability to use keywords
> as identifiers] in a new language. But you don't always have a
> choice.


I disagree. This is one case where people have ignored some very good
arguments by the designers of PL/I.


The designers of PL/I knew that they were producing a large language,
and they also intended to reach different groups of programmers who
would be working in different problem domains, hence most likely using
different subsets of the language. So one design goal was: A
programmer should not need to know about parts of the language that he
didn't intend to use. One concrete requirement that grew from this was
that keywords could not be reserved, since that would make every
programmer necessarily aware of every keyword in the language.


By the standards of the day, PL/I was huge. By today's standards, it
would be nothing exceptional. Ada is certainly larger by almost any
measure. The C++ language alone is larger - and that's even without
considering the standard class libraries.


Some languages - and certainly C and C++ are examples here - have tried
to avoid an explosion of keywords by simply re-using keywords in
different contexts. Consider all the different meanings of "static" in
C++, for example. While this may make the table of reserved words
smaller, it's not clear to me that it actually helps anyone in any real
sense.


A problem that I don't think the PL/I designers mentioned, but which can
also be significant, is that of extensions. If keywords are reserved,
any extension to the language that introduces a new keyword necessarily
makes some previously-legal programs illegal. Unfortunately, when it
comes time to choose a new keyword, the choice then comes down to:
Choose a meaningful word (which is more likely to be used as a variable
name *somewhere* - after all, how many different words relevant to
programming are there?), or use something like an abbreviation (making
the language harder to understand). The unpleasantness of this choice
is one of the things encouraging language designers to overload some
existing keyword!
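
A small sketch of the breakage, in Python, under stated assumptions: the reserved-word list, the sample program text, and the helper names `identifiers` and `broken_by` are all made up for illustration and do not describe any real language or tool.

```python
import re

# Anything that looks like a word: a letter or underscore, then word characters.
IDENT = re.compile(r"[A-Za-z_]\w*")

def identifiers(source, reserved):
    """Return the words in `source` that are not currently reserved."""
    return {w for w in IDENT.findall(source) if w.lower() not in reserved}

def broken_by(new_keyword, source, reserved):
    """True if reserving `new_keyword` would invalidate `source`,
    i.e. the program already uses that word as an ordinary name."""
    used = {w.lower() for w in identifiers(source, reserved)}
    return new_keyword.lower() in used

# Hypothetical reserved-word table and a program written against it.
reserved = {"if", "then", "else", "end"}
program = "if count = 0 then select = 1 end"

print(broken_by("select", program, reserved))  # meaningful word, already in use
print(broken_by("sel", program, reserved))     # abbreviation, safe but obscure
```

The meaningful word collides with an existing variable; the abbreviation does not, which is exactly the trade-off described above.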


With modern languages, much of the semantics of the language is not in
the statements themselves, but in the libraries. For typical object-
oriented languages, the descriptions of the libraries are many times the
size of the language descriptions! However, no one proposes that the
names of library classes, much less their members, be reserved! If it's
so important to reserve the keywords, why not the library names?


As PL/I long ago demonstrated, it's quite possible to define a
reasonable programming-language grammar with *no* reserved words. It's
quite true that "bad examples" like the famous:


IF IF = THEN THEN THEN = ELSE ELSE ELSE = END END


are then possible. So? Reasonable programmers won't write such things;
few programmers will use a keyword *of which they are aware* as a
variable name. But what's the big deal if they use, say, UNSPEC as a
variable name, even if an advanced PL/I programmer knows that that's a
built-in pseudo-function (or whatever it's called)?
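
To make this concrete, here is a minimal sketch (in Python rather than PL/I) of a recursive-descent parser for a toy grammar with no reserved words. The grammar is an assumption made for illustration and is not PL/I's: a statement is either `name = expr` or `IF expr THEN stmt [ELSE stmt] END`, and an expression is an operand optionally compared with another operand by `=`. The names `tokenize`, `Parser` and `parse` are hypothetical. A word counts as a keyword only where the grammar expects one, and a single token of lookahead settles the one real conflict: IF followed by `=` can only be the target of an assignment.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|=)")

def tokenize(text):
    """Split text into words, numbers and '='."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.toks = tokens
        self.i = 0

    def peek(self, k=0):
        j = self.i + k
        return self.toks[j] if j < len(self.toks) else None

    def take(self, expected=None):
        """Consume one token; if `expected` is given, it must match (case-insensitively)."""
        tok = self.peek()
        if tok is None or (expected is not None and tok.upper() != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.i += 1
        return tok

    def statement(self):
        # IF is a keyword only when it cannot be the target of an assignment.
        if (self.peek() or "").upper() == "IF" and self.peek(1) != "=":
            self.take("IF")
            cond = self.expr()
            self.take("THEN")          # THEN is a keyword only in this position
            then = self.statement()
            other = None
            if (self.peek() or "").upper() == "ELSE":
                self.take("ELSE")
                other = self.statement()
            self.take("END")
            return ("if", cond, then, other)
        target = self.name()           # any word, including IF, THEN, ELSE, END
        self.take("=")
        return ("assign", target, self.expr())

    def name(self):
        tok = self.take()
        if not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"expected a name, got {tok!r}")
        return tok

    def operand(self):
        tok = self.take()
        if tok == "=":
            raise SyntaxError("expected an operand, got '='")
        return tok

    def expr(self):
        left = self.operand()
        if self.peek() == "=":
            self.take("=")
            return ("eq", left, self.operand())
        return left

def parse(text):
    p = Parser(tokenize(text))
    tree = p.statement()
    if p.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {p.peek()!r}")
    return tree

print(parse("IF IF = THEN THEN THEN = ELSE ELSE ELSE = END END"))
# ('if', ('eq', 'IF', 'THEN'), ('assign', 'THEN', 'ELSE'), ('assign', 'ELSE', 'END'))
```

Under this toy grammar the "bad example" parses without trouble, and so does a plain `IF = THEN`, which comes out as an assignment to a variable named IF. A real PL/I front end has more cases to separate (subscripted targets, for one), so take this only as a sketch of the idea.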


I've seen programming language designers go through all sorts of
contortions when they've created language extensions that (a) need
keywords; (b) only need them in a restricted, easily parsed, context;
(c) have to be "consistent" with an underlying language in which all
keywords are reserved. Sometimes they add a bunch of new keywords,
which then promptly parse to errors anywhere except within one syntactic
construct. (Great error-checking, that....) More often, they come up
with some syntactic convention to isolate the "keywords", either
requiring them to appear as string constants, or inside of extra
brackets (thus introducing a new quoting convention into the language,
just for this context), or whatever. What's being gained here?


-- Jerry