Re: problems with identifiers and keywords...

glen herrmannsfeldt <gah@ugcs.caltech.edu>
17 Nov 2004 11:36:29 -0500

          From comp.compilers


From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Newsgroups: comp.compilers
Date: 17 Nov 2004 11:36:29 -0500
Organization: Comcast Online
References: 04-10-148 04-10-170 04-10-174 04-11-008 04-11-011 04-11-031
Keywords: syntax, parse, design, comment
Posted-Date: 17 Nov 2004 11:36:29 EST

Chris F Clark wrote:
(snip)


> In any case, I think we are in "violent agreement" on this topic.


> My point was that if the language is hard to write a parser for, it is
> probably hard for humans to parse too. That doesn't mean that there
> aren't techniques which don't fit well with current lexer/parser
> generation technology that are not easy to parse (and easy to
> understand) <more on this in a second>. It's just that if you can't
> *easily* write a mechanical way to translate it, then a human
> probably isn't going to be able to understand it easier.


I am not sure I agree with this. Human languages are fairly hard for
machines to parse, yet presumably designed for humans to understand.
In a large number of cases, I don't believe humans have problems with
keywords used as identifiers, especially if those keywords are not
used by the program in question.


How would it be for a language to allow keywords as identifiers only
when they were not used as keywords in the given program? (presumably
within the scope of said identifier.)
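
As a rough illustration of that rule, here is a minimal Python sketch for a toy language. Everything in it is an assumption made for the example: the candidate keyword set, statements separated by ";", tokens separated by blanks, and the convention that a word followed by "=" is an assignment target rather than a keyword. It only checks assignment targets, and it treats the whole program as one scope.

```python
# Hypothetical keyword set for a toy language; not taken from any real one.
CANDIDATES = {"IF", "DO", "END", "PUT"}


def statements(source):
    """Split source on ';' and return each non-empty statement as a word list."""
    return [s.split() for s in source.split(";") if s.strip()]


def is_assignment(words):
    """Toy rule: a statement whose second token is '=' is an assignment."""
    return len(words) > 1 and words[1] == "="


def keywords_in_use(source):
    """Pass 1: collect candidate keywords that actually begin a statement."""
    used = set()
    for words in statements(source):
        head = words[0].upper()
        if head in CANDIDATES and not is_assignment(words):
            used.add(head)
    return used


def conflicts(source):
    """Pass 2: report assignment targets that are also used as keywords."""
    used = keywords_in_use(source)
    clashes = set()
    for words in statements(source):
        if is_assignment(words) and words[0].upper() in used:
            clashes.add(words[0].upper())
    return sorted(clashes)


# END is never used as a keyword here, so it is free to be a variable.
print(conflicts("END = 3 ; PUT END"))       # []
# IF is used both ways in the same program, so the rule rejects it.
print(conflicts("IF = 1 ; IF x ; PUT x"))   # ['IF']
```

The price is a second pass (or deferred classification) over the program, since whether a word is an identifier depends on text that may come later.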


PL/I sort of does this with functions and the BUILTIN attribute. Not
for statement keywords, though.


The C language has relatively few keywords, but all the library
function names are reserved.


(snip)


> For example, SGML put markup in <> delimiters (whence HTML and now XML
> does the same). However, SGML recognized that < and > might be useful
> in text and allowed one to use some notation I forget to change the
> delimiters. Most current lexer/parser generators cannot deal with
> that level of dynamism.


TeX can, and reasonably often does. One has to be very careful,
though. Knuth describes TeX using the mouth and stomach analogy.
Some things are processed in the mouth, including applying catcodes to
characters, and some in the stomach, I believe including changes to
catcodes. Lookahead is carefully limited such that changes can be
made before being used, but one must be very careful.
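
To make the idea of input that changes its own delimiters concrete, here is a small hand-written scanner in Python. It is neither TeX nor SGML: the "!delims" directive is invented for this sketch, and the scanner only keeps one pair of single-character delimiters. The point it tries to show is that the change is applied before the next character is examined, so no lookahead is done under the old rules.

```python
def tokenize(text):
    """Split text into ('markup', body) and ('text', body) tokens.

    A markup item of the form '!delims X Y' (a made-up directive) switches
    the open/close delimiters to X and Y for all following input.
    """
    open_ch, close_ch = "<", ">"
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos] == open_ch:
            end = text.find(close_ch, pos + 1)
            if end == -1:
                raise ValueError("unterminated markup at position %d" % pos)
            body = text[pos + 1:end]
            pos = end + 1
            parts = body.split()
            if (len(parts) == 3 and parts[0] == "!delims"
                    and len(parts[1]) == 1 and len(parts[2]) == 1):
                # Takes effect immediately, before the next character is read.
                open_ch, close_ch = parts[1], parts[2]
            else:
                tokens.append(("markup", body))
        else:
            end = text.find(open_ch, pos)
            if end == -1:
                end = len(text)
            tokens.append(("text", text[pos:end]))
            pos = end
    return tokens


# After the directive, '<' is ordinary text and '[' ... ']' is markup.
print(tokenize("<b>x<!delims [ ]> 1 < 2 [i]"))
# [('markup', 'b'), ('text', 'x'), ('text', ' 1 < 2 '), ('markup', 'i')]
```

A table-driven generated lexer would have fixed the meaning of "<" when the tables were built, which is the limitation being discussed.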


(snip)


> Going back to positive cases, the "length prefix" notation, i.e. a
> number followed by that many "characters" of data is something else
> that most current lexer/parser generators don't do well on.


That is the old Fortran Hollerith FORMAT descriptor, which was
sometimes also used for character constants. It is easy for humans to
get the count wrong and end up with a mess.
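
A short Python sketch of reading an nH item shows both why a hand-written scanner handles it trivially and what a wrong count does. The function name and the sample strings are made up for the illustration; this is not how any particular Fortran compiler is written.

```python
DIGITS = "0123456789"


def read_hollerith(text, pos=0):
    """Read one <count>H<chars> item starting at text[pos].

    Returns (payload, next_pos). Raises ValueError on malformed input.
    """
    start = pos
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1
    if pos == start or pos >= len(text) or text[pos] not in "Hh":
        raise ValueError("expected <count>H at position %d" % start)
    count = int(text[start:pos])
    pos += 1  # skip the H
    payload = text[pos:pos + count]
    if len(payload) < count:
        raise ValueError("count %d runs past the end of the input" % count)
    return payload, pos + count


# Correct count: the payload is exactly the five characters after the H.
print(read_hollerith("5HHELLO,I5"))   # ('HELLO', 7)
# Count too small: the final O is left behind to be read as format text.
print(read_hollerith("4HHELLO,I5"))   # ('HELL', 6)
# Count too large: the comma and the start of the next item are swallowed.
print(read_hollerith("7HHELLO,I5"))   # ('HELLO,I', 9)
```

Neither of the miscounted cases is an error at this level; the damage only shows up later, which is the mess referred to above.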


(snip)


> The keyword-as-identifier feature was not the stumbling block to
> writing easy-to-maintain and error-free PL/I programs (implicit
> conversions were the "silently do the wrong thing" issue).


Conversions are a problem in PL/I, but I am not sure that there is a
good solution. PL/I has a large variety of data types, and in many
cases the conversion makes sense. If one is used to a language with
only a few data types (just about every other one) it takes a while to
get used to, and one can easily make mistakes. If powerful features
were removed from languages because they can easily be misused by
beginners, what would we have left in usable languages?


If there were an obfuscated PL/I contest, like the IOCCC, maybe we
would be able to test out some of these questions.


Using CHAR variables in DO loops can be lots of fun, for example.
(The DO comparison is done as characters, not as numbers.) Maybe a
compiler warning along the lines of "Are you sure about this?" would
help.
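
The difference between the two orderings can be seen in a few lines of Python. This is only an analogy: it uses Python's string comparison and str(), and does not model PL/I's conversion or blank-padding rules. The function name and the step of 1 are assumptions for the example.

```python
def do_loop(start, stop, compare_as_text):
    """Mimic DO I = start TO stop (step 1), returning the values visited.

    With compare_as_text=True the termination test compares the decimal
    text of the values, character by character, instead of the numbers.
    """
    values = []
    i = start
    while True:
        if compare_as_text:
            done = str(i) > str(stop)
        else:
            done = i > stop
        if done:
            break
        values.append(i)
        i += 1
    return values


print(do_loop(1, 10, compare_as_text=False))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# As text, "2" sorts after "10", so the loop stops after one iteration.
print(do_loop(1, 10, compare_as_text=True))   # [1]
```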


-- glen
[PL/I suffers from a lot of features that individually make sense but
do silly things in combination. My favorite example is this one:


DCL (A,B,C) CHAR(3);
A = '123'; B = '456'; C = A+B;


The value of C is three spaces, because the arithmetic result is
converted to a default-length string, with 579 right-justified behind
leading blanks, and then truncated on the right, so only the leading
blanks survive. I don't know how you prevent this, either, although
in this case I have my doubts about the wisdom of implicit conversions
between arithmetic and string types. -John]
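
The steps in the example above can be imitated in Python to see where the digits go. This is a sketch under stated assumptions, not a PL/I implementation: the width of 18 for the intermediate string is an assumed default (the only thing that matters here is that it is wider than 3 and right-justified), and the helper names are invented.

```python
def assign_char(value, width):
    """Assign to a fixed-length CHAR(width): truncate or blank-pad on the right."""
    return value[:width].ljust(width)


def add_char_operands(a, b, result_width=18):
    """Convert both operands to numbers, add, and convert the sum back to a
    right-justified string. result_width=18 is an assumption for this sketch."""
    total = int(a) + int(b)
    return str(total).rjust(result_width)


a = assign_char("123", 3)
b = assign_char("456", 3)
intermediate = add_char_operands(a, b)   # blanks followed by 579
c = assign_char(intermediate, 3)         # keeps only the leftmost 3 characters

print(len(intermediate))  # 18
print(repr(c))            # '   '
```

Each step is reasonable in isolation: right-justifying numbers, and truncating strings on the right. It is only the combination that throws away every significant digit.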

