UCS Identifiers and compilers

wclodius@los-alamos.net (William Clodius)
Wed, 10 Dec 2008 22:16:23 -0700

          From comp.compilers

From: wclodius@los-alamos.net (William Clodius)
Newsgroups: comp.compilers
Date: Wed, 10 Dec 2008 22:16:23 -0700
Organization: Compilers Central
Keywords: i18n, question, design
Posted-Date: 11 Dec 2008 04:26:35 EST

As a hobby I have started work on a language design, and one of the
issues that has come to concern me is how incorporating UCS/Unicode into
the language, particularly in identifiers, affects its usefulness and
the complexity of its implementation. Most languages these days seem to
be trying to exploit UCS, including C, C++, Ada, Haskell, and Scheme,
although there are at least a few holdouts such as Fortran. As posters
to this newsgroup include users, implementors, and language designers
with a bit more contact with the outside world, I would like responses
to the following questions:


1. Do many of your users make use of letters outside the ASCII/Latin-1
sets?


2. What are the most useful development environments in terms of dealing
with extended character sets?


3. Visually, how well do alternative character sets mesh with a language
with ASCII keywords and the left-to-right, top-to-bottom display typical
of most programming languages? E.g., how well do scripts with ideographs,
context-dependent glyphs for the same character, and alternative spatial
ordering work, or character sets whose glyphs resemble those used for
ASCII (the l vs. 1 and O vs. 0 problem multiplied)?


4. How does the incorporation of the larger character sets affect your
lexical analysis? Is hash table efficiency affected? Do you have to deal
with case/accent independence and if so how useful are the UCS
recommendations for languages?
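
To make question 4 concrete, here is a minimal sketch in Python of the
kind of work I have in mind. It is an illustration under my own
assumptions, not a recommendation: the helper names
(canonical_identifier, strip_accents) and the toy symbol table are
hypothetical, and the choice of NFKC normalization followed by case
folding is only one possible reading of the UCS recommendations. It
relies solely on the standard unicodedata module and str.isidentifier.

```python
import unicodedata


def canonical_identifier(name: str) -> str:
    """Return a comparison key for an identifier (hypothetical helper).

    NFKC merges compatibility forms and composes base letter + accent;
    casefold() then removes case distinctions. Normalizing once more
    keeps the key stable when case folding produces new sequences.
    """
    folded = unicodedata.normalize("NFKC", name).casefold()
    return unicodedata.normalize("NFKC", folded)


def strip_accents(name: str) -> str:
    """Drop nonspacing marks, for an accent-independent comparison."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed
                   if unicodedata.category(ch) != "Mn")


# A toy symbol table: the hash key is the canonical form, so the cost of
# normalization is paid once per token, before hashing.
symbols = {}
for spelling in ("Caf\u00e9", "CAFE\u0301", "\ufb01le", "FILE"):
    symbols.setdefault(canonical_identifier(spelling), []).append(spelling)

# Look-alike letters from different scripts remain distinct keys:
latin_a, cyrillic_a = "a", "\u0430"
distinct = canonical_identifier(latin_a) != canonical_identifier(cyrillic_a)
```

With case independence, the four spellings above collapse to two table
entries, while the Latin and Cyrillic "a" stay separate even though they
usually render identically, which is the l vs. 1 problem in another
guise. Whether a lexer should fold at all, and which normalization form
it should use, is exactly what I am asking about.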


