Re: UCS Identifiers and compilers

Marco van de Voort <marcov@stack.nl>
Thu, 11 Dec 2008 17:57:44 +0000 (UTC)

          From comp.compilers


From: Marco van de Voort <marcov@stack.nl>
Newsgroups: comp.compilers
Date: Thu, 11 Dec 2008 17:57:44 +0000 (UTC)
Organization: Stack Usenet News Service
References: 08-12-061
Keywords: i18n
Posted-Date: 12 Dec 2008 10:20:46 EST

On 2008-12-11, William Clodius <wclodius@los-alamos.net> wrote:
> 1. Do many of your users make use of letters outside the ASCII/Latin-1
> sets?


For what? Literals and comments, or identifiers as well? IMHO that is
a big difference.


With FPC (Free Pascal), afaik we currently support the former but not
the latter. There is some demand for the latter too (usually based on
some equality argument), but it is very hard to gauge how big this
demand is, and how serious.


The project has no objection to such an effort (except that such a big
transition should be treated as an extension for a major version), but
is waiting for somebody with enough interest to do the work.


++++


(2 and 3 skipped, since I'm not a user of non-ASCII identifiers, and the
only implementation I know is the Delphi 2009 demo, so I couldn't
compare either way.)


> 4. How does the incorporation of the larger character sets affect your
> lexical analysis? Is hash table efficiency affected? Do you have to deal
> with case/accent independence and if so how useful are the UCS
> recommendations for languages?


It isn't affected, since we don't support larger character sets for
anything that requires comparisons. We only allow setting an encoding
for literals. The encoding can be a classic codepage or UTF-8. There is
some support for skipping BOMs, and


A handful of codepage (to Unicode) tables are built in; additional
ones can be loaded. The compiler mostly works with a few minimal UTF-8
encoding/decoding routines.
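
To give an idea of what that amounts to, here is a rough sketch in
Python (not FPC's actual code, and every name in it is made up for
illustration): skip a leading UTF-8 BOM, map the high bytes of a
codepage-encoded literal through a lookup table, and encode the
resulting code points as UTF-8 by hand. The table is only a three-entry
excerpt of CP437; a real one would cover all 128 high bytes.

```python
# Hypothetical sketch; none of these names come from FPC.

UTF8_BOM = b"\xef\xbb\xbf"

# Small excerpt of a codepage-to-Unicode table (CP437 values).
# Assumption: a full table would map every byte in 0x80..0xFF.
CP437_EXCERPT = {0x81: 0x00FC, 0x82: 0x00E9, 0x84: 0x00E4}


def skip_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 byte order mark, if present."""
    return data[len(UTF8_BOM):] if data.startswith(UTF8_BOM) else data


def encode_utf8(cp: int) -> bytes:
    """Encode one code point as UTF-8 (no surrogate or range checks)."""
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def literal_to_utf8(raw: bytes, table: dict) -> bytes:
    """Convert a codepage-encoded literal to UTF-8 via a lookup table.

    Bytes below 0x80 are taken as ASCII; a high byte missing from the
    table raises KeyError.
    """
    out = bytearray()
    for b in raw:
        cp = b if b < 0x80 else table[b]
        out += encode_utf8(cp)
    return bytes(out)
```

With this, literal_to_utf8(b"caf\x82", CP437_EXCERPT) gives the UTF-8
bytes for "café". Note that nothing here compares or case-folds
strings, which is why literals are so much cheaper to support than
identifiers.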

