Related articles |
---|
Cmajor 1.4.0 released seppo.laakko@pp.inet.fi (Seppo Laakko) (2016-03-13) |
Re: Cmajor 1.4.0 released aaronngray@gmail.com (Aaron Gray) (2016-03-25) |
Re: Cmajor 1.4.0 released seppo.laakko@pp.inet.fi (Seppo Laakko) (2016-03-29) |
From: | "Seppo Laakko" <seppo.laakko@pp.inet.fi> |
Newsgroups: | comp.compilers |
Date: | Tue, 29 Mar 2016 15:24:20 +0300 |
Organization: | Compilers Central |
References: | 16-03-002 16-03-008 |
Injection-Info: | miucha.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="29868"; mail-complaints-to="abuse@iecc.com" |
Keywords: | tools, i18n, C |
Posted-Date: | 30 Mar 2016 20:02:48 EDT |
It does not have complete Unicode support. I should have written "some support for
Unicode".
It is assumed that the encoding of Cmajor source files is UTF-8, so the
string literals are in UTF-8 encoding.
On Windows, the UTF-8 string literals are converted to UTF-16 for output.
On Linux, the string literals are output as is, because a UTF-8 locale can
handle them correctly.
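As a rough illustration of the Windows conversion step, here is a minimal sketch in Python (not Cmajor). The function name utf8_to_utf16_units is hypothetical and is not part of the Cmajor library; it only shows what turning UTF-8 bytes into 16-bit code units involves, including surrogate pairs for characters outside the Basic Multilingual Plane.

```python
def utf8_to_utf16_units(data: bytes) -> list[int]:
    """Decode UTF-8 bytes and return the equivalent UTF-16 code units.

    Hypothetical helper for illustration only; Cmajor's runtime may do
    this conversion differently.
    """
    text = data.decode("utf-8")
    # Little-endian without a BOM, so every 2 bytes form one code unit.
    encoded = text.encode("utf-16-le")
    return [
        int.from_bytes(encoded[i:i + 2], "little")
        for i in range(0, len(encoded), 2)
    ]


# U+00E4 takes two bytes in UTF-8 but a single 16-bit unit in UTF-16.
print([hex(u) for u in utf8_to_utf16_units("\u00e4".encode("utf-8"))])
# U+1F600 lies outside the BMP, so UTF-16 needs a surrogate pair.
print([hex(u) for u in utf8_to_utf16_units("\U0001F600".encode("utf-8"))])
```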
There are three character types: char (8-bit unsigned), wchar (16-bit
unsigned) and uchar (32-bit unsigned) for storing 8-bit, UTF-16 and UTF-32
character values respectively.
There are also three string types:
string, a typedef for System.String<char> for representing ASCII and UTF-8 strings,
wstring, a typedef for System.String<wchar> for representing UTF-16 strings and
ustring, a typedef for System.String<uchar> for representing UTF-32 strings.
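To show why three element widths are useful, the following Python sketch (again not Cmajor) counts how many 8-bit, 16-bit and 32-bit code units the same text needs. The name code_unit_counts is an assumption made for this example; the counts would correspond to the lengths of a string, wstring and ustring holding the same text, assuming those types store one code unit per element.

```python
def code_unit_counts(text: str) -> dict[str, int]:
    """Return the number of code units the text needs in each encoding.

    Hypothetical helper: the keys stand in for string (UTF-8),
    wstring (UTF-16) and ustring (UTF-32).
    """
    return {
        "utf8": len(text.encode("utf-8")),            # 1 byte per unit
        "utf16": len(text.encode("utf-16-le")) // 2,  # 2 bytes per unit
        "utf32": len(text.encode("utf-32-le")) // 4,  # 4 bytes per unit
    }


for sample in ("A", "\u00e4", "\U0001F600"):
    print(hex(ord(sample)), code_unit_counts(sample))
```

Only the 32-bit form has exactly one unit per code point for every character, which is why a per-character function such as IsLower(uchar c) takes the 32-bit type.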
The character classification functions in the System.Unicode namespace, such as
IsLower(uchar c), use Unicode character information.
The string conversion functions ToUpper and ToLower also use Unicode character
information.
For example, ToUpper converts Finnish characters like
LATIN SMALL LETTER A WITH DIAERESIS, 0xE4 to uppercase
LATIN CAPITAL LETTER A WITH DIAERESIS, 0xC4.
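The same behaviour can be checked against the Unicode character database with a short Python sketch. Python's str.upper, str.islower and unicodedata.name stand in here for Cmajor's ToUpper and IsLower; the helper upper_with_names is hypothetical and exists only for this example.

```python
import unicodedata


def upper_with_names(ch: str) -> tuple[str, str, str]:
    """Return the uppercase form of ch plus the Unicode names of both.

    Illustrative stand-in for a Unicode-aware ToUpper; not Cmajor code.
    """
    upper = ch.upper()
    return upper, unicodedata.name(ch), unicodedata.name(upper)


small = "\u00e4"
upper, small_name, upper_name = upper_with_names(small)
print(small.islower(), hex(ord(small)), small_name)
print(upper.isupper(), hex(ord(upper)), upper_name)
```

An ASCII-only conversion would leave 0xE4 unchanged, so this case is a quick test of whether case mapping really consults Unicode data.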
Seppo
-----Original Message-----
From: Aaron Gray
Sent: Friday, March 25, 2016 8:02 PM
Newsgroups: comp.compilers
Subject: Re: Cmajor 1.4.0 released
On Tuesday, 15 March 2016 12:10:09 UTC, Seppo Laakko wrote:
> New in this release:
> o Unicode support.
How good is this Unicode support? There are lots of incomplete Unicode
libraries. Or is this just character literal support?
Aaron