From: | henry@spsystems.net (Henry Spencer) |
Newsgroups: | comp.compilers |
Date: | 27 Oct 2005 23:24:35 -0400 |
Organization: | SP Systems, Toronto, Canada |
References: | 05-10-085 05-10-122 05-10-146 05-10-173 |
Keywords: | i18n, comment |
Posted-Date: | 27 Oct 2005 23:24:35 EDT |
Oliver Wong <owong@castortech.com> wrote:
>...It's not too bad to memorize an alphabet. With English, that's
>only 52 characters (you have to learn both the uppercase and lowercase
>version of every character, as they differ significantly)...
The number is actually a bit higher than that, because there are a few
letters which vary in basic shape from font to font; English speakers
are so used to this that they seldom notice it, but newcomers have to
learn the variations as separate forms. In italics, "f" grows a tail,
"Q"'s tail grows to almost an underline, and "a" is a completely
different shape (loop with a slight tail, rather than low loop with a
roof over it); in Helvetica and a lot of other sans-serif fonts, "g"'s
tail is a line rather than a loop. Printer and terminal(-emulation)
fonts pick one or the other almost at random.
(It's easy to dismiss the changes in tails in particular as trivial, but
differences between letters often are no bigger -- "j" is just "i" with
a tail, for example.)
>Even the Japanese Katakana alphabet has around 100 characters...
> Incidentally, the Japanese Katakana alphabet has a completely
>unambiguous pronunciation: Each character represents one syllable...
The two facts are, of course, connected: you need more characters to
give each syllable a unique one.
By the way, a linguistic nitpick: technically an alphabet is a writing
system with (approximately) one character per sound, like the English one,
and Katakana is a syllabary, not an alphabet. People who independently
invent writing systems generally invent either syllabaries or ideographic
systems; it appears that the alphabet concept was invented just once, by
some obscure neighbors of the Phoenicians, and all other alphabets derive
at least inspiration from theirs.
--
spsystems.net is temporarily off the air; | Henry Spencer
mail to henry at zoo.utoronto.ca instead. | henry@spsystems.net
[Fascinating though this discussion is, it's veered away from compilers, so
unless someone can tell us about compiling kana into kanji or something,
this thread is at an end. -John]