Re: compiler for Chinese development language

haberg@math.su.se (Hans Aberg)
14 Oct 2005 17:23:19 -0400

          From comp.compilers


From: haberg@math.su.se (Hans Aberg)
Newsgroups: comp.compilers
Date: 14 Oct 2005 17:23:19 -0400
Organization: Mathematics
References: 05-10-085
Keywords: i18n

  "gentlezhao" <gentlezhao@126.com> wrote:


> Hi, I am a student from China. I want to design a compiler for a Chinese
> development language on Linux, but I don't know whether it is
> feasible. I need some suggestions.


> [I don't see why not. Unicode support is getting pretty good, and once you're
> past the lexical stage, the source character set doesn't affect the language.
> -John]


The lexical stage can be handled in Flex if one feeds it UTF-8 files
and writes out the character rule matches explicitly using a UTF-8
editor. (I have started to do this with math characters.) For character
classes, one can translate Unicode character classes into Flex regular
expressions that match the corresponding UTF-8 byte sequences. I posted
some such conversion functions to the Flex mailing list. The same
technique should be applicable to just about any standard lexer
generator that works on 8-bit bytes.
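
To illustrate the character class translation, here is a minimal sketch
in Python. It is not the conversion code posted to the Flex list; the
function names utf8_ranges and flex_pattern are hypothetical, and the
range-splitting approach is one common way to do this, assumed here for
illustration. It turns a range of Unicode code points into a byte-level
pattern using only \xHH escapes and byte ranges, which an 8-bit lexer
generator can match against UTF-8 input.

```python
# Sketch: translate a Unicode code point range into a byte-level regular
# expression that matches the UTF-8 encoding of exactly those code points.
# All names here are illustrative assumptions, not part of Flex.

# Code point blocks that share one UTF-8 encoding length; the surrogate
# gap U+D800..U+DFFF is left out because it has no valid UTF-8 encoding.
_BLOCKS = ((0x0, 0x7F), (0x80, 0x7FF), (0x800, 0xD7FF),
           (0xE000, 0xFFFF), (0x10000, 0x10FFFF))


def _split(a, b, out):
    """Append byte-range sequences covering [a, b] (same encoding length)."""
    if b <= 0x7F:
        out.append([(a, b)])  # one-byte (ASCII) range
        return
    for i in range(1, 4):
        mask = (1 << (6 * i)) - 1  # payload bits of the last i bytes
        if (a & ~mask) != (b & ~mask):
            if a & mask:  # a does not start at a block boundary
                _split(a, a | mask, out)
                _split((a | mask) + 1, b, out)
                return
            if (b & mask) != mask:  # b does not end at a block boundary
                _split(a, (b & ~mask) - 1, out)
                _split(b & ~mask, b, out)
                return
    # Now each byte position varies independently between its two bounds.
    out.append(list(zip(chr(a).encode("utf-8"), chr(b).encode("utf-8"))))


def utf8_ranges(lo, hi):
    """Return a list of sequences of (low_byte, high_byte) pairs."""
    out = []
    for start, end in _BLOCKS:
        a, b = max(lo, start), min(hi, end)
        if a <= b:
            _split(a, b, out)
    return out


def flex_pattern(lo, hi):
    """Format the byte ranges as alternatives using \\xHH escapes."""
    alternatives = []
    for seq in utf8_ranges(lo, hi):
        parts = []
        for x, y in seq:
            if x == y:
                parts.append("\\x%02X" % x)
            else:
                parts.append("[\\x%02X-\\x%02X]" % (x, y))
        alternatives.append("".join(parts))
    return "|".join(alternatives)


if __name__ == "__main__":
    # CJK Unified Ideographs block, U+4E00..U+9FFF
    print(flex_pattern(0x4E00, 0x9FFF))
```

For the block U+4E00..U+9FFF this prints
\xE4[\xB8-\xBF][\x80-\xBF]|[\xE5-\xE9][\x80-\xBF][\x80-\xBF], which could
then be pasted into a lexer definition as a named pattern. A full Unicode
character class would be the union of many such ranges.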


--
    Hans Aberg

