Re: looking for Lex/Bison unicode support

Dennis Ritchie <dmr@bell-labs.com>
21 Jan 2000 00:42:00 -0500

          From comp.compilers

Related articles
looking for Lex/Bison unicode support porky72@hotmail.com (Yaron Bracha) (2000-01-19)
Re: looking for Lex/Bison unicode support qjackson@wave.home.com (Quinn Tyler Jackson) (2000-01-21)
Re: looking for Lex/Bison unicode support dmr@bell-labs.com (Dennis Ritchie) (2000-01-21)
Re: looking for Lex/Bison unicode support chet@watson.ibm.com (2000-01-23)
Re: looking for Lex/Bison unicode support webid@asi.fr (Armel) (2000-02-04)

From: Dennis Ritchie <dmr@bell-labs.com>
Newsgroups: comp.compilers
Date: 21 Jan 2000 00:42:00 -0500
Organization: Lucent Technologies, Columbus, Ohio
References: 00-01-081
Keywords: lex, i18n

Yaron Bracha asked:
>
> Does anybody know of flex/bison-compatible parsing tools that support
> Unicode and generate C++ code?...
>
And the moderator remarked:
> [Yacc and its clones parse tokens, not characters, so they shouldn't
> be a problem, give or take nits like passing through non-ASCII strings
> in C action routines correctly. Lex or flex is harder since all of
> the implementations I know of use the character codes as indexes into
> tables to implement the lex state machine. But if you do that for
> Unicode, you'll have 64K entry tables rather than 256 entry tables and
> severe program bloat. I believe that plan 9 has a Unicode lex,
> presumably with some hackery to keep the table sizes down. -John]
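The table-size concern above can be made concrete. What follows is a
minimal sketch, not taken from lex, flex, or Plan 9, of one common way to
keep tables small: map each 16-bit code point to a small character class
through a two-stage lookup, and index the state machine by class instead
of by raw character code. Every name here (page_of, pages, char_class,
is_identifier) is hypothetical, and treating all of U+0400..U+04FF as
letters is a deliberate simplification.

```c
#include <stddef.h>
#include <stdint.h>

/* Character classes and DFA states for a toy "identifier" recognizer. */
enum { CLS_OTHER, CLS_LETTER, CLS_DIGIT, NCLASS };
enum { S_START, S_IDENT, S_REJECT, NSTATE };

/* Stage 1: one page number for each 256-code-point block of U+0000..U+FFFF. */
static uint8_t page_of[256];
/* Stage 2: class of each code point within a page. Blocks that classify
 * identically share one page: page 0 is all CLS_OTHER (zero-initialized),
 * page 1 is the ASCII/Latin-1 block, page 2 is all CLS_LETTER. */
static uint8_t pages[3][256];

/* Transition table indexed by [state][class], not [state][code point]. */
static const uint8_t next_state[NSTATE][NCLASS] = {
    /*            OTHER     LETTER    DIGIT    */
    /* START  */ { S_REJECT, S_IDENT,  S_REJECT },
    /* IDENT  */ { S_REJECT, S_IDENT,  S_IDENT  },
    /* REJECT */ { S_REJECT, S_REJECT, S_REJECT },
};

/* Fill the two-stage class tables; call once before scanning. */
static void init_tables(void)
{
    int c;

    for (c = 'a'; c <= 'z'; c++) pages[1][c] = CLS_LETTER;
    for (c = 'A'; c <= 'Z'; c++) pages[1][c] = CLS_LETTER;
    for (c = '0'; c <= '9'; c++) pages[1][c] = CLS_DIGIT;
    pages[1]['_'] = CLS_LETTER;

    for (c = 0; c < 256; c++) pages[2][c] = CLS_LETTER;

    page_of[0x00] = 1;  /* U+0000..U+00FF */
    page_of[0x04] = 2;  /* U+0400..U+04FF, assumed all letters (simplified) */
    /* every other block keeps page 0, i.e. CLS_OTHER */
}

/* Two array lookups replace one 64K-entry row per state. */
static int char_class(uint16_t cp)
{
    return pages[page_of[cp >> 8]][cp & 0xFF];
}

/* Return 1 if s[0..n-1] is a letter followed by letters or digits. */
int is_identifier(const uint16_t *s, size_t n)
{
    int state = S_START;
    size_t i;

    for (i = 0; i < n; i++)
        state = next_state[state][char_class(s[i])];
    return state == S_IDENT;
}
```

After one call to init_tables(), is_identifier() can be applied to any
array of 16-bit code points. In this sketch the class tables occupy
256 + 3*256 = 1024 bytes in total, and each state costs NCLASS bytes
rather than 65536; a real generator would have to compute the classes
from the patterns instead of hard-coding them.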


Plan 9 Unicode lex: no, I'm afraid, for a reason related to the one
John mentioned: no one mustered the energy to be clever about this
program. The BUGS section (even now) says:
    Cannot handle UTF.
    The asteroid to kill this dinosaur is still in orbit.


Dennis




