Re: Buffered input for a lexer?

Ray Dillinger <bear@sonic.net>
10 Apr 2002 00:12:30 -0400

          From comp.compilers


From: Ray Dillinger <bear@sonic.net>
Newsgroups: comp.compilers
Date: 10 Apr 2002 00:12:30 -0400
Organization: Compilers Central
References: 02-03-162
Keywords: lex
Posted-Date: 10 Apr 2002 00:12:30 EDT

Chris Lattner wrote:
>
> Are there any well known techniques that are useful to provide buffered input
> for a lexer, without imposing arbitrary restrictions on token size?


Well, there is the possibility of dynamic buffer resizing.
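
A minimal sketch of what dynamic resizing could look like in C. This is
not flex's implementation. The LexBuf type and the lexbuf_* functions are
hypothetical names invented for illustration, and the sketch assumes the
lexer marks the start of each token before scanning it. When the buffer
runs dry, the in-progress token is slid to the front. If the token alone
fills the buffer, the buffer is doubled rather than the token rejected.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable lexer buffer; every name here is illustrative. */
typedef struct {
    FILE  *in;     /* input stream */
    char  *data;   /* heap buffer holding the current token and lookahead */
    size_t cap;    /* allocated size of data */
    size_t start;  /* index where the current token begins */
    size_t pos;    /* index of the next character to scan */
    size_t len;    /* number of valid bytes in data */
} LexBuf;

/* Returns 0 on success, -1 if the initial allocation fails. */
static int lexbuf_init(LexBuf *b, FILE *in, size_t cap) {
    assert(cap > 0);               /* doubling a zero capacity would never grow */
    b->in = in;
    b->data = malloc(cap);
    b->cap = cap;
    b->start = b->pos = b->len = 0;
    return b->data ? 0 : -1;
}

/* Make room and read more input.
   Returns bytes read, 0 at end of input, -1 on allocation failure. */
static long lexbuf_fill(LexBuf *b) {
    /* Slide the in-progress token to the front to reclaim consumed space. */
    if (b->start > 0) {
        memmove(b->data, b->data + b->start, b->len - b->start);
        b->pos -= b->start;
        b->len -= b->start;
        b->start = 0;
    }
    /* The token alone fills the buffer: double it instead of failing. */
    if (b->len == b->cap) {
        char *p = realloc(b->data, b->cap * 2);
        if (!p) return -1;
        b->data = p;
        b->cap *= 2;
    }
    size_t n = fread(b->data + b->len, 1, b->cap - b->len, b->in);
    b->len += n;
    return (long)n;
}

/* Next input character, or EOF when input is exhausted or growth fails. */
static int lexbuf_getc(LexBuf *b) {
    if (b->pos == b->len && lexbuf_fill(b) <= 0) return EOF;
    return (unsigned char)b->data[b->pos++];
}

/* Begin a new token at the current scan position. */
static void lexbuf_mark(LexBuf *b) { b->start = b->pos; }
```

A scanner built on this would call lexbuf_mark() at the start of each
token and lexbuf_getc() for each character. The token text is then the
range data[start] up to data[pos]. Because realloc() may move the buffer,
the sketch keeps indices rather than pointers into it.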


> [Flex uses a pair of large input buffers, 16K by default, with each token
> having to be smaller than a buffer. For anything vaguely resembling a
> programming language, I'd think a 16K token limit wouldn't be a problem....]


Just as a pestiferous hypothetical counterexample, what if there were
a language that allowed literal image, animation, and sound values as
immediate inline constants? You would need some kind of "aware"
editor that didn't attempt to display them as text, but conceptually,
writing an image of how a particular dialog box is supposed to look in
the source code is no different from writing the number 3. They are
just constants. One gets a one-byte representation, and the other
requires more than one byte.


There are no extant examples of languages so cavalier with
multi-kilobyte constants yet, but I expect it in the next five years.
Having them be immediate, instead of keeping a resource library around
and referring to them by number or whatever, would just make them easier
to work with.


Meanwhile, and as others here have pointed out, C++ name manglers, in
the presence of macros that expand into class definitions, can
blithely create useless 10-kilobyte identifiers.


Bear
[If there were such a language, I think I'd write my lexer with some
special case code to suck up the big bags o' bits. There are plenty
of ways to read such stuff in pieces and then glue them together. -John]
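
One way the "read in pieces and glue together" idea might be sketched in
C. The assumptions here are not from the thread: the literal is taken to
be text-encoded and to run up to a terminator character, and read_blob
and PIECE are hypothetical names. The lexer would call this when it sees
the opening delimiter, bypassing its normal token buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { PIECE = 4096 };  /* size of each piece; an arbitrary choice */

/* Read bytes from `in` up to (and consuming) the terminator `end`,
   collecting them piece by piece into one heap block.
   On success returns 0 and sets *out and *out_len (*out is NULL for an
   empty literal). Returns -1 on allocation failure or if the input ends
   before the terminator; nothing is left allocated in that case. */
static int read_blob(FILE *in, int end, char **out, size_t *out_len) {
    char piece[PIECE];
    char *blob = NULL;
    size_t len = 0, used = 0;

    for (;;) {
        int c = getc(in);
        if (c == EOF) { free(blob); return -1; }   /* unterminated literal */
        if (c != end && used < PIECE) {            /* room left in this piece */
            piece[used++] = (char)c;
            continue;
        }
        /* Piece is full or the terminator arrived: glue it onto the blob. */
        if (used > 0) {
            char *p = realloc(blob, len + used);
            if (!p) { free(blob); return -1; }
            blob = p;
            memcpy(blob + len, piece, used);
            len += used;
            used = 0;
        }
        if (c == end) break;
        piece[used++] = (char)c;                   /* first byte of next piece */
    }
    *out = blob;
    *out_len = len;
    return 0;
}
```

The caller owns the returned block and frees it with free(). Ordinary
tokens keep going through the fixed-size buffer, so only the oversized
literals pay for the extra allocation.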


