Input-driven lexical scanner

olaf@bigred.inka.de (Olaf Titz)
18 Dec 2000 00:46:32 -0500

          From comp.compilers


From: olaf@bigred.inka.de (Olaf Titz)
Newsgroups: comp.compilers
Date: 18 Dec 2000 00:46:32 -0500
Organization: Private site, southern Germany
Keywords: question, lex, comment
Posted-Date: 18 Dec 2000 00:46:32 EST

The control/data flow of all lexical scanner generators I know goes
like this: when the scanner needs more input, it calls a routine which
provides the input and doesn't return until enough data is available.
Usually this ends up in fgets() or read() somehow.
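
As an illustration of that pull-style control flow, here is a minimal hand-written sketch in C. It is not output from any generator; the name pull_next_word and the "a word is a run of letters" token rule are assumptions made only for this example. The point is that the scanner itself calls getc(), so it cannot return until the input source has delivered enough data.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Pull-style scanner (hypothetical example): reads the next word from `in`
   into `out` and returns its length, or 0 at end of input.
   Every getc() call is a point where the scanner may block waiting for data.
   Words longer than cap-1 bytes are silently truncated. */
size_t pull_next_word(FILE *in, char *out, size_t cap)
{
    int c;
    size_t n = 0;

    assert(in != NULL && out != NULL && cap > 0);

    /* Skip separators; the scanner drives the input, not the caller. */
    while ((c = getc(in)) != EOF && !isalpha(c))
        ;

    /* Collect letters until the word ends. */
    while (c != EOF && isalpha(c)) {
        if (n + 1 < cap)
            out[n++] = (char)c;
        c = getc(in);
    }

    /* Give back the one byte of lookahead that ended the word. */
    if (c != EOF)
        ungetc(c, in);

    out[n] = '\0';
    return n;
}
```

The caller simply loops on pull_next_word() and never sees the individual reads, which is convenient for files but is exactly what does not fit an event-driven program.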


An application I'm writing needs the opposite: the scanner routine
is called with an input buffer as its argument (of variable size; it may
even contain only a single byte). It has to process this buffer and, on
completing a token, call a processing routine with that token as its
argument. It must not block or wait for input in any way.
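
The interface described above can be sketched by hand as a small state machine. This is only an illustration under stated assumptions, not the output of any scanner generator: the names ps_init, ps_feed, ps_end, log_token and the TOK_MAX limit are hypothetical, and the token rule (runs of letters are words, runs of digits are numbers, everything else separates) is invented for the example. All state lives in a struct, so the scanner is reentrant and can stop after any byte.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { TOK_MAX = 63 };  /* hypothetical limit; longer tokens are split */

enum ps_state { PS_IDLE, PS_WORD, PS_NUMBER };

typedef void (*token_fn)(const char *text, size_t len, void *user);

/* Everything the scanner must remember between calls. */
struct push_scanner {
    enum ps_state state;      /* kind of token currently being collected */
    char   text[TOK_MAX + 1]; /* bytes of the unfinished token */
    size_t len;
    token_fn emit;            /* called once per completed token */
    void  *user;
};

void ps_init(struct push_scanner *s, token_fn emit, void *user)
{
    assert(s != NULL && emit != NULL);
    memset(s, 0, sizeof *s);
    s->state = PS_IDLE;
    s->emit = emit;
    s->user = user;
}

/* Hand the pending token (if any) to the callback and reset. */
static void ps_flush(struct push_scanner *s)
{
    if (s->state != PS_IDLE) {
        s->text[s->len] = '\0';
        s->emit(s->text, s->len, s->user);
    }
    s->state = PS_IDLE;
    s->len = 0;
}

/* Consume one chunk of any size (even one byte). Never blocks: it only
   looks at the n bytes it was given and then returns to the caller. */
void ps_feed(struct push_scanner *s, const char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)buf[i];
        enum ps_state cls = isalpha(c) ? PS_WORD
                          : isdigit(c) ? PS_NUMBER : PS_IDLE;
        if (cls != s->state || s->len == TOK_MAX)
            ps_flush(s);               /* token boundary reached */
        if (cls != PS_IDLE) {
            s->state = cls;
            s->text[s->len++] = (char)c;
        }
    }
}

/* Call at end of input so a trailing token is not lost. */
void ps_end(struct push_scanner *s) { ps_flush(s); }

/* Example callback (hypothetical): appends "token|" to a log buffer. */
struct token_log { char buf[128]; size_t used; };

void log_token(const char *text, size_t len, void *user)
{
    struct token_log *log = user;
    if (log->used + len + 2 <= sizeof log->buf) {
        memcpy(log->buf + log->used, text, len);
        log->used += len;
        log->buf[log->used++] = '|';
        log->buf[log->used] = '\0';
    }
}
```

Because a token may straddle any number of ps_feed() calls, the partial token is copied into the struct rather than pointed to in the caller's buffer, which the caller is free to reuse as soon as ps_feed() returns.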


Is there any scanner generator which is able to do this? I've
experimented with re2c, defining its YYFILL macro so that it completely
resets the state and returns to the caller with a special value meaning
"I need more data"; the caller can then fill the buffer and restart the
scanner. The problems are that (a) resetting is not as efficient as I'd
like, and (b) it doesn't work reliably; perhaps my buffer management is
subtly wrong, or I don't completely understand the real meaning of
YYCURSOR and YYMARKER. (E.g. is it right that YYMARKER <= YYCURSOR?
This is not documented anywhere.)
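
To make the "reset and restart" scheme concrete, here is a hand-written stand-in in C. It is not re2c code and does not use YYFILL, YYCURSOR or YYMARKER; the names rs_append, rs_next, RS_NEED_MORE and the RS_CAP limit are hypothetical, and a token is assumed to be a run of letters. The scanner keeps the unconsumed input in its own buffer and, when a word might continue past the end of that buffer, gives up and reports that it needs more data. The next call rescans the partial word from its first byte, which shows where the inefficiency in (a) comes from.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { RS_CAP = 256 };  /* hypothetical capacity of the pending buffer */

enum rs_result { RS_TOKEN, RS_NEED_MORE };

struct restart_scanner {
    char   buf[RS_CAP]; /* input received but not yet turned into tokens */
    size_t len;
};

/* Append a chunk to the pending input; returns how many bytes fit.
   A word longer than RS_CAP can never complete (limitation of the sketch). */
size_t rs_append(struct restart_scanner *s, const char *chunk, size_t n)
{
    size_t room = RS_CAP - s->len;
    size_t take = n < room ? n : room;
    memcpy(s->buf + s->len, chunk, take);
    s->len += take;
    return take;
}

/* Try to scan one word from the start of the pending input.
   RS_TOKEN: the word is copied to `out` (NUL-terminated, truncated to
   cap-1 bytes) and consumed.
   RS_NEED_MORE: the word, if any, touches the end of the buffer and might
   continue in the next chunk. It is kept, and the next call scans it
   again from its first byte. A final word therefore needs a trailing
   separator before it is reported. */
enum rs_result rs_next(struct restart_scanner *s, char *out, size_t cap)
{
    size_t start = 0, end, n;

    assert(out != NULL && cap > 0);

    while (start < s->len && !isalpha((unsigned char)s->buf[start]))
        start++;                         /* skip separators */
    end = start;
    while (end < s->len && isalpha((unsigned char)s->buf[end]))
        end++;                           /* scan the word */

    if (end == s->len) {
        /* Drop only the separators; keep the partial word for a rescan. */
        memmove(s->buf, s->buf + start, s->len - start);
        s->len -= start;
        return RS_NEED_MORE;
    }

    n = end - start;
    if (n >= cap)
        n = cap - 1;
    memcpy(out, s->buf + start, n);
    out[n] = '\0';
    memmove(s->buf, s->buf + end, s->len - end);
    s->len -= end;
    return RS_TOKEN;
}
```

With one-byte chunks, a word of k letters is scanned k times before it is reported, so the total work grows quadratically with token length. A scanner that saves its automaton state between calls avoids the rescan.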


Other than that, I think re2c is already the right tool: it is extremely
lightweight and reentrant (necessary!), and its target language is C
(not C++).


Olaf
--
Olaf.Titz@inka.de <URL:http://sites.inka.de/~bigred/>
[This question has come up several times recently for people who want to
run a lexer in a GUI application. I think re2c is about as good as it
gets. -John]

