Input-driven lexical scanner firstname.lastname@example.org (2000-12-18)
Re: Input-driven lexical scanner email@example.com (Chris F Clark) (2000-12-19)
Re: Input-driven lexical scanner firstname.lastname@example.org (2000-12-20)
Re: Input-driven lexical scanner email@example.com (2000-12-20)
From: firstname.lastname@example.org (Olaf Titz)
Date: 18 Dec 2000 00:46:32 -0500
Organization: Private site, southern Germany
Keywords: question, lex, comment
Posted-Date: 18 Dec 2000 00:46:32 EST
The control/data flow of all lexical scanner generators I know goes
like this: when the scanner needs more input, it calls a routine which
provides the input and doesn't return until enough data is available.
Usually this ends up in fgets() or read() somehow.

An application I'm writing needs the opposite flow: the scanner routine
gets called with an input buffer as its argument (of variable size; it
may even contain only a single byte). It has to process this buffer and,
on completing a token, call a processing routine with that token as its
argument. It cannot block or wait for input in any way.
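
To make the requirement concrete, here is a minimal hand-written sketch
of such a push-style scanner. It is not re2c output, and every name in it
(push_scanner, push_scanner_feed, the token kinds) is hypothetical. It
assumes a toy token set of letter runs, digit runs, and single
punctuation characters, and it keeps the partially matched token inside
the scanner struct so that a token may span any number of calls.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* All names here are hypothetical; this is not generated by re2c. */
enum tok_kind { TOK_WORD, TOK_NUMBER, TOK_PUNCT };

typedef void (*token_cb)(enum tok_kind kind, const char *text,
                         size_t len, void *user);

enum scan_state { ST_IDLE, ST_WORD, ST_NUMBER };

struct push_scanner {
    enum scan_state state;  /* which kind of token is currently open */
    char     text[256];     /* partial token carried across calls */
    size_t   len;
    token_cb emit;          /* processing routine called per token */
    void    *user;
};

void push_scanner_init(struct push_scanner *s, token_cb emit, void *user)
{
    s->state = ST_IDLE;
    s->len = 0;
    s->emit = emit;
    s->user = user;
}

/* Deliver the open token, if any, and return to the idle state. */
static void flush_token(struct push_scanner *s)
{
    if (s->state != ST_IDLE)
        s->emit(s->state == ST_WORD ? TOK_WORD : TOK_NUMBER,
                s->text, s->len, s->user);
    s->state = ST_IDLE;
    s->len = 0;
}

/* Consume one chunk of any size; never blocks and never asks for more. */
void push_scanner_feed(struct push_scanner *s, const char *buf, size_t n)
{
    assert(s != NULL && s->emit != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)buf[i];
        enum scan_state cls = isalpha(c) ? ST_WORD
                            : isdigit(c) ? ST_NUMBER : ST_IDLE;

        if (cls != s->state)            /* class changed: token boundary */
            flush_token(s);
        if (cls != ST_IDLE) {
            s->state = cls;
            if (s->len < sizeof s->text)  /* overlong tokens are truncated */
                s->text[s->len++] = (char)c;
        } else if (!isspace(c)) {
            char p = (char)c;             /* punctuation is a 1-byte token */
            s->emit(TOK_PUNCT, &p, 1, s->user);
        }
    }
}

/* Call once at end of input to deliver a token that was still open. */
void push_scanner_finish(struct push_scanner *s)
{
    flush_token(s);
}
```

The caller invokes push_scanner_feed() whenever data arrives, with
whatever chunk size it happens to have, and push_scanner_finish() at end
of input. The token text passed to the callback is not NUL-terminated,
so the callback must use the length argument.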

Is there any scanner generator that can do this? I've experimented with
re2c, making its YYFILL macro reset the scanner state completely and
return to the caller with a special value meaning "I need more data";
the caller can then fill the buffer and restart the scanner. The
problems are that (a) resetting is not as efficient as I'd like, and
(b) it doesn't work reliably; perhaps my buffer management is subtly
wrong, or I don't completely understand the real meaning of YYCURSOR
and YYMARKER. (E.g., is it right that YYMARKER <= YYCURSOR? This is not
documented anywhere.)
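
The refill step is one place where this kind of buffer management can go
wrong: when the unfinished token is moved to the front of the buffer,
every pointer into the buffer has to move by the same distance. The
following is only a sketch under stated assumptions, not a diagnosis of
the problem above. The struct scan_buf and its functions are
hypothetical; the fields are named after the re2c macros they would be
bound to, plus a token-start pointer that the application is assumed to
maintain itself. A marker that points before the token start is treated
as stale and reset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scanner buffer; nothing here is generated by re2c. */
struct scan_buf {
    char  data[4096];
    char *tok;      /* start of the token being matched (assumed field) */
    char *marker;   /* backup position, bound to YYMARKER */
    char *cursor;   /* next character to examine, bound to YYCURSOR */
    char *limit;    /* one past the last valid byte, bound to YYLIMIT */
};

void scan_buf_init(struct scan_buf *b)
{
    b->tok = b->marker = b->cursor = b->limit = b->data;
}

/* Append a chunk after sliding the unfinished token to the front.
   Returns the number of bytes accepted, which may be less than n
   if the buffer is full. */
size_t scan_buf_append(struct scan_buf *b, const char *chunk, size_t n)
{
    assert(b->data <= b->tok && b->tok <= b->cursor && b->cursor <= b->limit);
    if (b->marker < b->tok)     /* stale marker left from an earlier token */
        b->marker = b->tok;

    size_t shift = (size_t)(b->tok - b->data);   /* bytes already consumed */
    size_t live  = (size_t)(b->limit - b->tok);  /* bytes still needed */

    memmove(b->data, b->tok, live);
    b->tok    -= shift;         /* all four pointers move together */
    b->marker -= shift;
    b->cursor -= shift;
    b->limit  -= shift;

    size_t room = sizeof b->data - live;
    if (n > room)
        n = room;
    memcpy(b->limit, chunk, n);
    b->limit += n;
    return n;
}
```

Because the pointers are relocated together, their relative order is the
same after the call as before it, whatever that order was.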

Other than that, I think re2c is already the right tool: it is extremely
lightweight, reentrant (necessary!), and targets C (not C++).
[This question has come up several times recently for people who want to
run a lexer in a GUI application. I think re2c is about as good as it