Re: How to restart FLEX scanners?

rkrayhawk@aol.com (RKRayhawk)
16 Feb 1999 23:22:50 -0500

          From comp.compilers

Related articles
How to restart FLEX scanners? jenglish@flightlab.com (1999-02-03)
Re: How to restart FLEX scanners? gneuner@dyn.com (1999-02-12)
Re: How to restart FLEX scanners? rkrayhawk@aol.com (1999-02-16)
Re: How to restart FLEX scanners? jenglish@flightlab.com (1999-02-18)

From: rkrayhawk@aol.com (RKRayhawk)
Newsgroups: comp.compilers
Date: 16 Feb 1999 23:22:50 -0500
Organization: AOL http://www.aol.com
References: 99-02-013
Keywords: lex

A few additional minor comments about a procedure posted by
jenglish@flightlab.com (Joe English)
On 3 Feb 1999 23:55:26 -0500


which as planned might under some conditions be invoked recursively ...


(snippets)
<<


extern PARSE_TREE parse_tree; /* set by yyparse() %start rule */
int parse_file(FILE *fp)
{
    YY_BUFFER_STATE old = YY_CURRENT_BUFFER;
    YY_BUFFER_STATE new = yy_create_buffer(fp, ...);
    yy_switch_to_buffer(new);
    while (yyparse() != EOF_SIGNAL)
        evaluate(parse_tree);
    yy_delete_buffer(new);
    if (old != NULL)
        yy_switch_to_buffer(old);
}


>>


a) Not to nitpick, but you may wish to avoid a data name like
'new' in code that might be processed by a modern C/C++ compiler, as
it is a reserved word whenever the source is compiled as C++.


b) A possibly helpful conceptual reference for the recommendation that
you not recursively invoke yylex() or yyparse() is that the files in a
compilation unit are not each a distinct parse. That is, a parse
corresponds to a compilation unit, which may include more than one
file.


c) The code you show seems to imply that you wish to manage the files
from a high level and pass that managed context to the lexer and
parser. Under some design strategies this intrudes on the lexer's
territory, since the files are (frequently but not always) the domain
of the lexer. Often a parser cannot even see the linefeeds, much less
the EOFs, of include files.


d) In languages that allow file includes or file copy statements, and
languages that allow macro expansions to come from files other than the
current source file, file processing can become very deep and very
complex. Files further up the include chain may need to be closed (and
at some point re-opened) to free resources for the current file.
Generally there is no advantage to making the management of these
matters visible even to the lexer, much less the parser. It is
better to push and pop the file context from beneath the lexer,
IMHO. This will permit you to switch entirely back to a previous file
context (upon any given EOF) prior to hitting any additional
executable code.


If you do not shut down the previous file context early enough, then
stale information about file context could be used in three places:
at the bottom of the lexer; in the invoker of the lexer, just below
the lexer invocation; and at the top of the routine you displayed in
your post (where you do not show any code before the call to
yy_switch_to_buffer(new), but where you may someday wish to have some
preceding code).
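
To make the push/pop idea concrete, here is a minimal sketch in plain
C. It is not flex code and not anyone's actual implementation: the
names src_push(), src_getc(), src_depth() and SRC_MAX_DEPTH are
hypothetical, and in-memory strings stand in for open files so the
example stays self-contained. The point is only that the pop happens
inside the reader, so whatever sits above it (the lexer, and above
that the parser) sees one continuous stream of characters.

```c
#include <assert.h>
#include <stddef.h>

#define SRC_MAX_DEPTH 16   /* hypothetical nesting limit */

/* One input context. A real scanner would keep a FILE * or a
   YY_BUFFER_STATE here; a string stands in for the file contents. */
struct source {
    const char *text;  /* contents of this "file" */
    size_t pos;        /* index of the next character to read */
};

static struct source src_stack[SRC_MAX_DEPTH];
static int src_count = 0;  /* number of active contexts */

/* Enter a nested input context (e.g. on seeing an include directive). */
void src_push(const char *text)
{
    assert(src_count < SRC_MAX_DEPTH);
    src_stack[src_count].text = text;
    src_stack[src_count].pos = 0;
    src_count++;
}

/* Return the next character, or -1 once every context is exhausted.
   An exhausted context is popped here, inside the reader, so the
   caller simply receives the next character of the previous context. */
int src_getc(void)
{
    while (src_count > 0) {
        struct source *s = &src_stack[src_count - 1];
        if (s->text[s->pos] != '\0')
            return (unsigned char)s->text[s->pos++];
        src_count--;  /* EOF of this context: resume the one beneath it */
    }
    return -1;
}

/* Current nesting depth, for diagnostics. */
int src_depth(void)
{
    return src_count;
}
```

For example, after src_push("ab"), one src_getc(), and then
src_push("XY"), successive calls to src_getc() yield 'X', 'Y', 'b' and
finally -1; the caller never sees the boundary between the two inputs.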


e) Further, some compilers might be designed to allow multiple
compilation units. _That_ sequence of first-level files should be
managed at approximately the level of code you are currently puzzling
over. So if you later attempt to allow multiple compilation units, the
code and data structures necessary for that would collide with your
current code.
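
As a rough sketch of what belongs at that top level, assuming one
complete parse per first-level file: the unit_parser type,
compile_units() and stub_parse() below are hypothetical names for
illustration. In a yacc-based compiler the callback would open the
file, call yyparse() once, and clean up; the stub merely stands in for
it so the fragment compiles on its own.

```c
#include <stddef.h>

/* Hypothetical callback: parse one whole compilation unit, including
   every file it pulls in. Returns 0 on success, nonzero on failure. */
typedef int (*unit_parser)(const char *path);

/* Run one complete parse per first-level file and return the number
   of units that failed. Include handling stays below this level. */
int compile_units(const char *const paths[], size_t count,
                  unit_parser parse_unit)
{
    int failures = 0;
    for (size_t i = 0; i < count; i++) {
        if (parse_unit(paths[i]) != 0)
            failures++;
    }
    return failures;
}

/* Stand-in parser for demonstration only: it treats an empty path
   as a failed unit and everything else as a success. */
int stub_parse(const char *path)
{
    return path[0] == '\0';
}
```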


f) If your parser needs to be aware of the EOFs at include file
boundaries, you may wish to distinguish them from the real final EOF,
and just pass them as a distinct token. This might be necessary only
if those intermediate EOFs have some kind of semantic force. Another
way to look at the overall development effort is to understand that,
when you have a choice, you would prefer not to publish a syntax that
grants significance to the intermediate EOFs. But that is not a
priori. It would not be outrageous to insist that C language open and
close curly brackets not be in two distinct files, for example,
although the standard has not evolved that way. So sometimes a syntax
needs to know where intermediate EOFs are to impose a particular
decorum, but even then it is perhaps best to think of detection of
intermediate EOFs as a subfunction within the domain of the lexer.
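
A minimal sketch of the distinct-token idea, again in plain C rather
than flex: the token names TOK_CHAR, TOK_INCLUDE_EOF and TOK_EOF and
the functions lex_push() and lex_next() are assumptions made up for
this example, and strings stand in for files. The lexer alone decides
which kind of EOF it has reached; the parser only ever sees a token.

```c
#include <assert.h>

enum token {
    TOK_CHAR,         /* stands in for every ordinary token */
    TOK_INCLUDE_EOF,  /* end of a nested (included) input */
    TOK_EOF           /* end of the outermost input: the real EOF */
};

#define LEX_MAX_DEPTH 16   /* hypothetical nesting limit */

static const char *lex_inputs[LEX_MAX_DEPTH];  /* read positions */
static int lex_depth = 0;                      /* active inputs */

/* Start reading a nested input. */
void lex_push(const char *text)
{
    assert(lex_depth < LEX_MAX_DEPTH);
    lex_inputs[lex_depth++] = text;
}

/* Return the next token. For TOK_CHAR the character is stored in *ch.
   When an input runs out it is popped, and the token reports whether
   that was an included input or the outermost one. */
enum token lex_next(int *ch)
{
    if (lex_depth == 0)
        return TOK_EOF;
    if (*lex_inputs[lex_depth - 1] != '\0') {
        *ch = (unsigned char)*lex_inputs[lex_depth - 1]++;
        return TOK_CHAR;
    }
    lex_depth--;
    return lex_depth > 0 ? TOK_INCLUDE_EOF : TOK_EOF;
}
```

A grammar that wanted to forbid a construct from spanning an include
boundary could then reject TOK_INCLUDE_EOF inside that construct, while
a grammar with no such rule could have the lexer swallow the token
instead of returning it.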


Best Wishes,


Robert Rayhawk
RKRayhawk@aol.com

