Re: Statement at a time parsing with yacc (Bart Massey)
Tue, 10 Dec 1991 21:21:38 GMT

          From comp.compilers


Newsgroups: comp.compilers
From: (Bart Massey)
Keywords: yacc, lex, parse
Organization: CIS Dept., University of Oregon
References: 91-12-036
Date: Tue, 10 Dec 1991 21:21:38 GMT

In article 91-12-036 (Przemek Skoskiewicz) writes:
> Is it possible to trick yacc to go into this "statement by statement" mode
> of operation? This has to be done dynamically, i.e. sometimes I want to
> parse a whole file at once and sometimes I want it to do it statement by
> statement with the parser returning control to me (yyparse) after each
> statement is parsed.

Well, if you can get control returned after each statement, it's pretty
trivial to loop parsing statements -- I'm not sure exactly what you mean
by "dynamically" here. You can probably get my example below to do what
you want, if it doesn't already, by clever use of if() clauses in the YACC
actions and a bit of LEX hacking...

> [In some conversation with him, it came out that the hard part is that the
> lexer can't tell where the statement boundaries are, so it can't easily
> return the EOF token at the end of a statement. I already suggested YYACCEPT
> and looking at yychar to see if there is a lookahead token. Any better
> ideas? -John]

Here's my hack -- I'm not sure it's "better", though :-). It *is* a bit
more portable, as it uses only documented YACC features. It *does* depend
on the undocumented LEX feature that the BEGIN macro may be invoked from
anywhere in the file, but one could always write a hand lexer if one had
to -- and besides, I've never seen a LEX-alike that didn't allow it.

The basic idea is to put the lexer in a state where it will return a 0
(EOF) anytime the parser tells it that the end of a statement has been
reached. The statements in my sample are context-free to illustrate that
the parser is doing what it needs to. I used BYACC for this, as I don't
trust YACC to do the right thing with default reductions, lookahead, and
yyclearin (I've had this sort of trouble before).
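
The same idea can be sketched with a hand-written lexer in plain C. This
is an illustration only, not the code from the archive below: the names
struct lexer, next_token and parse_one and the token codes are all
hypothetical, and a small recursive-descent parser stands in for the
YACC-generated one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; 0 plays the role of YACC's end-of-input. */
enum { TOK_EOF = 0, TOK_A = 257, TOK_B = 258 };

struct lexer {
    const char *input;  /* NUL-terminated source text */
    size_t pos;         /* index of the next unread character */
    int stop;           /* set by the parser when a statement ends */
};

/* Called by the parser at the end of a statement (the role of BEGIN STOP). */
static void end_statement(struct lexer *lx)
{
    assert(!lx->stop);  /* a statement can only end once */
    lx->stop = 1;
}

/* Returns the next token. After end_statement() it returns TOK_EOF once,
   leaving the first character of the next statement unread. */
static int next_token(struct lexer *lx)
{
    for (;;) {
        char c = lx->input[lx->pos];
        if (c == '\0') { lx->stop = 0; return TOK_EOF; }  /* real end of input */
        if (c == '\n') { lx->pos++; continue; }           /* newlines are skipped */
        if (lx->stop) { lx->stop = 0; return TOK_EOF; }   /* fake EOF, c not consumed */
        lx->pos++;
        if (c == 'a') return TOK_A;
        if (c == 'b') return TOK_B;
        /* any other character is ignored in this sketch */
    }
}

/* statement: A statement B | A B, entered after the leading A was consumed. */
static int statement_tail(struct lexer *lx)
{
    int t = next_token(lx);
    if (t == TOK_A) {                 /* nested statement */
        if (!statement_tail(lx)) return 0;
        t = next_token(lx);
    }
    return t == TOK_B;
}

/* Parses one statement: 1 = statement parsed, 0 = end of input, -1 = error. */
static int parse_one(struct lexer *lx)
{
    int t = next_token(lx);
    if (t == TOK_EOF) return 0;
    if (t != TOK_A || !statement_tail(lx)) return -1;
    end_statement(lx);                /* tell the lexer to fake an EOF */
    return next_token(lx) == TOK_EOF ? 1 : -1;
}
```

A caller would loop with something like while (parse_one(&lx) == 1),
getting control back after every statement, much as the main() in the
archive loops on yyparse().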

The key is to make sure that you invoke yyclearin at the end of the
statement and to also make sure that the lexer will then deliver the
correct token even if it has already delivered it once (as a lookahead
token). Note that the lexer reads a character and unputs it only because
LEX won't let one do the right thing -- this hack still works if tokens
are arbitrarily long, but you need to replace the . with a .* and unput
*all* the characters...
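
A token-level sketch of that requirement, in plain C, might look like the
following. It is an assumption-laden illustration rather than the LEX code
in the archive: struct token_source, next_token and the
lookahead_was_read parameter are hypothetical names, and the token stream
is just an array. The point is that if the parser had already read a
lookahead token when the statement ended, the lexer reports a fake end of
input first and then hands out that same token again.

```c
#include <assert.h>

struct token_source {
    const int *tokens;  /* token codes, terminated by 0 (end of input) */
    int pos;            /* index of the next undelivered token */
    int last;           /* most recently delivered token */
    int redeliver;      /* nonzero: hand out `last` again */
    int stop;           /* nonzero: report a fake end of input once */
};

/* Returns the next token, honouring a pending fake EOF and a pending
   redelivery, in that order. */
static int next_token(struct token_source *ts)
{
    if (ts->stop) { ts->stop = 0; return 0; }
    if (ts->redeliver) { ts->redeliver = 0; return ts->last; }
    ts->last = ts->tokens[ts->pos];
    if (ts->last != 0) ts->pos++;     /* never step past the terminator */
    return ts->last;
}

/* Parser side: the statement is complete. If a lookahead token was already
   read (and is about to be discarded, as yyclearin does), ask the lexer to
   deliver it again after the fake EOF. */
static void end_statement(struct token_source *ts, int lookahead_was_read)
{
    assert(!ts->redeliver);           /* only one token of pushback */
    ts->stop = 1;
    ts->redeliver = lookahead_was_read;
}
```

With LEX the same effect is obtained at the character level by unput(),
which is why longer tokens need all of their characters put back.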

Bart Massey

---cut here---
#! /bin/sh
# This is a "shar" archive. To extract, clip off any leading or
# trailing garbage and execute using the Bourne Shell.
echo "creating Makefile"
sed 's/^X//' >Makefile <<'X'
all: statg

y.tab.c y.tab.h: statg.y
	byacc -d statg.y

lex.yy.c: statg.l
	lex statg.l

lex.yy.o: y.tab.h

statg: lex.yy.o y.tab.o
	cc -o statg lex.yy.o y.tab.o
X
echo "creating statg.l"
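sed 's/^X//' >statg.l <<'X'
%{
/* token codes A and B, as written by "byacc -d" */
#include "y.tab.h"
%}

%start STOP

%%

<STOP>. { unput(yytext[0]); BEGIN 0; return 0; }
a return A;
b return B;
\n /* do nothing */;

%%

int yywrap() {
    return 1;
}

/* called from the parser: make the next yylex() call return 0 (EOF) */
void end_statement() {
    BEGIN STOP;
}
X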
echo "creating statg.test"
sed 's/^X//' >statg.test <<'X'
X
echo "creating statg.y"
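sed 's/^X//' >statg.y <<'X'
%{
#include <stdio.h>
#include <stdlib.h>
extern void end_statement();
int yylex();
int yyerror();
int done = 0;
%}

%token A B

%%

start: statement { end_statement(); yyclearin; } | { done = 1; } ;
statement: A statement B | A B ;

%%

int yyerror( s )
char *s;
{
    printf( "\n%s\n", s );
    exit( 1 );
}

/* call yyparse() once per statement until the empty rule sets done */
int main() {
    while( !done )
        printf( "%d\n", yyparse() );
    return 0;
}
X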
exit 0
