Related articles |
---|
Best way to parse simple statement grammar ed_davis2@yahoo.com (ed_davis2) (2005-06-04) |
Re: Best way to parse simple statement grammar DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2005-06-08) |
From: | "ed_davis2" <ed_davis2@yahoo.com> |
Newsgroups: | comp.compilers |
Date: | 4 Jun 2005 15:13:39 -0400 |
Organization: | http://groups.google.com |
Keywords: | parse, question |
Posted-Date: | 04 Jun 2005 15:13:39 EDT |
I'm trying to figure out which is the best way to parse the following
simple grammar, using a hand-written parser:
    stmtseq = {stmt}
    stmt    = "if" expr "then" stmtseq ["else" stmtseq] "endif"
            | "while" expr "do" stmtseq "endwhile"
            | "repeat" stmtseq "until" expr
Perusing available literature and sources of compilers, I have seen
basically three styles (for simplicity, eof processing is ignored):
1) stmtseq loops until any token in stmt's follow-set is found:
    sub stmtseq()
        while not (token in [t_endwhile, t_until, t_else, t_endif]) do
            if is_token(t_while)        // consumes token when true
                expr()
                expect(t_do)
                stmtseq()
                expect(t_endwhile)
            elseif is_token(t_repeat)
                stmtseq()
                expect(t_until)
                expr()
            elseif is_token(t_if)
                expr()
                expect(t_then)
                stmtseq()
                if is_token(t_else)
                    stmtseq()
                endif
                expect(t_endif)
            else                        // no matching stmt found
                error()
            endif
        endwhile
    endsub
2) stmtseq loops until a token that isn't in stmt's first-set is
found:
    sub stmtseq()
        do forever
            if is_token(t_while)        // consumes token when true
                expr()
                expect(t_do)
                stmtseq()
                expect(t_endwhile)
            elseif is_token(t_repeat)
                stmtseq()
                expect(t_until)
                expr()
            elseif is_token(t_if)
                expr()
                expect(t_then)
                stmtseq()
                if is_token(t_else)
                    stmtseq()
                endif
                expect(t_endif)
            else                        // no matching stmt found, so terminate loop
                break
            endif
        enddo
    endsub
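To make the difference between styles 1 and 2 concrete, here is a minimal runnable sketch in Python. It is only an illustration under several assumptions that are not in the pseudocode above: tokens are whitespace-separated words, an expression is a single non-keyword word, an explicit "eof" token is appended and added to the follow-set, and the names Parser, ParseError and parse are made up for the example. The two styles accept the same input; they differ in where a stray token gets reported.

```python
FIRST = {"while", "repeat", "if"}
FOLLOW = {"endwhile", "until", "else", "endif", "eof"}  # "eof" is an added assumption
KEYWORDS = FIRST | FOLLOW | {"then", "do"}


class ParseError(Exception):
    pass


class Parser:
    def __init__(self, text, style):
        self.toks = text.split() + ["eof"]
        self.pos = 0
        self.style = style  # 1 = stop on follow-set, 2 = stop outside first-set

    @property
    def token(self):
        return self.toks[self.pos]

    def is_token(self, t):
        # Consumes the token when it matches.
        if self.token == t:
            self.pos += 1
            return True
        return False

    def expect(self, t):
        if not self.is_token(t):
            raise ParseError(f"expected {t!r}, found {self.token!r}")

    def expr(self):
        # Assumption: an expression is one non-keyword word.
        if self.token in KEYWORDS:
            raise ParseError(f"expected expression, found {self.token!r}")
        self.pos += 1

    def stmt(self):
        # Parses one statement; returns False if the token starts none.
        if self.is_token("while"):
            self.expr()
            self.expect("do")
            self.stmtseq()
            self.expect("endwhile")
        elif self.is_token("repeat"):
            self.stmtseq()
            self.expect("until")
            self.expr()
        elif self.is_token("if"):
            self.expr()
            self.expect("then")
            self.stmtseq()
            if self.is_token("else"):
                self.stmtseq()
            self.expect("endif")
        else:
            return False
        return True

    def stmtseq(self):
        if self.style == 1:
            # Style 1: loop until a follow-set token; anything else is an error here.
            while self.token not in FOLLOW:
                if not self.stmt():
                    raise ParseError(f"statement expected, found {self.token!r}")
        else:
            # Style 2: loop while a statement starts; the caller's expect() complains.
            while self.stmt():
                pass


def parse(text, style):
    p = Parser(text, style)
    try:
        p.stmtseq()
        p.expect("eof")
        return "ok"
    except ParseError as e:
        return str(e)


if __name__ == "__main__":
    for style in (1, 2):
        print(style, parse("while x do oops endwhile", style))
```

With the stray word "oops" in a loop body, style 1 reports the problem inside stmtseq ("statement expected"), while style 2 silently leaves stmtseq and the enclosing statement reports a missing "endwhile".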
3) Each statement tells stmtseq which specific token(s) end its body:
    sub stmtseq(end_tokens)
        while not (token in end_tokens) do
            if is_token(t_while)        // consumes token when true
                expr()
                expect(t_do)
                stmtseq([t_endwhile])
                accept(t_endwhile)      // token is known to be present
            elseif is_token(t_repeat)
                stmtseq([t_until])
                accept(t_until)
                expr()
            elseif is_token(t_if)
                expr()
                expect(t_then)
                stmtseq([t_else, t_endif])
                if is_token(t_else)
                    stmtseq([t_endif])
                endif
                accept(t_endif)
            else                        // no matching stmt found
                error()
            endif
        endwhile
    endsub
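Style 3 can be sketched in runnable form as well. Again this is a Python illustration under assumptions rather than a definitive implementation: whitespace-separated tokens, a single non-keyword word standing in for expr, an appended "eof" token used as the top-level end token, and made-up names (Parser, ParseError, parse). Here accept() only consumes a token that stmtseq has already guaranteed to be current.

```python
KEYWORDS = {"while", "do", "endwhile", "repeat", "until",
            "if", "then", "else", "endif", "eof"}


class ParseError(Exception):
    pass


class Parser:
    def __init__(self, text):
        self.toks = text.split() + ["eof"]  # "eof" is an added assumption
        self.pos = 0

    @property
    def token(self):
        return self.toks[self.pos]

    def is_token(self, t):
        # Consumes the token when it matches.
        if self.token == t:
            self.pos += 1
            return True
        return False

    def expect(self, t):
        if not self.is_token(t):
            raise ParseError(f"expected {t!r}, found {self.token!r}")

    def accept(self, t):
        # stmtseq returns only when the token is in end_tokens, so this cannot fail.
        assert self.token == t
        self.pos += 1

    def expr(self):
        # Assumption: an expression is one non-keyword word.
        if self.token in KEYWORDS:
            raise ParseError(f"expected expression, found {self.token!r}")
        self.pos += 1

    def stmtseq(self, end_tokens):
        while self.token not in end_tokens:
            if self.is_token("while"):
                self.expr()
                self.expect("do")
                self.stmtseq({"endwhile"})
                self.accept("endwhile")
            elif self.is_token("repeat"):
                self.stmtseq({"until"})
                self.accept("until")
                self.expr()
            elif self.is_token("if"):
                self.expr()
                self.expect("then")
                self.stmtseq({"else", "endif"})
                if self.is_token("else"):
                    self.stmtseq({"endif"})
                self.accept("endif")
            else:
                # Any token that is neither a statement start nor the
                # terminator this particular caller asked for.
                raise ParseError(f"statement expected, found {self.token!r}")


def parse(text):
    p = Parser(text)
    try:
        p.stmtseq({"eof"})
        return "ok"
    except ParseError as e:
        return str(e)


if __name__ == "__main__":
    print(parse("if a then while b do endwhile else repeat until c endif"))
    print(parse("while x do endif endwhile"))
```

In this sketch a mismatched terminator such as "endif" inside a while body is rejected by stmtseq itself, because "endif" is not among the end tokens that the while statement passed in.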
Which one of the above is best, and/or is there a better way, using a
hand-written parser?