Related articles |
---|
Grammar analysis vbdis@aol.com (2004-11-20) |
Re: Grammar analysis lhp+news@toft-hp.dk (Lasse =?ISO-8859-1?Q?Hiller=F8e?= Petersen) (2004-11-26) |
Re: Grammar analysis demakov@ispras.ru (Alexey Demakov) (2004-11-28) |
Re: Grammar analysis vbdis@aol.com (2004-11-29) |
Re: Grammar analysis lhp+news@toft-hp.dk (Lasse =?ISO-8859-1?Q?Hiller=F8e?= Petersen) (2004-12-01) |
Re: Grammar analysis vbdis@aol.com (2004-12-01) |
From: | "Lasse =?ISO-8859-1?Q?Hiller=F8e?= Petersen" <lhp+news@toft-hp.dk> |
Newsgroups: | comp.compilers |
Date: | 26 Nov 2004 22:46:20 -0500 |
Organization: | No Organization |
References: | 04-11-090 |
Keywords: | parse |
Posted-Date: | 26 Nov 2004 22:46:20 EST |
vbdis@aol.com (VBDis) wrote:
> I'm just about to implement an grammar analyzer, and wonder about some
> of the algorithms described in the according literature.
>
> The computation of the First and Follow sets is not trivial,
I hope you meant trivial, I managed to write some code that seems to
work, so it must be! ;-)
> nonetheless I don't understand the "repeat until nothing changes"
> loops. Are these abbreviations of otherwise higher order algorithms,
> or did I miss newer and more straight algorithms?
>
> DoDi
Perhaps I may be allowed to answer by posting a short code example from
my LL(1) parser in the works, and add a question of my own. Please note
that I have used suggestive macros, this is code that works, but I omit
the definitions and declaration in hope that the algorithm itself is
clear enough. Here's the code:
void discover_first() {
int s;
Prod *p;
/* compute FIRST sets for all symbols */
/* for all terminal s: FIRST(s) = { s } */
for(EACH_SYMBOL(s)) {
FIRST(s) = bits_empty();
if(TERMINAL(s)) bits_set1(&FIRST(s),s);
}
changed:
for(EACH_PRODUCTION(s,p)) {
/* The s: p rule can be one of two variants:
s: p1 p2 .. pj where all pi are nullable
or
s: p1 p2 .. pj r1 r2 ... where all pi except pj are nullable
FIRST(s) is union of FIRST(pi) for all i
*/
int j;
bits_t newfirst;
newfirst = FIRST(s);
for(j = 0 ; j < p->count && NULLABLE(p->seq[j]) ; j++) {
newfirst = bits_union(newfirst,FIRST(p->seq[j]));
}
if( j < p->count )
newfirst = bits_union(newfirst,FIRST(p->seq[j]));
if( ! bits_equal(newfirst, FIRST(s) ) ) {
FIRST(s) = newfirst;
goto changed;
}
}
/* now decorate each production with first sets
to be used for selection decision */
/* code for this deleted for sake of brevity */
}
As can be seen, each time newfirst has something added to it, the loop
restarts with a simple goto. So each time a production's FIRST changes,
the iteration over productions is restarted.
Another approach would be to just note the change in place of the goto:
{something_changed = 1;}
and wrap for(EACH_PRODUCTION(s,p)) with:
something_changed = 1;
while(something_changed) { something_changed = 0; ... }
this would finish the iteration over all productions before restarting
with the first production. In both cases it repeats until nothing
changes.
I don't think it matters much; but still I'd like to ask more
knowledgeable people: is there _any_ _significant_ difference in
efficiency between the two approaches?
I think I've seen references to "efficient" computation of FIRST and
FOLLOW; but given the simplicity of the approach above, when would such
efficient algorithms be worthwhile?
-Lasse
Return to the
comp.compilers page.
Search the
comp.compilers archives again.