Related articles |
---|
FIRST function computation problem. mefrill@yandex.ru (2003-06-22) |
Re: FIRST function computation problem. derkgwen@HotPOP.com (Derk Gwen) (2003-07-02) |
Re: FIRST function computation problem. torbenm@diku.dk (2003-07-02) |
Re: FIRST function computation problem. haberg@matematik.su.se (2003-07-02) |
Re: FIRST function computation problem. soenke.kannapinn@wincor-nixdorf.com (=?Windows-1252?Q?S=F6nke_Kannapinn?=) (2003-07-02) |
From: | torbenm@diku.dk (Torben Ęgidius Mogensen) |
Newsgroups: | comp.compilers |
Date: | 2 Jul 2003 00:38:49 -0400 |
Organization: | Department of Computer Science, University of Copenhagen |
References: | 03-06-103 |
Keywords: | parse |
Posted-Date: | 02 Jul 2003 00:38:49 EDT |
mefrill@yandex.ru (Vladimir) writes:
> Hi,
>
> I have a problem with FIRST function for 1 lookahead symbol evalation.
> To lookup such the symbols for some nonterminal A I am doning follows:
>
> 1. go through all rules that left part is A. Here I do following:
> a) If I have rule A --> aB, where a is terminal. I just add this
> terminal to lookahead set for A.
> b) If I have rule A --> e, where e is epsilon, I add e to lookahead
> set for A.
> c) If I have rule A --> B C ... where B - nonterminal, I at first
> compute the lookahead set for B, if B's lookahead set contains
> epsilon, check C and etc. I found that recursive function here is the
> best. But I do not know how to work with rules like follows: A --> B,
> B --> C A D... To compute lookahead set for A I have to at first
> compute lookaheads for B, but if lookaheads for C contain epsilon I
> need to know the same about A to understand do checking of D or
> not...
>
> Please, help to decide this problem and there may be the good FIRTS
> computaion algorithm?
In "The Dragon Book" and many other compiler books, the FIRST sets
include epsilon symbol. This has, however, a different meaning than
the other symbols in the FIRST sets: A terminal symbol in the FIRST
set of a nonterminal (or, more generally, a sequence of grammar
symbols) means that a string derived from the nonterminal (symbol
sequence) can start with that symbol. Epsilon, on the other hand.
Means that the nonterminal (symbol sequence) can derive the empty
string. Additionally, there is a type mismatch since epsilon isn't a
terminal symbol.
I much prefer the more modern approach of separating the FIRST
function into two functions: A "Nullable" function, that for each
nonterminal or sequence of grammar symbols determines if this can
derive the empty string, and a "First" function that maps a
nonterminal (or symbol sequence) to the set of terminal symbols that
can start strings derived from the nonterminal (or sequence).
You first find Nullable, which is solvable as a set of Horn-clause
constraints and then you find First, which is a set of set equations,
each of which may depend on the results of the Nullable calculation.
This approach is taken in, e.g., Andrew Appels "Modern Compiler
Implementation in <L>" (where <L> can be "ML", "Java" or "C").
You can find details of implementation in these books too.
Torben Mogensen
Return to the
comp.compilers page.
Search the
comp.compilers archives again.