Re: Are there any good examples of FIRST () and FOLLOW ()?

jos@and.nl (Jos Horsmeier)
Mon, 22 Jun 1992 10:08:03 GMT

          From comp.compilers

Related articles
Are there any good examples of FIRST () and FOLLOW ()? cosc143u@rosie.uh.edu (1992-06-20)
Re: Are there any good examples of FIRST () and FOLLOW ()? jos@and.nl (1992-06-22)
Re: Are there any good examples of FIRST () and FOLLOW ()? andrewd@cs.adelaide.edu.au (1992-06-25)
Re: Are there any good examples of FIRST () and FOLLOW ()? andrewd@cs.adelaide.edu.au (1992-06-29)
| List of all articles for this month |
Newsgroups: comp.compilers
From: jos@and.nl (Jos Horsmeier)
Keywords: LL(1), question
Organization: AND Software BV Rotterdam
References: 92-06-100
Date: Mon, 22 Jun 1992 10:08:03 GMT



cosc143u@rosie.uh.edu (91F9774) writes:
|We are going over predictive parsers right now (we won't actually be using
|them for our project but have to know how to build them for the test). My
|book has two examples total on these subjects and I just can't seem to
|understand the FOLLOW () function. Does anyone out there have a good
|explanation of the FOLLOW () function and how it is used in building the
|predictive parse table?


|Example:
| S->AB | epsilon
| A->aA
| | a
| B->bB'
| B'->BS | eps.


Allow me to quote from:


Compiler Construction
An advanced course
Edited by F.L. Bauer and J. Eickel
ISBN 0-387-08046-0


Chapter 2B by M. Griffiths p. 66:


An immediate follower matrix should be produced during the pass over the
grammar that produced the immediate starter matrix. This matrix needs
three fields, as is shown by the following example:


A -> B C D | E f


Immediate deductions are:


- C follows B
- if C -> epsilon then D follows B
- D follows C
- f follows E


There is, however, a further problem. Consider a rule containing A:


X -> Y A Z


Z follows A. But if we replace A by B C D, we obtain X -> Y B C D Z and Z
also follows D. Thus, the fact thaD is the last symbol of A needs to be
kept, since all followers of A are followers of D (and if D can generate
epsilon, the same is true for C, and so on.) The followers matrix has the
following form:


        A B ... Z A B ... Z a b ... z
      +----------+----------+----------+
  A | | | |
  B | | | |
  . | | | |
  . | | | |
  Z | | | |
      +----------+----------+----------+


In the first field (X, Y) = 1 iff X follows Y
In the second field (X, Y) = 1 iff the followes of X are followers of Y,
that is to say that Y is the last memmber of a production of X.
In the last field (x, Y) = 1 means x follows Y.


For the first field, X follows Y means that all starters of X are
followers of Y; these starters can be found in the first function matrix.
Thus the corresponding values of the first function is added into the
third field of the followers matrix. Now perform a transitive closure on
the second and third field to obtain the complete followers matrix.


(end quote)


A similar `trick' can be performed to find the FIRST function of all
non-terminal symbols. The matrix looks like this:


        A B ... Z a b ... z
      +----------+----------+
  A | | |
  B | | |
  . | | |
  . | | |
  Z | | |
      +----------+----------+


Given a rule A -> B c D | e f


the cells [B, A] and [e, A] are set to 1. if B -> epsilon, then [c, A] is
also set to 1. Applying the closure over the second field gives us the
FIRST values for all the non-terminal symbols.


Note that any 1 value on the diagonal of the first field indicates
(indirect) left recursion over that particular non-terminal symbol.


I've implemented this once, using the Warshall closure algorithm. The
matrixes are represented by a simple bit matrix. Therefore I was able to
perform a closure step, using 32 rules (long ints) at the same time. I
think it is quite efficient. (ahem ;-)


Your example grammar results in the following FIRST and FOLLOW functions:


FIRST(S) = FIRST(A) = { a }
FIRST(B) = FIRST(B') = { b }


FOLLOW(S) = FOLLOW(B) = FOLLOW(B') = { a $ }
FOLLOW(A) = b


This causes a conflict in cell [A, a] because the rules for non-terminal
symbol A are not left-factored properly. There's also a conflict for the
rule S -> epsilon, when the current input is an a, but it is completely
benign when the rule S -> A B is used in cell [S, a]. The parse table has
the following values:


a b $
      +-----------+------------+-----------+
  S | S -> A B | | S -> eps |
  A | A -> a A | | |
  B | | B -> b B' | |
  B'| B' -> eps | B' -> B S | B' -> eps |
      +-----------+------------+-----------+


I hope this helps a bit ...


Jos aka jos@and.nl


AND Software bv, Westersingel 108, 3015 LD Rotterdam, The Netherlands
Phone: +31 (10) 436 7100, Fax: +31 (10) 436 7110
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.