Mon, 22 Jun 1992 10:08:03 GMT

Related articles |
---|

Are there any good examples of FIRST () and FOLLOW ()? cosc143u@rosie.uh.edu (1992-06-20) |

Re: Are there any good examples of FIRST () and FOLLOW ()? jos@and.nl (1992-06-22) |

Re: Are there any good examples of FIRST () and FOLLOW ()? andrewd@cs.adelaide.edu.au (1992-06-25) |

Re: Are there any good examples of FIRST () and FOLLOW ()? andrewd@cs.adelaide.edu.au (1992-06-29) |

Newsgroups: | comp.compilers |

From: | jos@and.nl (Jos Horsmeier) |

Keywords: | LL(1), question |

Organization: | AND Software BV Rotterdam |

References: | 92-06-100 |

Date: | Mon, 22 Jun 1992 10:08:03 GMT |

cosc143u@rosie.uh.edu (91F9774) writes:

|We are going over predictive parsers right now (we won't actually be using

|them for our project but have to know how to build them for the test). My

|book has two examples total on these subjects and I just can't seem to

|understand the FOLLOW () function. Does anyone out there have a good

|explanation of the FOLLOW () function and how it is used in building the

|predictive parse table?

|Example:

| S->AB | epsilon

| A->aA

| | a

| B->bB'

| B'->BS | eps.

Allow me to quote from:

Compiler Construction

An advanced course

Edited by F.L. Bauer and J. Eickel

ISBN 0-387-08046-0

Chapter 2B by M. Griffiths p. 66:

An immediate follower matrix should be produced during the pass over the

grammar that produced the immediate starter matrix. This matrix needs

three fields, as is shown by the following example:

A -> B C D | E f

Immediate deductions are:

- C follows B

- if C -> epsilon then D follows B

- D follows C

- f follows E

There is, however, a further problem. Consider a rule containing A:

X -> Y A Z

Z follows A. But if we replace A by B C D, we obtain X -> Y B C D Z and Z

also follows D. Thus, the fact thaD is the last symbol of A needs to be

kept, since all followers of A are followers of D (and if D can generate

epsilon, the same is true for C, and so on.) The followers matrix has the

following form:

A B ... Z A B ... Z a b ... z

+----------+----------+----------+

A | | | |

B | | | |

. | | | |

. | | | |

Z | | | |

+----------+----------+----------+

In the first field (X, Y) = 1 iff X follows Y

In the second field (X, Y) = 1 iff the followes of X are followers of Y,

that is to say that Y is the last memmber of a production of X.

In the last field (x, Y) = 1 means x follows Y.

For the first field, X follows Y means that all starters of X are

followers of Y; these starters can be found in the first function matrix.

Thus the corresponding values of the first function is added into the

third field of the followers matrix. Now perform a transitive closure on

the second and third field to obtain the complete followers matrix.

(end quote)

A similar `trick' can be performed to find the FIRST function of all

non-terminal symbols. The matrix looks like this:

A B ... Z a b ... z

+----------+----------+

A | | |

B | | |

. | | |

. | | |

Z | | |

+----------+----------+

Given a rule A -> B c D | e f

the cells [B, A] and [e, A] are set to 1. if B -> epsilon, then [c, A] is

also set to 1. Applying the closure over the second field gives us the

FIRST values for all the non-terminal symbols.

Note that any 1 value on the diagonal of the first field indicates

(indirect) left recursion over that particular non-terminal symbol.

I've implemented this once, using the Warshall closure algorithm. The

matrixes are represented by a simple bit matrix. Therefore I was able to

perform a closure step, using 32 rules (long ints) at the same time. I

think it is quite efficient. (ahem ;-)

Your example grammar results in the following FIRST and FOLLOW functions:

FIRST(S) = FIRST(A) = { a }

FIRST(B) = FIRST(B') = { b }

FOLLOW(S) = FOLLOW(B) = FOLLOW(B') = { a $ }

FOLLOW(A) = b

This causes a conflict in cell [A, a] because the rules for non-terminal

symbol A are not left-factored properly. There's also a conflict for the

rule S -> epsilon, when the current input is an a, but it is completely

benign when the rule S -> A B is used in cell [S, a]. The parse table has

the following values:

a b $

+-----------+------------+-----------+

S | S -> A B | | S -> eps |

A | A -> a A | | |

B | | B -> b B' | |

B'| B' -> eps | B' -> B S | B' -> eps |

+-----------+------------+-----------+

I hope this helps a bit ...

Jos aka jos@and.nl

AND Software bv, Westersingel 108, 3015 LD Rotterdam, The Netherlands

Phone: +31 (10) 436 7100, Fax: +31 (10) 436 7110

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.