12 Mar 2006 13:55:57 -0500

Related articles |
---|

computing the first set (concretely) aegis@mad.scientist.com (aegis) (2006-03-11) |

Re: computing the first set (concretely) DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2006-03-12) |

Re: computing the first set (concretely) wyrmwif@tsoft.org (SM Ryan) (2006-03-12) |

From: | SM Ryan <wyrmwif@tsoft.org> |

Newsgroups: | comp.compilers |

Date: | 12 Mar 2006 13:55:57 -0500 |

Organization: | Quick STOP Groceries |

References: | 06-03-031 |

Keywords: | parse |

"aegis" <aegis@mad.scientist.com> wrote:

# I know how to compute the first set however I am conflicted on how I

# should represent the syntactical specification of my language in order

# to compute the first set programmatically.

#

# I was thinking along the lines of a two dimensional matrix by where

# the rows are indexed by a particular nonterminal to find all terminals

# for the given production.

#

# It would look something like:

#

# int table[X][Y] = { { ... } ,

# { expr, '+', digit },

# { 0, 1, 2, 3, 4, 5 },

# { ... }

# };

#

# and where my specification would look like:

#

# ...

# expr: expr '+' digit ;

# digit: 0 | 1 | 2 | 3 | 4 | 5 ;

E: S | E + S

S: SIGN P

SIGN: + | - |

P: a | ( E )

(1) Create a directed graph p => q if production p can produce production q.

E => S, E => P, E => E, E => SIGN

S => S, S => P, S => E, S => SIGN

P => S, P => P, P => E, P => SIGN

SIGN

(2) Find strongly connected components and reduce the above graph to a DAG

{E,S,P} => {SIGN}

{SIGN}

(3) You can now easily compute NULLABLE(X) = X can produce an empty string

Start from the bottom of the reduced DAG and examine all alternatives

of all the productions in that SCC. Ignore alternatives that include

a production in the same SCC; if any other alternative is empty or

consists solely of nullable production, all productions in the current

SCC are nullable.

NULLABLE({SIGN}) = NULLABLE(+) | NULLABLE(-) | NULLABLE()

= true

NULLABLE({E,S,P}) = NULLABLE(a) -- the only nonrecursive alternative

= false

(4) Compute WEIGHT(X) = length of minimum nonempty terminal expansion of X,

or 0 if X is always empty.

WEIGHT(SIGN) = min(WEIGHT(+),WEIGHT(-)) = 1

Again use the SCCs of the => graph to identify recursive alternatives.

WEIGHT(P) = min(WEIGHT(a),WEIGHT((E)))

The (E) alternative is recursive because E and P

are in the same SCC of the => graph. And because

not NULLABLE(P), we know the recursive alternative

cannot be any lighter than the nonrecursive alternative.

= min(WEIGHT(a)) = 1

WEIGHT(S) = min(WEIGHT(SIGN P))

When a member of an alternative is NULLABLE, it might

have an empty expansion while the alternative as a

whole is not empty. Bifurcate the alternative for

nullable member, including and excluding the nullable

members: S: SIGN P | P

= min(WEIGHT(SIGN)+WEIGHT(P),WEIGHT(P))

= min(1+1,1) = 1

WEIGHT(E) = min(WEIGHT(S),WEIGHT(E+S))

= min(WEIGHT(S)) = 1

(5) To compute FIRSTk(s), the first k terminals of string s,

Construct the sets F1 = {}, G1 = {PREFIXk(s)}

The PREFIXk(s) is the minimal prefix of s such that

WEIGHT(PREFIXk(s))>=k, or PREFIXk(s) = s

E: S | E + S

PREFIX2(S) = S

PREFIX2(E+S) = E+

S: SIGN P

Because NULLABLE(SIGN), again bifurcate

for nullable members

PREFIX2(SIGN P) = SIGN P | P

SIGN: + | - |

PREFIX2(+) = +

PREFIX2(-) = -

P: a | ( E )

PREFIX2(a) = a

PREFIX2((E)) = (E

Now having Fi and Gi, construct F[i+1] and G[i+1] by

If Gi is empty, Fi = FIRSTk(s) and stop.

Otherwise choose any string s in Gi.

If s is a terminal string,

F[i+1] = F[i] + {s}

G[i+1] = G[i] - {s}

If s=uXv, u is empty or terminals only,

X is the production X: x1|x2|....|xn

Let zr = PREFIXk(u xr v)

F[i+1] = F[i]

G[i+1] = G[i] + {zr: zr is not in any Gj, j<=i}

FIRST2(E)

1 F1 = {}; G1 = {PREFIX2(E)} = {E}

2 Take E from G1.

E: S | E+S

PREFIX2(S) = S

PREFIX2(E+S) = E+

G2 = {S,E+}

3 Take E+ from G2

E: S | E+S

PREFIX2(S+S) = S+

PREFIX2(E+S) = E+, already in G2

G3 = {S,S+}

4 Take S from G3

S: SIGN P

PREFIX2(SIGN P) = SIGN P | P

G4 = {SIGN P,P,S+}

5 take SIGN P from G4

SIGN: + | - |

PREFIX2(+P) = +P

PREFIX2(-P) = -P

G5 = {+P,-P,P,S+}

6 take S+ from G5

S: SIGN P

PREFIX2(SIGN P+) = SIGN P, already in G4

PREFIX2(P+) = P+

G6 = {+P,-P,P,P+}

7 take +P from G6

P: a | (E)

PREFIX2(+a) = +a

PREFIX2(+(E)) = +(

G7 = {+a,+(,-P,P,P+}

8 F8 = {+a}

G8 = {+(,-P,P,P+}

9 F9 = {+a,+(}

G9 = {-P,P,P+}

10 take -P from G9

P: a | (E)

PREFIX2(-a) = -a

PREFIX2(-(E)) = -(

G10 = {-a,-(,P,P+}

11 F11 = {+a,+(,-a}

G11 = {-(,P,P+}

12 F12 = {+a,+(,-a,-(}

G12 = {P,P+}

13 take P from G12

P: a | (E)

PREFIX2(a) = a

PREFIX2((E)) = (E

G13 = {a,(E,P+}

14 F14 = {+a,+(,-a,-(,a}

G14 = {(E,P+}

15 take P+ from G14

P: a | (E)

PREFIX2(a+) = a

PREFIX2((E)+) = (E, already in G14

G15 = {(E,a+}

16 take (E from G15

E : S | E+S

PREFIX2((S) = (S

PREFIX2((E+S) = (E, already in G14

G16 = {(S,a+}

17 take (S from G16

S : SIGN P

PREFIX2((SIGN P) = (SIGN | (P

G17 = {(SIGN,(P,a+}

18 take (SIGN P from G17

SIGN: +|-|

PREFIX2((+P) = (+

PREFIX2((-P) = (-

G18 = {(+,(-,(P,a+}

19 F19 = {+a,+(,-a,-(,a,(+}

G19 = {(-,(P,a+}

20 F20 = {+a,+(,-a,-(,a,(+,(-}

G20 = {(P,a+}

21 F21 = {+a,+(,-a,-(,a,(+,(-,a+}

G21 = {(P}

22 take (P from G21

P: a | (E)

PREFIX2((a) = (a

PREFIX2(((E)) = ((

G22 = {(a,((}

23 F23 = {+a,+(,-a,-(,a,(+,(-,a+,(a}

G23 = {((}

24 F24 = {+a,+(,-a,-(,a,(+,(-,a+,(a,((}

G24 = {}

25 stop

FIRST2(E) = +a | +( | -a | -( | a | (+ | (- | (a | (( | a+

--

SM Ryan http://www.rawbw.com/~wyrmwif/

Title does not dictate behaviour.

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.