Re: Closure and Goto

manu@sasi.com
12 Aug 1999 02:57:23 -0400

          From comp.compilers

Related articles
Closure and Goto thomas.strinnhed@swipnet.se (Thomas S. Strinnhed) (1999-08-04)
Re: Closure and Goto manu@sasi.com (1999-08-12)
| List of all articles for this month |

From: manu@sasi.com
Newsgroups: comp.compilers
Date: 12 Aug 1999 02:57:23 -0400
Organization: Compilers Central
Keywords: parse
Refereces: 99-08-029

> thomas.strinnhed@swipnet.se


>Theese functions are Closure and Goto for determining
>sets-of-items
>I've got some difficulties understanding them, and would much
>appreciate some explanation along with some examples




Closure of a set of items I in a grammar G is constructed by the
following two rules:


1. Every item in I is added to closure(I)


2. If there's an item
          A -> a.Bb in closure(I) and
          B -> g is a production in the grammar
              (where a, b, and g are strings of grammar symbols)
      then
            the item B -> .g is added to closure(I)


      If A -> a.Bb is in closure(I), it indicates that we've seen "a"
      and we expect to see some substring derivable by "Bb"


      Since B is a non-terminal, it also means we expect to see some
      substring derivable by "gb"
      That's why we add B -> .g to closure(I)




For example,


Consider the following augmented grammar:
(same as in the Dragon Book)


1. E' -> E
2. E -> E+T | T
3. T -> T*F | F
4. F -> (E) | id


[The augmented grammar is because the input is acceptable only
when the parser is about to reduce by E' -> E.]


If I is a set of one item containing { [E' -> .E] } then
closure(I) contains:


      {
          [E' -> .E], // by rule 1


          [E -> .E+T], // by rule 2 on the previous item (use production 2 of the grammar)
          [E -> .T],


          [T -> .T*F], // by rule 2 on the previous item (use production 3)
          [T -> .F],


          [F -> .(E)], // by rule 2 on the previous item (use production 4)
          [F -> .id]
      }






Goto(I, X)
where I is a set of items and X is a grammar symbol is:
the closure of all items
[A -> aX.b] such that
[A -> a.Xb] is in I


(where "a" & "b" are strings of grammar symbols)




For example,
if I is the set of two items
          {
              [E' -> E.],
              [E -> E .+ T]
          }
then goto(I, +) is, by definition ,


    closure of { [E -> E + .T] }


that is we examine all items in I which have a "+" immediately to
the right of a "." (in our example, the second item),
move the dot over the "+" to get E -> E + .T
and find its closure


closure of { [E -> E + .T] } is
        {
            [E -> E + .T], // by rule 1 to find closure


            [T -> .T*F], // by rule 2 on the previous item (production 3)
            [T -> .F],


            [F -> .(E)], // by rule 2 on the previous item (production 4)
            [F -> .id]
        }








The complete set of LR(0) items for the grammar is given below:
(reproduced from the Dragon book & paraphrased)


I0 = closure({ [E' -> .E] }) // since E' is the start symbol of
                                                          // the augmented grammar
      = {
              E' -> .E
              E -> .E + T
              E -> .T
              T -> .T * F
              T -> .F
              F -> .(E)
              F -> .id
            }


I1 = goto(I0, E)
      = {
              E' -> E. // from I0; note, however, that closure of these
              E -> E. + T // two items don't yield anything
          }


I2 = goto(I0, T)
      = {
              E -> T. // from I0; their closure yields nothing
              T -> T. * F
          }




I3 = goto(I0, F)
      = {
              T -> F. // closure yields nothing
          }


I4 = goto(I0, ')')
      = {
              F -> (.E) // from I0


              E -> .E + T // closure of the previous item
              E -> .T
              T -> .T * F
              T -> .F
              F -> .(E)
              F -> .id
          }


I5 = goto(I0, id)
      = {
              F -> id.
          }


// we ran out of grammar symbols in I0
// note that we cannot have goto(I0, *) because there's no item
// in I0 which has "*" immediately to the right of a "."




I6 = goto(I1, +)
      = {
              E -> E +. T // from I1


              T -> .T * F // closure of the previous item
              T -> .F
              F -> .(E)
              F -> .id
          }


// we cannot have goto(I1, E) and goto(I1, T)


I7 = goto(I2, *)
      = {
              T -> T *. F // from I2


              F -> .(E) // closure of the previous item
              F -> .id
          }


// we cannot have goto(I2, T) and goto(I2, F)
// neither can we have goto(I3, F)




I8 = goto(I4, E)
      = {
              F -> (E.)
              E -> E. + T // their closure yields nothing
          }


// note that goto(I4, T) = I2 and goto(I4, F) = I3
// we cannot have any more gotos in I4 and I5


I9 = goto(I6, T)
      = {
              E -> E + T.
              T -> T. * F // their closure yields nothing
          }


// no more gotos in I6


I10 = goto(I7, F)
        = {
                T -> T * F.
            }


// no more gotos in I7


I11 = goto(I8, ')')
        = {
                F -> (E).
            }


// no more gotos in I9, I10 and I11


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.