Sun, 28 Oct 90 13:46:06 PST

Related articles |
---|

[10 earlier articles] |

Re: Can Pascal be parsed by LR(1) parsing algorithm? firth@sei.cmu.edu (1990-10-17) |

Re: Can Pascal be parsed by LR(1) parsing algorithm? firth@sei.cmu.edu (1990-10-18) |

Re: Can Pascal be parsed by LR(1) parsing algorithm? djones@megatest.uucp (1990-10-21) |

Re: Can Pascal be parsed by LR(1) parsing algorithm? crocker@Alliant.COM (1990-10-23) |

Re: Can Pascal be parsed by LR(1) parsing algorithm? piet@cs.ruu.nl (1990-10-26) |

Re: Can Pascal be parsed by LR(1) parsing algorithm? andy@Theory.Stanford.EDU (1990-10-26) |

Re: Can Pascal be parsed by LR(1) parsing algorithm? jas@Ingres.COM (1990-10-28) |

Re: Can Pascal be parsed by LR(1) parsing algorithm? firth@sei.cmu.edu (1990-11-05) |

Newsgroups: | comp.compilers |

From: | jas@Ingres.COM (Jim Shankland) |

Keywords: | Pascal, LL(1) |

Organization: | The Eddie Group |

References: | <9112@fy.sei.cmu.edu) <9010232339.AA20860@Alliant.COM> <1990Oct26.220803.1411@Neon.Stanford.EDU> |

Date: | Sun, 28 Oct 90 13:46:06 PST |

In article <1990Oct26.220803.1411@Neon.Stanford.EDU> andy@Theory.Stanford.EDU (Andy Freeman) writes:

*>In article <9010232339.AA20860@Alliant.COM> crocker@Alliant.COM (Ben Crocker) writes:*

*>>Having written a Pascal compiler with an LL(1) parser generator, I can*

*>>vouch for the proposition that Pascal is LL(1).*

*>*

*>Such compilers are built on tokenizers with 2 character look-ahead. Remember*

*>that "1..5" has the same tokens as "1 .. 5", but requires 2 character*

*>look-ahead to distinguish from streams containing "1.<digit>".*

*>*

*>Look-ahead 2 tokenising feeding a Lx(1) parser does not demonstrate that*

*>Pascal is Lx(1); it demonstrates that a tokenized version of a language may*

*>have different look-ahead requirements than the stream-of-characters version.*

Such postings should best be made from a machine with a name other than

"theory.stanford.edu" :->.

For any language with an LR(n) grammar, n > 0, there is an LR(n -1)

grammar accepting the same language. In other words, a language is

either LR or not LR; it makes no sense to say that a language is LR(k),

for any particular k. There are practical considerations, in that

the LR(n) grammar for a language may be considerably smaller and more

comprehensible than the LR(n - 1) grammar. But that's all. The original

proof of this is due to Knuth; it can no doubt be found in many sources,

certainly in Hopcroft and Ullman. (I'm away from my references, so I

can't be more precise; sorry.)

As for LL languages, in <1990Oct26.213044.28243@Neon.Stanford.EDU>,

Sriram Sankar (sankar@neon.stanford.edu) asserts:

*>LL(n+1) languages are a strict superset of LL(n) languages;*

As I said, I'm away from my references, and I'm not really a theoretician,

but I believe this, too, to be incorrect. I think that if there is

an LL(n + 1) grammar for a language, then there will also be some

LL(n) grammar for the language (n >= 0). Consider the following proof

sketch showing that an LL(2) grammar can be rewritten into an LL(1)

grammar (the generalization should be straightforward):

If a grammar G is LL(2), but not LL(1), then there must be productions

of the form

A --> a alpha

B --> a beta

such that both productions could occur from the same parser state. (A and B

might be the same non-terminal.) Intuitively, the parser cannot "know" which

production is the correct one with only the "a" as lookahead.

Now rewrite the grammar as follows: remove those two productions and

replace them with

A' --> alpha

B' --> beta

where A' and B' are new non-terminals not used elsewhere; also,

for any production P having an A or a B in its right-hand side,

add a new production P' that is just like P, except that every

occurrence of A on the rhs has been replaced by "a A'"; similarly

with B, substituting "a B'".

Intuitively, this defers the decision of which parse is correct by pushing

the a and encoding the fact that an a was read in the parser state.

Finally, as to whether Pascal is or is not LL(anything): I will ship a case

of Anchor Steam beer to anyone who can show that Pascal is even context-free

by writing a context-free grammar that will accept only correct Pascal

programs and reject all other strings, including strings that would be

correct Pascal programs except that an undeclared variable is used in the

program. No fair cheating by using a Turing machine (e.g., C code) to

maintain a symbol table.

jas

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.