Re: some basic questions about parsing (Daniel J. Salomon)
16 Sep 1996 14:34:33 -0400

          From comp.compilers

Related articles
some basic questions about parsing (Michael Herstine) (1996-09-15)
Re: some basic questions about parsing (1996-09-16)
Re: some basic questions about parsing (1996-09-16)
| List of all articles for this month |

From: (Daniel J. Salomon)
Newsgroups: comp.compilers
Date: 16 Sep 1996 14:34:33 -0400
Organization: Computer Science, University of Manitoba, Winnipeg, Canada
References: 96-09-053
Keywords: parse

Michael Herstine <> wrote:
|> I'm reading Jim Holme's "Object Oriented Compiler Construction".

This is not a good book for learning parsing theory. It basically passes
on the task to lex and yacc. The book seems to have been rushed into print
to capitalize on the craze for anything object-oriented.

For anyone else curious about this book, it discusses how to write a
Pascal compiler in an object-oriented style; it does not discuss how to
implement object-oriented languages. It is not aimed at compiler
professionals. It is intended to be a simple text for a college
course, but I found that many of the descriptions would be unclear to
novices. His discussion of how to write a hand generated scanner is
especially bad. After fooling around with some finite-state-machine
theory he just gives up saying it is too complicated to write hand coded

|> I have several questions:
|> 1. If we seek the most powerful deterministic automaton possible,
|> wouldn't that be a Turing machine? Since they can recognize unrestricted
|> phrase structure grammars, a much larger set of grammars than context
|> free, why don't we use them as parsers?

There are currently no efficient and reliable ways to turn phrase-
structure grammars into efficient parsers. And even if you could build
one, you could not, in the general case, prove that it would ever
terminate parsing. In the book "Parsing Techniques: A Practical Guide"
by Dick Grune and Ceriel Jacobs, the authors also argue that phrase
structure grammars are not particularly easy for humans to understand
either. Although you can build phrase structure grammars that
recognize extraordinarily complex languages, you would have a very hard
time explaining to anyone exactly how they work. It seems to be easier
to design a language that can be easily recognized by computers, and in
the end, grammars for such a language are also more easily understood
by humans. The book by Grune and Jacobs, by the way, provides a
excellent overview of parsing theory for both novices and experts.

|> 2. What sort of grammars generate LR(1) languages?

There are complex tests that you can apply to see if a grammar is LR(1),
but in general, you just try to apply an LR(1) parser-generation
algorithm and if it works then the grammar is LR(1). This may sound
unsatisfying, but remember that it is not even decidable in the general
case whether a context-free grammar is deterministic.

|> 3. The production rules that he writes down for Pascal are clearly
|> context free, but seem to include rules that allow non-terminals to go to
|> the null string - doesn't that put them in the class of unrestricted
|> phrase structure grammars (they're not allowed in context free grammars,
|> are they)?

They are allowed in CFGs. You may be thinking of context-sensitive
grammars which, in some definitions, do not allow epsilon productions
at all.

Daniel J. Salomon -- salomon@cs.UManitoba.CA
              Dept. of Computer Science / University of Manitoba
              Winnipeg, Manitoba, Canada R3T 2N2 / (204) 474-8687

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.