# Re: Anyone know this book?

## William D Clinger <will@ccs.neu.edu>13 Mar 1998 00:05:08 -0500

From comp.compilers

Related articles
Anyone know this book? fpeelo@portablesolutions.com (Frank Peelo) (1998-03-03)
Re: Anyone know this book? dwight@pentasoft.com (1998-03-06)
Re: Anyone know this book? will@ccs.neu.edu (William D Clinger) (1998-03-13)
| List of all articles for this month |

 From: William D Clinger Newsgroups: comp.compilers Date: 13 Mar 1998 00:05:08 -0500 Organization: Northeastern University References: 98-03-019 Keywords: books, parse

Frank Peelo wrote:
> I bought myself a copy of "Compiler Construction" by N. Wirth...

I have a first printing of that book, copyright 1996, but I couldn't
find the diagram you drew, so maybe you have a newer edition. Your
diagram resembles part of Figure 4.1, though.

> For example, in one section he discusses generating a "table" to
> represent a grammar. A construct like
> p = t { '+' | '-' t }.
> gets represented as
> +-----------------+
> v |
> p --> [ ] --> [ '+' ] --|
> ----/ | |
> / [ '-' ] --+
> t |
> [ EMP ]

In Wirth's EBNF, I think the production for p that yields this
syntax diagram would be written

p = t { ("+" | "-") t }

The { ... } notation means zero or more repetitions of the stuff
inside the curly braces; in other words the production above is
equivalent to

p = t q
q = <empty> | sign t q
sign = "+" | "-"

where <empty> stands for the empty sequence of tokens.

> but what I don't understand is: why the pseudo-terminal symbol "EMP"
> (indicating a lack of an alternative terminal symbol) is needed.

EMP serves the same purpose as <empty> in the productions above:
A p consists of a t followed by a q, where a q consists either of
nothing at all, or a sign followed by a t followed by a q. If we
didn't allow for the possibility that a q could be nothing at all,
we would have the grammar

p = t q
q = sign t q
sign = "+" | "-"

which doesn't generate any terminal strings at all; no matter how
hard you try, you can't get rid of the nonterminal q.

It is always possible to get rid of these <empty> productions
(except possibly one for the start symbol) by rewriting the
grammar, but this rewriting usually makes the grammar more
complicated and harder to deal with.

William D Clinger
--

Post a followup to this message