Re: Java Comment-Preserving Grammar

Chris F Clark <>
11 Jun 2004 02:56:42 -0400

          From comp.compilers


From: Chris F Clark <>
Newsgroups: comp.compilers
Date: 11 Jun 2004 02:56:42 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 04-05-075 04-06-004 04-06-022
Keywords: Java, parse
Posted-Date: 11 Jun 2004 02:56:42 EDT

Jens Troeger wrote:

> As far as I know, a grammar has nothing to do with comments. The
> scanner usually skips over comments in the source code already,
> such that the parser doesn't even see them. However, you can write
> your own scanner and parser that make comments explicit in the grammar.
> But if you want to allow comments _anywhere_ in your language, then
> good luck!! :-) That grammar is going to be a mess....

Glen Herrmannsfeldt replied:
> Would it be that hard? Most languages that I know
> of allow comments in places where blank space is allowed.
> I suppose in most cases blanks are removed by the lexer, and so not
> included in the grammar, though I don't think it would be all that
> hard to add them in in the places they are allowed. It might not
> look so nice, though.

Not look so nice = hard.

Most grammars ignore all spacing and comment issues because they make
the grammar not look so nice (i.e. they obscure the real intent of the
grammar) and because the simplest forms of LL and LR parsing have
trouble parsing grammars that don't have all that "extraneous cruft"
removed.

In particular, each place a whitespace or comment token is allowed, it
is almost always optional. Unless your parser generator has been
extended to support regular expressions (and standard Yacc/Bison have
not been; prior to about 1985 there were no commonly available LL+RE
tools, and prior to about 1990 no LR+RE tools), you can't express such
optional clauses "inline". And if you can't do them inline, then you
need a "nullable" rule (one that can match the empty string) to express
them. And nullable rules tend to introduce additional conflicts into
the grammar.

Thus, prior to 1985, it was not only inconvenient to introduce
whitespace into your grammar, but it was likely to cause your parser
generator to fall over.

Around 1990 the situation improved dramatically. Not only were tools
introduced that integrated regular expressions into both the LL and LR
parsing methodology, which meant that it was possible to introduce the
whitespace rules into your grammar without breaking the generator that
processed it, but the same tools also started introducing special
notations to handle the whitespace problem explicitly. Over the years
since 1990, I have seen several advancements in that notation.

Of course, if one still uses garden variety Yacc/Bison, one is still
stuck in the pre-1990 era, but other tools like PCCTS, JavaCC, Yacc++,
and Meta-S all have some solution for the problem.
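
The notations of those tools are not reproduced here. As a rough
illustration of the general idea (declare the whitespace and comment
tokens once, keep them out of the grammar rules, but do not throw them
away), here is a hand-rolled Python sketch. The token names, the
`TRIVIA` set, the `Token` class, and the `tokenize`/`unparse` helpers
are hypothetical and cover only a toy Java-like fragment:

```python
import re
from dataclasses import dataclass, field

# Hypothetical token set; the names and patterns are illustrative only.
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*|/\*.*?\*/"),
    ("WS",      r"\s+"),
    ("ID",      r"[A-Za-z_]\w*"),
    ("NUM",     r"\d+"),
    ("OP",      r"[=+;]"),
]
TRIVIA = {"COMMENT", "WS"}   # declared once, instead of in every rule
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC), re.DOTALL
)


@dataclass
class Token:
    kind: str
    text: str
    leading: list = field(default_factory=list)  # trivia seen before this token


def tokenize(source):
    """Return significant tokens, each carrying its preceding trivia."""
    tokens, pending, pos = [], [], 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        pos = m.end()
        if m.lastgroup in TRIVIA:
            pending.append(m.group())   # hold until the next real token
        else:
            tokens.append(Token(m.lastgroup, m.group(), pending))
            pending = []
    tokens.append(Token("EOF", "", pending))  # keeps trailing trivia
    return tokens


def unparse(tokens):
    """Rebuild the source text, comments and spacing included."""
    return "".join("".join(t.leading) + t.text for t in tokens)
```

A parser built on `tokenize` sees only `ID`, `NUM`, `OP`, and `EOF`, so
its grammar needs no whitespace rules, yet `unparse` can still rebuild
the input because each comment rides along on the token that follows it.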

Hope this helps,

Chris Clark Internet :
Compiler Resources, Inc. Web Site :
23 Bailey Rd voice : (508) 435-5016
Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)
