From: | Chris F Clark <cfc@world.std.com> |
Newsgroups: | comp.compilers |
Date: | 9 Nov 2000 12:09:40 -0500 |
Organization: | Compilers Central |
References: | 00-11-058 |
Keywords: | syntax, design |
Posted-Date: | 09 Nov 2000 12:09:40 EST |
CC: | compilers@iecc.com, cfc@world.std.com |
This is not a philosophical answer, but a technical answer about your
three rules:
> I assume (perhaps incorrectly) that all languages that allow statements to
> span more than one line must meet one of the following requirements:
>
> 1) Statement terminators required.
> 2) Line continuation indicator required.
> 3) Completely unambiguous grammar without 1) or 2)
To have a language without statement terminators (or statement
separators) and without line continuations that still has an
unambiguous grammar, each statement must begin with a distinct token
that can be recognized as starting a new statement (rather than
continuing the previous one). However, if the language does not have
to be LR(1) or LL(1), the distinguishing token can appear one or two
tokens into the rule.
One very obvious example of this is the traditional grammar for yacc.
The language is LR(2) (and is often implemented by hackery that fudges
the two tokens of required lookahead into one token). The key
features of the rules are shown below:
rules: rule*; // a yacc grammar has a series of rules
rule: id ":" id* ("|" id*)* ";"?;
// each rule begins with an id followed by a colon
// and ends with a list of ids (usually followed by a semi)
It is the optional semi-colon (statement terminator) that causes the
grammar to be LR(2). If you leave out the semis, you can still tell
one rule from the next by finding the colons. However, by the time
you have found the colon, you have already read the id that names the
next rule, so you have gone one token too far (thus 2 tokens of
lookahead are required).
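As a rough illustration of that two-token lookahead, here is a minimal Python sketch that splits a simplified yacc-style input into rules whether or not the semis are present. It is not how yacc itself is implemented; the names tokenize, is_id and split_rules are made up for this example, and the token set (ids, ":", "|", ";") is an assumption that ignores actions, literals and declarations.

```python
import re

def tokenize(text):
    # Assumed token set: identifiers plus ":", "|" and ";".
    return re.findall(r"[A-Za-z_][A-Za-z_0-9]*|[:|;]", text)

def is_id(tok):
    return tok is not None and tok not in (":", "|", ";")

def split_rules(text):
    """Return a list of (rule_name, alternatives) pairs."""
    toks = tokenize(text)
    n = len(toks)
    rules = []
    i = 0

    def peek(k):
        # Token k positions ahead of the current one, or None at end of input.
        return toks[i + k] if i + k < n else None

    while i < n:
        # Every rule starts with: id ":"
        if not is_id(peek(0)) or peek(1) != ":":
            raise SyntaxError(f"expected 'id :' at token {i}")
        name = toks[i]
        i += 2
        alternatives = [[]]
        while i < n:
            tok = toks[i]
            if tok == ";":
                i += 1          # explicit terminator: rule is done
                break
            if tok == "|":
                alternatives.append([])
                i += 1
                continue
            if tok == ":":
                raise SyntaxError(f"unexpected ':' at token {i}")
            # tok is an id. The second token of lookahead decides whether
            # it belongs to this rule or names the next one.
            if peek(1) == ":":
                break           # leave the id for the next rule
            alternatives[-1].append(tok)
            i += 1
        rules.append((name, alternatives))
    return rules
```

For example, split_rules("a : b c d : e | f ; g : h") yields three rules named a, d and g. The d is only recognized as a rule name because the colon after it is inspected before d is consumed.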
Interestingly, with most old dialects of BASIC, you could do the same
thing (removing both line numbers and end-of-lines from the grammar).
All statements began with a keyword, with the notable exception of
assignment, where the "let" keyword was optional but a required "="
followed the target variable. Thus, yacc is not merely an aberration.
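The BASIC case can be sketched the same way. The Python fragment below is only a toy under stated assumptions: the keyword set is a small hypothetical subset, the function names are invented for this example, and comparisons using "=" (as in IF ... THEN) are deliberately left out, since this naive splitter would mistake them for an implicit assignment where a real parser would rely on grammar context.

```python
import re

# Assumed subset of statement keywords; real dialects had many more.
KEYWORDS = {"LET", "PRINT", "INPUT", "GOTO", "END"}

def tokenize(text):
    # Identifiers, integer literals, and a few operator characters.
    return re.findall(r"[A-Za-z][A-Za-z0-9]*|\d+|[=+\-*/()]", text)

def split_statements(text):
    """Split a token stream into statements with no line ends or terminators."""
    toks = tokenize(text)
    statements = []
    current = []
    for i, tok in enumerate(toks):
        nxt = toks[i + 1] if i + 1 < len(toks) else None
        is_keyword = tok.upper() in KEYWORDS
        # An identifier followed by "=" starts an assignment with "let" omitted.
        is_implicit_let = tok[0].isalpha() and not is_keyword and nxt == "="
        # Unless it is the target of an explicit LET that was just read.
        after_let = len(current) == 1 and current[0].upper() == "LET"
        if is_keyword or (is_implicit_let and not after_let):
            if current:
                statements.append(current)
            current = []
        current.append(tok)
    if current:
        statements.append(current)
    return statements
```

Called on "LET A = 1 B = A + 2 PRINT B GOTO 10", this returns four statements: the explicit LET, the implicit assignment to B, the PRINT, and the GOTO. The implicit assignment is the one place where the splitter has to look one token past the statement's first token.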
Hope this helps,
-Chris
*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
3 Proctor Street voice : (508) 435-5016
Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)