Re: Interface (Grammar precedence/notation) question

Hans-Peter Diettrich <DrDiettrich1@aol.com>
29 Dec 2006 22:58:31 -0500

          From comp.compilers

Related articles
Interface (Grammar precedence/notation) question cfc@shell01.TheWorld.com (Chris F Clark) (2006-12-28)
Re: Interface (Grammar precedence/notation) question DrDiettrich1@aol.com (Hans-Peter Diettrich) (2006-12-29)
Re: Interface (Grammar precedence/notation) question haberg@math.su.se (2006-12-29)
Re: Interface (Grammar precedence/notation) question cfc@shell01.TheWorld.com (Chris F Clark) (2007-01-05)
| List of all articles for this month |

From: Hans-Peter Diettrich <DrDiettrich1@aol.com>
Newsgroups: comp.compilers
Date: 29 Dec 2006 22:58:31 -0500
Organization: Compilers Central
References: 06-12-108
Keywords: design
Posted-Date: 29 Dec 2006 22:58:31 EST

Chris F Clark wrote:


> I am in the process of considering a new feature for a new version of
> Yacc++


I'm not familiar with Yacc nor Yacc++, so my following remarks may be of
little help.




> The question is, will this reversing of the precedence of and and or
> be a bad interface idea?


Yes, definitely.
I'd suggest a different wording, at least.




> family inside_grammar: LL(1);
> family outside_grammar: LALR(1);
>
> Now, note that because the inside_grammar is embedded inside the
> outside_grammar, it must be LALR(1), because otherwise the checks on
> the outside grammar will fail.


Some kind of inheritance?


> If we wanted to relax that constraint, we would write:
>
> family inside_grammar: LL(1);
> family outside_grammar: MOSTLY LALR(1);
>
> That would mean the parser generator would start checking
> outside_grammar using the LALR(1) rules, but when it came to
> inside_grammar, since it was marked LL(1), it would apply the LL(1)
> checks instead.


I wonder what's the purpose of the grammar classification.
Shall it describe the properties (and syntax?) of existing grammars, or
shall it indicate what grammar class is acceptable for the intended parser?


In the latter case, what's the purpose of such restrictions?


As indicated above, wouldn't it be sufficient to specify an
compatibility tree (or graph) for the grammar types? Then the parser
type can be determined from the given grammars, and it can be restricted
to some "level" in the tree. This would lead to a distinction between
the grammar and parser types, where grammar types describe the given
grammars, and the parser type describes the desired parser.




> family grammar: mostly ll;
> family expression, if_statement: lr;
>
> The nice thing about specifying something that way, is that one gets
> assurances that the vast majority of the language is easy to parse, as
> it is LL, but one has the freedom to use the more powerful LR parsing
> algorithm in the places where one needs it, so that one doesn't have
> to create all those nit-picky non-terminals to encode the precedence
> hierarchy.


IMO precedence and binding can be added to LL grammars, in the same way
as they already can be added to LR grammars. It's a matter of the
implementation, whether such an annotated LL grammar will be
auto-expanded into a traditional LL grammar, internally, or whether it
will be implemented immediatly in a specialized rule (subparser) class.


I'd also favor an optional disambiguation by sequential application of
alternatives (in LL parsers), turning LL(1) errors into non-fatal
warnings. The resulting code would look like:
      if (token in FirstOfAlt1) Alt1()
      else if (token in FirstOfAlt2) Alt2()
      ...
instead of the traditional
      case (token) of {...}




> What should I call LL grammars that depend on the hack that allow them
> to deal with the if-then-else ambiguity?


Also the implied handling of the dangling else, by a "longest match"
attribute, IMO is applicable to every grammar type?




> Technically, such grammars aren't actually LL. However, since most LL
> parser generators can deal with them, it seems wrong to not allow one
> to express that a grammar fits in the class the such parser generators
> can handle. I'm thinking of calling them "near LL". However, if there
> is an established term for them (or if near LL is used for something
> else), I'm open to other names.


I'd suggest ELL, according to EBNF ;-)




> And, now to why I want the precedence reversed. Suppose one wants a
> gramar where any fragment should either be LL or LALR, but the top set
> of rules should be LALR. It seems natural to write it as thus.
>
> family grammar: mostly lalr and lalr or ll;
> or
> family grammar: lalr or ll and mostly lalr;
>
> Instead of requiring parenthesis as in:
>
> family grammar: mostly lalr and (lalr or ll);


What about:
      family grammar: mostly lalr, lalr or ll;
where IMO "mostly" describes the parser type, the remainder the grammar
types.


> I guess if I implicitly "and" multiple family rules, then one could
> write it as:
>
> family grammar: lalr or ll; // any part of this grammar must be lalr or ll


Why that? What's the purpose or the consequences of according
(mis-)behaviour?


> family grammar: mostly lalr; // the top rule(s) must be lalr


Here:
      family parser: lalr;




As a general note, you should make clear what's the purpose of the
presented syntax, before going into details. I'm missing indications,
when e.g. grammar syntax, checks or extensions are addressed, and where
the parser type and generation is of concern.


Apart from LL, LR etc. grammar types, the grammar syntax should be
addressed. IMO EBNF will allow to describe both LL and LR grammars,
whereas BNF may exclude an LL interpretation, due to explicitly unrolled
repetitions (into left or right recursion).


DoDi


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.