Re: Grammar for optional elements

"Lowell Thomas" <lowell@coasttocoastresearch.com>
Fri, 22 Jun 2007 08:40:28 -0400

          From comp.compilers

From: "Lowell Thomas" <lowell@coasttocoastresearch.com>
Newsgroups: comp.compilers
Date: Fri, 22 Jun 2007 08:40:28 -0400
Organization: Compilers Central
References: 07-06-019 07-06-029 07-06-045 07-06-050
Keywords: parse
Posted-Date: 22 Jun 2007 15:34:50 EDT

Tony Finch <dot@dotat.at> wrote:
>... The fix is similar to the one you gave in your
>SABNF version, i.e. replace the attr in attr* expression with an
>expression that doesn't match attr1: (!attr1 attr)


Yes, thanks for making that connection. The expression


*(!A B) A


can be read as: keep matching B until A is found, then match A. It is
the equivalent of the *repeat-until* operator of SABNF. There's more on
this in my previous post at (2006-06-03).
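
As a rough illustration (not APG code), the expression can be sketched with
a few hand-rolled PEG-style combinators in Python. All the helper names here
(lit, any_char, not_, seq, star, repeat_until) are made up for this sketch,
and the literal "end" simply stands in for A:

```python
# A parser is a function (text, pos) -> new position, or None on failure.

def lit(s):
    """Match the literal string s."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def any_char(text, pos):
    """Match any single character."""
    return pos + 1 if pos < len(text) else None

def not_(p):
    """Syntactic predicate !p: succeed, consuming nothing, iff p fails."""
    def parse(text, pos):
        return pos if p(text, pos) is None else None
    return parse

def seq(*parsers):
    """Concatenation: every parser must match, one after the other."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def star(p):
    """Greedy repetition *p: zero or more matches, never fails."""
    def parse(text, pos):
        while True:
            nxt = p(text, pos)
            if nxt is None or nxt == pos:
                return pos
            pos = nxt
    return parse

def repeat_until(a, b):
    """*(!A B) A"""
    return seq(star(seq(not_(a), b)), a)

A = lit("end")          # stand-in for A
B = any_char            # stand-in for B
p = repeat_until(A, B)

print(p("xxendyy", 0))  # 5: two B's, then A
print(p("xxyy", 0))     # None: B runs out before A is found
```

The !A test is what keeps the greedy repetition from running past A; without
it, *B would swallow "end" as well and the final A could never match.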




Detlef Meyer-Eltz <Meyer-Eltz@t-online.de> wrote:


>> adding line enders for readability of the input string:
>> attr1 = name1 ":" value ";" CRLF !suffix1
>> suffix1 = *(!name1 any) name1
>> any = %d10-127
>> CRLF = %d13.10
>
>I'm not familiar with ABNF, but I suspect, that adding the line enders
>is not only for readability, but is essential for the success of the
>example code. Wouldn't a suffix otherwise always consume or at least
>look ahead the entire remainder of the string too?


ABNF (Augmented BNF, RFC 4234) is the IETF-preferred grammar form and
is used for the syntax description of most internet protocols. It is
similar in expressiveness to EBNF with a few extra features.
ABNF does not, however, define the syntactic predicate operators ! and &.
Those are features of APG 5.0, taken from PEG, not ABNF.


The line ender makes no difference to the success of the example.
The range for *any* covers ASCII characters 10-127, which includes the
line ender, so it will consume everything. The !name1 term is the only
thing that will stop the repetitions. For a well-formed input, *suffix1*
will always consume the entire remainder of the string, but the ! operator
will backtrack to the beginning character. That's the price you pay for
doing this in the syntax. The game is to try to choose *any* in such a way
as to minimize the hit.
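
To make the cost concrete, here is a small hand-written Python sketch of
*suffix1* and the !suffix1 test that also counts how many characters get
stepped over. It is only an illustration: the spelling "attribute1" for
name1 and the sample inputs are assumptions, and the function names are
invented for the sketch.

```python
NAME1 = "attribute1"  # assumed spelling of name1

def suffix1(text, pos):
    """suffix1 = *(!name1 any) name1, with any = %d10-127.

    Returns (end position or None, characters stepped over).
    """
    steps = 0
    while not text.startswith(NAME1, pos):      # !name1 succeeds
        if pos >= len(text) or not (10 <= ord(text[pos]) <= 127):
            return None, steps                  # any fails, and name1 is not here
        pos += 1                                # any
        steps += 1
    return pos + len(NAME1), steps              # name1 found

def not_suffix1(text, pos):
    """!suffix1: succeeds iff suffix1 fails; consumes nothing either way."""
    end, steps = suffix1(text, pos)
    return end is None, steps

# Remainder of a well-formed input, with and without line enders.
with_crlf = "attribute2:b;\r\nattribute3:c;\r\n"
without_crlf = "attribute2:b;attribute3:c;"
# Remainder that repeats name1.
repeated = "attribute2:b;\r\nattribute1:c;\r\n"

print(not_suffix1(with_crlf, 0))     # (True, 30)
print(not_suffix1(without_crlf, 0))  # (True, 26)
print(not_suffix1(repeated, 0))      # (False, 15)
```

In both well-formed cases the predicate succeeds only after walking the whole
remainder, with or without the line enders; only the number of characters
walked changes.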


The above *suffix1* actually suffers from the defect that *value* is
allowed to contain the sub-string "attribute1", in which case the test
will fail. But this can easily be fixed. In fact, once the grammar for
*value* is given, you can usually find some hook to define a working
*any* which is more efficient than the worst-case scenario, which is
to use


any = attr


This choice will always solve the original problem in the syntax but
surrenders completely on the optimization challenge. But as much fun
as this problem is, I suppose any application developer with even a
modest performance constraint will do it in the semantics anyway.
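
For comparison, doing it in the semantics might look something like the
following Python sketch. It assumes the goal is that no attribute name
appears twice, that each attribute has the shape name ":" value ";" CRLF as
in the attr1 rule quoted above, and that *value* is any run of characters
other than ";" (the thread does not give the real *value* grammar). The
function name parse_attrs is invented for the sketch.

```python
import re

# Assumed shape of one attribute: name ":" value ";" CRLF
# Assumed value grammar: any run of characters other than ";".
ATTR = re.compile(r"([A-Za-z0-9]+):([^;]*);\r\n")

def parse_attrs(text):
    """Parse every attribute; reject a repeated name in the semantic step."""
    seen = {}
    pos = 0
    while pos < len(text):
        m = ATTR.match(text, pos)
        if m is None:
            raise ValueError(f"syntax error at offset {pos}")
        name, value = m.group(1), m.group(2)
        if name in seen:  # semantic check: single pass, no lookahead
            raise ValueError(f"duplicate attribute {name!r}")
        seen[name] = value
        pos = m.end()
    return seen

# A value that happens to contain "attribute1" is not a problem here.
print(parse_attrs("attribute1:x;\r\nattribute2:attribute1;\r\n"))
```

Each character is visited once, and a value containing the text of another
attribute's name is never mistaken for that attribute.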


Lowell Thomas
www.coasttocoastresearch.com

