Subject: | Re: Grammar for optional elements |
From: | "Lowell Thomas" <lowell@coasttocoastresearch.com> |
Newsgroups: | comp.compilers |
Date: | Fri, 22 Jun 2007 08:40:28 -0400 |
Organization: | Compilers Central |
References: | 07-06-019 07-06-029 07-06-045 07-06-050 |
Keywords: | parse |
Posted-Date: | 22 Jun 2007 15:34:50 EDT |
Tony Finch <dot@dotat.at> wrote:
>... The fix is similar to the one you gave in your
>SABNF version, i.e. replace the attr in attr* expression with an
>expression that doesn't match attr1: (!attr1 attr)
Yes, thanks for making that connection. The expression
*(!A B) A
can be read as: keep matching B until A is found, then match A. It is the
equivalent of the *repeat-until* operator of SABNF. There's more on this
in my previous post (2006-06-03).
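
As a rough illustration of that reading, here is a minimal Python sketch of a matcher for *(!A B) A. The helper names (literal, any_char, repeat_until) are made up for this example and are not part of APG or SABNF; each matcher takes the text and a position and returns the new position, or None on failure.

```python
def literal(s):
    """Matcher for a fixed string: returns the new position, or None on failure."""
    def match(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return match


def any_char(text, pos):
    """Matcher for any single character."""
    return pos + 1 if pos < len(text) else None


def repeat_until(match_a, match_b):
    """Matcher for *(!A B) A: repeat B while A does not match, then match A."""
    def match(text, pos):
        while match_a(text, pos) is None:    # !A succeeds: A does not start here
            nxt = match_b(text, pos)
            if nxt is None or nxt == pos:    # B failed (or matched nothing): stop
                break
            pos = nxt
        return match_a(text, pos)            # the trailing A
    return match


skip_to_end = repeat_until(literal("end"), any_char)
print(skip_to_end("abc end", 0))   # position just past "end"
print(skip_to_end("abc", 0))       # None: A never appears
```

Note that the predicate is evaluated at every step of the repetition, which is where the cost of doing this in the syntax comes from.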
Detlef Meyer-Eltz <Meyer-Eltz@t-online.de> wrote:
>> adding line enders for readability of the input string:
>> attr1 = name1 ":" value ";" CRLF !suffix1
>> suffix1 = *(!name1 any) name1
>> any = %d10-127
>> CRLF = %d13.10
>
>I'm not familiar with ABNF, but I suspect, that adding the line enders
>is not only for readability, but is essential for the success of the
>example code. Wouldn't a suffix otherwise always consume or at least
>look ahead the entire remainder of the string too?
ABNF (Augmented BNF, RFC 4234) is the IETF-preferred grammar form and
is used for the syntax description of most internet protocols. It is
similar in expressiveness to EBNF with a few extra features.
ABNF does not, however, define the syntactic predicate operators ! and &.
Those are features of APG 5.0, taken from PEG, not ABNF.
The line ender makes no difference to the success of the example.
The range for *any* covers ASCII characters 10-127, which includes the
line ender, so it will consume everything. The !name1 term is the only
thing that will stop the repetitions. For a well-formed input, *suffix1*
will always consume the entire remainder of the string, but the ! operator
will backtrack to the beginning character. That's the price you pay for
doing this in the syntax. The game is to try to choose *any* in such a way
as to minimize the hit.
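
To make the cost concrete, here is a small Python sketch of *suffix1* and the !suffix1 predicate that also counts how many characters the scan steps over. It is an illustration only, not APG output; the function names are invented, and the literal "attribute1" for name1 is an assumption, since the rule for name1 is not quoted here.

```python
def suffix1(text, pos, name1="attribute1"):
    """Sketch of  suffix1 = *(!name1 any) name1  with  any = %d10-127.

    Returns (end, steps): end is the position just past name1, or None if
    the rule fails; steps counts the characters consumed by 'any'.
    The value of name1 is an assumption made for this example.
    """
    steps = 0
    while pos < len(text) and not text.startswith(name1, pos):
        if not 10 <= ord(text[pos]) <= 127:   # character outside 'any'
            break
        pos += 1
        steps += 1
    end = pos + len(name1) if text.startswith(name1, pos) else None
    return end, steps


def not_suffix1(text, pos, name1="attribute1"):
    """!suffix1: succeeds exactly when suffix1 fails, and consumes nothing."""
    end, steps = suffix1(text, pos, name1)
    return end is None, steps


with_crlf = "attribute2:x;\r\nattribute3:y;\r\n"
without_crlf = "attribute2:x;attribute3:y;"
print(not_suffix1(with_crlf, 0))      # predicate holds; scan reached the end
print(not_suffix1(without_crlf, 0))   # same outcome without line enders
```

In both inputs the predicate succeeds only after stepping over every remaining character, with or without the line enders.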
The above *suffix1* actually suffers from the defect that *value* is allowed
to contain the sub-string "attribute1", in which case the test will fail.
But this can easily be fixed. In fact, once the grammar for *value* is
given, you can usually find some hook to define a working *any* which is
more efficient than the worst-case scenario, which is to use
any = attr
This choice will always solve the original problem in the syntax but
surrenders completely on the optimization challenge. But as much fun
as this problem is, I suppose any application developer with even a
modest performance constraint will do it in the semantics anyway.
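
For what it's worth, the difference between the two choices of *any* can be sketched in Python. Everything about the attribute shape below is an assumption made for illustration (a name of letters and digits, a value without ";" or line enders, name1 being the literal "attribute1"); none of it comes from the actual grammar.

```python
import re

# Hypothetical attribute shape: name ":" value ";" CRLF.
# Assumed here: the value contains no ";" and no line enders.
ATTR = re.compile(r"[A-Za-z0-9]+:[^;\r\n]*;\r\n")


def name1_follows_charwise(rest, name1="attribute1"):
    """suffix1 with any = %d10-127: name1 is found anywhere, even inside a value.

    Assumes every character of rest lies in the range 10-127.
    """
    return name1 in rest


def name1_follows_attrwise(rest, name1="attribute1"):
    """suffix1 with any = attr: test name1 only at attribute boundaries."""
    pos = 0
    while not rest.startswith(name1, pos):   # !name1
        m = ATTR.match(rest, pos)            # any = attr
        if m is None:
            return False                     # no more attributes to skip
        pos = m.end()
    return True


tricky = "attribute2:see attribute1;\r\nattribute3:y;\r\n"
print(name1_follows_charwise(tricky))   # True: fooled by the value
print(name1_follows_attrwise(tricky))   # False: no attribute named attribute1
```

The attribute-wise version gives the right answer, but it has to parse every remaining attribute in full just to evaluate the predicate.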
Lowell Thomas
www.coasttocoastresearch.com