Re: terminological problem (EBNF & regular expressions)

RLake@oxfam.org.uk
26 Oct 2005 14:42:55 -0400

          From comp.compilers

Related articles
[2 earlier articles]
Re: terminological problem (EBNF & regular expressions) paul@parsetec.com (Paul Mann) (2005-10-17)
Re: terminological problem (EBNF & regular expressions) paul@parsetec.com (Paul Mann) (2005-10-19)
Re: terminological problem (EBNF & regular expressions) Meyer-Eltz@t-online.de (Detlef Meyer-Eltz) (2005-10-19)
Re: terminological problem (EBNF & regular expressions) paul@parsetec.com (Paul Mann) (2005-10-20)
Re: terminological problem (EBNF & regular expressions) Meyer-Eltz@t-online.de (Detlef Meyer-Eltz) (2005-10-23)
Re: terminological problem (EBNF & regular expressions) paul@parsetec.com (Paul Mann) (2005-10-26)
Re: terminological problem (EBNF & regular expressions) RLake@oxfam.org.uk (2005-10-26)
| List of all articles for this month |
From: RLake@oxfam.org.uk
Newsgroups: comp.compilers
Date: 26 Oct 2005 14:42:55 -0400
Organization: Compilers Central
References: 05-10-104 05-10-125 05-10-137 05-10-147
Keywords: parse
Posted-Date: 26 Oct 2005 14:42:55 EDT

>>> LRgen allows one to do it like this: [X|Y|Z]/','...
>> It seems like one would have to say


>> ((X|Y|Z)(,(X|Y|Z))*)?


>> to accomplish the same thing.


Ah, yes... I've also seen that operator written as // and \\. I think it's
called "interpolate" but I don't know where I remember that from. In
effect, it is the inverse to the Haskell "intersperse" (often called "join"
in scripting languages).


It's interesting that if you have an epsilon (empty) constant, you can
reduce Kleene regular expressions to three binary operators:


concatenate
alternate
interpolate


because, using traditional regex notation and // for interpolate:


a? ==> a | epsilon
a* ==> epsilon // a
a+ ==> a // epsilon


I don't know that users like the parsimony of expression, but it is
convenient for implementation in languages which allow operator overriding
but don't like *, ? and + as postfix operators.


I'm a little surprised that it isn't in more common use. There are few uses
in lexers, but it comes up all the time in grammars, not just for
expression lists and parameter lists. For example, the Lua if statement:


    if-statement ::= "if"
                                                ( expression "then" block // "elseif" )
                                                ( "else" expression | epsilon )
                                      "end"


which leads to a possible formulation for C / Java if statements:


    if-statement ::= "if" ( "(" expression ")" statement' // "else" "if" )
                                                ( "else" statement' | epsilon )


(where statement' is all statements other than if-statements)


Also, for many purposes, this is sufficiently detailed:


    expression ::= term // binary-operator


But it can be parameterized (if this rule is part of the grammar standard
library, the
user only need define the binary-operator[p] classes, and at some point
decide about
their associativity):


    expression[p] ::= expression[p+1] // binary-operator[p]


(I think I got that from Lex Augusteijn's Front system, but I can't be sure
any more.)


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.