Re: Opinions about "epsilon" Symbols in Parse Trees

Sylvain Schmitz <schmitz@i3s.unice.fr>
10 Jun 2005 22:15:12 -0400

          From comp.compilers

Related articles
Opinions about "epsilon" Symbols in Parse Trees drikosv@otenet.gr (Evangelos Drikos) (2005-06-09)
Re: Opinions about "epsilon" Symbols in Parse Trees schmitz@i3s.unice.fr (Sylvain Schmitz) (2005-06-10)
Re: Opinions about "epsilon" Symbols in Parse Trees DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2005-06-10)
Re: Opinions about "epsilon" Symbols in Parse Trees mefrill@yandex.ru (mefrill) (2005-06-12)
Re: Opinions about "epsilon" Symbols in Parse Trees news4e71@yahoo.com (0x4e71) (2005-06-12)
Re: Opinions about "epsilon" Symbols in Parse Trees DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2005-06-12)
Re: Opinions about "epsilon" Symbols in Parse Trees drikosv@otenet.gr (eDrikos) (2005-06-13)
| List of all articles for this month |

From: Sylvain Schmitz <schmitz@i3s.unice.fr>
Newsgroups: comp.compilers
Date: 10 Jun 2005 22:15:12 -0400
Organization: Compilers Central
References: 05-06-052
Keywords: parse,
Posted-Date: 10 Jun 2005 22:15:12 EDT

Evangelos Drikos wrote:
> (1) <sign> ::= - | + | NONE /* where NONE is the "epsilon" */
>
> (2) <integer> ::= <sign> { 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 } .
>
> Also, in this case, if the parser read an unsigned integer it doesn't
> represent it on the Parse Tree.
>
> Though, it could (or must) probably reduce the rule "<sign> -> NONE".


Yes it should.


> It was a straight forward example. But, let us define the Syntax Rule
> of an identifier that consists of a <Latin Letter> optionally followed
> by a sequence of <Latin Letter> or <digit>.
>
> The e-free Syntax Rule:
> "<Identifier> ::= <Latin Letter> [ { <Latin Letter> | <digit> } . ] "
> can be restated as:
>
> <Identifier> ::= <Identifier start> <Identifier part>
> <Identifier start> ::= <Latin Letter>
> <Identifier part> ::= { <Latin Letter> | <digit> | NONE } .
> /* where NONE is the "epsilon" */
>
> How many times should the Parser reduce the production "<Identifier
> part>-> NONE" when it read an identifier consisting of only one <Latin
> Letter>?
>
> None, one and why not four times?


This grammar snippet is highly ambiguous: it allows an unbounded
number of consecutive reductions of the empty string.


      A proper use of the empty string could be
<Identifier part> ::= <Identifier part> <Identifier char> | NONE
<Identifier char> ::= <Latin Letter> | <digit>,
in which case you would only see one reduction of the empty string.


      Now, since you are using an extended syntax, it might be wiser not
to allow empty rules, so that users would have to use your sequence
and option constructs. Having empty strings in the parse tree can
come in handy, for null-initialization of the list of elements in your
sequence construct for instance.


--
Hope that helps,


      Sylvain


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.