Re: split tokens; was LR(1) Parsing : Error Handling & Recovery

Evangelos Drikos <drikosev@otenet.gr>
Wed, 03 Sep 2014 16:57:45 +0300

          From comp.compilers


From: Evangelos Drikos <drikosev@otenet.gr>
Newsgroups: comp.compilers
Date: Wed, 03 Sep 2014 16:57:45 +0300
Organization: An OTEnet S.A. customer
References: 14-07-023 14-07-024 14-07-030 14-07-031 14-07-038 14-07-039 14-07-040 14-07-049 14-07-051 14-08-009 14-08-010
Keywords: parse, bison
Posted-Date: 03 Sep 2014 10:02:48 EDT

On 8/29/14 12:13 PM, Hans Aberg wrote:
> On 2014/08/26 12:44, Evangelos Drikos wrote:
>> ... What the Bison team said about your idea?
>>...
> ... we discussed split tokens, which it cannot currently handle
> ... The main focus is on computer languages, though it
> would be nice to be able to experiment with natural languages.


Split tokens are also useful for computer languages such as SQL, which
have unreserved keywords; an efficient solution, however, requires a
fair amount of code.


Bison & Flex are fast and mature tools that can be combined efficiently
in a simple and straightforward manner for some computer languages.


But Bison's GLR parser probably cannot parse some context-free grammars.
One such grammar with an epsilon rule, given by Nozohoor-Farshi[NF91],
broke Tomita's original algorithm; a slightly modified version for Bison
is given here:


G1: S
    ;
S: A S 'b'
    | 'x'
    ;
A: {printf("Delayed forever!\n");}
    ;


Given the string "xbbb", a Bison (2.3) GLR parser generated from the
above grammar goes into an infinite loop. Perhaps this well-known issue
is either solved with a hack or not perceived as a bug, precisely
because the main focus is on computer languages.
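

To illustrate why this grammar is awkward for a bottom-up parser, here is
a rough Python sketch. It is not Bison's algorithm, only a toy simulation
under my own assumptions; the names parse_with_guess and
accepting_guesses are made up for this example. The point it shows: the
parser has to reduce the empty rule A once per trailing 'b' before it
shifts 'x', and that count is not known until the whole input is read.

```python
# Grammar from the post:  S -> A S 'b' | 'x' ;   A -> (empty)
# Toy simulation only; this is NOT how Bison's GLR implementation works.

def parse_with_guess(text, k):
    """Simulate a bottom-up parse that performs exactly k empty
    A-reductions before the first shift (k is a guess, hypothetical)."""
    stack = ["A"] * k              # k reductions of A -> (empty)
    pos = 0
    if pos >= len(text) or text[pos] != "x":
        return False
    pos += 1                       # shift 'x'
    stack.append("S")              # reduce S -> 'x'
    # Every pending A needs one 'b' to complete S -> A S 'b'.
    while (len(stack) >= 2 and stack[-2:] == ["A", "S"]
           and pos < len(text) and text[pos] == "b"):
        pos += 1                   # shift 'b'
        del stack[-2:]             # pop A and S (the 'b' was just shifted)
        stack.append("S")          # reduce S -> A S 'b'
    # Accept only if all input is consumed and a single S remains.
    return stack == ["S"] and pos == len(text)

def accepting_guesses(text, limit=10):
    """Return every guess k below limit for which the parse succeeds."""
    return [k for k in range(limit) if parse_with_guess(text, k)]

if __name__ == "__main__":
    print(accepting_guesses("xbbb"))
```

For "xbbb" only the guess k = 3 succeeds. A parser that cannot represent
"any number of empty A reductions" in a finite structure has to keep
trying larger and larger k, which is presumably what the endless loop
amounts to.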


One could probably find a plethora of other open-source tools for NLP.


Regards,
Ev. Drikos

