Using LALR machine to disambiguate tokens

"Monty Hall" <chickenkungpao@hotmail.com>
7 Sep 2004 23:58:33 -0400

From comp.compilers

From:	"Monty Hall" <chickenkungpao@hotmail.com>
Newsgroups:	comp.compilers
Date:	7 Sep 2004 23:58:33 -0400
Organization:	SBC http://yahoo.sbc.com
Keywords:	LALR
Posted-Date:	07 Sep 2004 23:58:33 EDT

        I just finished an LALR(k) DFA generator that also generates the lexer's
regular-expression DFA, in hopes of creating an integrated parse/lex rapid
development tool that's relatively 'hands free'. One thing that I am toying
with is disambiguating tokens. Consider the RE/grammar BNF snippet below:

    string = [a-z]+
    <start> ::= 'max' 'lookahead' '=' int
              | 'start' 'rule' '=' int
              | string '=' int

        When a token may assume only one accept symbol, I simply find it
annoying that max, lookahead, start, and rule are also matched by string's
DFA. One common solution that I've seen is:

    <start> ::= string string
            { string[0] = 'max' && string[1] = 'lookahead' .....}

        I was thinking of using the LALR(k) machine to disambiguate
tokens. It could be done by adding a bitmask of allowable input to each
LALR state and using lookahead only if the bitmasking still yields a truly
ambiguous token. Does anybody have information on the topic of token
disambiguation or parsing keywordless programming languages (pitfalls,
concerns & considerations) and, if possible, as it relates to an LR
machine?
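
To make the bitmask idea concrete, here is a minimal sketch in Python. It is not the generator's actual code: the token-kind names, the hand-written SHIFT table (shift transitions only, reductions omitted, so it only approximates real LALR states), the [0-9]+ pattern for int, and the single lexeme of lookahead are all assumptions made for illustration.

```python
import re

# One bit per token kind (hypothetical names for this sketch).
STRING, MAX, LOOKAHEAD, START, RULE, EQ, INT = (1 << i for i in range(7))
KEYWORDS = {"max": MAX, "lookahead": LOOKAHEAD, "start": START, "rule": RULE}

# Simplified shift table for the grammar above: state -> {token kind: next state}.
# Reductions are left out, so this is only an approximation of LALR states.
SHIFT = {
    0: {MAX: 1, START: 2, STRING: 3},
    1: {LOOKAHEAD: 4},
    2: {RULE: 5},
    3: {EQ: 6}, 4: {EQ: 6}, 5: {EQ: 6},
    6: {INT: 7},
    7: {},
}

def allowed(state):
    """Bitmask of the token kinds the parser can shift in this state."""
    mask = 0
    for kind in SHIFT[state]:
        mask |= kind
    return mask

def candidates(lexeme):
    """Bitmask of every token kind whose pattern matches the lexeme."""
    if lexeme == "=":
        return EQ
    if re.fullmatch(r"[0-9]+", lexeme):   # assumed definition of int
        return INT
    if re.fullmatch(r"[a-z]+", lexeme):   # string = [a-z]+, plus any keyword
        return STRING | KEYWORDS.get(lexeme, 0)
    return 0

def is_single_bit(mask):
    return mask != 0 and (mask & (mask - 1)) == 0

def disambiguate(state, lexeme, next_lexeme=None):
    """Pick one token kind for lexeme, using one lexeme of lookahead if needed."""
    mask = candidates(lexeme) & allowed(state)
    if mask and not is_single_bit(mask) and next_lexeme is not None:
        # Truly ambiguous: keep only the kinds whose successor state
        # can shift some reading of the next lexeme.
        nxt = candidates(next_lexeme)
        survivors = 0
        for kind, target in SHIFT[state].items():
            if (kind & mask) and (nxt & allowed(target)):
                survivors |= kind
        mask = survivors
    if not is_single_bit(mask):
        raise ValueError(f"cannot disambiguate {lexeme!r} in state {state}")
    return mask
```

With this table, disambiguate(1, "lookahead") is settled by the state mask alone, while disambiguate(0, "max", "lookahead") and disambiguate(0, "max", "=") need the lookahead and come out as MAX and STRING respectively.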

Regards,

Monty
chickenkungpao@hotmail.com
