Re: Work around ambiguities in LL(1) assembler parser

Gene <gene.ressler@gmail.com>
Mon, 7 Jan 2008 17:33:44 -0800 (PST)

          From comp.compilers

Related articles
Work around ambiguities in LL(1) assembler parser m@il.it (=?iso-8859-1?b?RnLpZOlyaWM=?=) (2008-01-06)
Re: Work around ambiguities in LL(1) assembler parser wyrmwif@tsoft.org (SM Ryan) (2008-01-07)
Re: Work around ambiguities in LL(1) assembler parser DrDiettrich1@aol.com (Hans-Peter Diettrich) (2008-01-07)
Re: Work around ambiguities in LL(1) assembler parser gene.ressler@gmail.com (Gene) (2008-01-07)
| List of all articles for this month |
From: Gene <gene.ressler@gmail.com>
Newsgroups: comp.compilers
Date: Mon, 7 Jan 2008 17:33:44 -0800 (PST)
Organization: Compilers Central
References: 08-01-015
Keywords: assembler, parse
Posted-Date: 08 Jan 2008 12:25:19 EST

On Jan 6, 6:04 pm, Fridiric <m...@il.it> wrote:


> Writing an assembler, I'm stuck with ambiguities with my syntax. The
> syntactic analysis is a simple LL(1) recursive descent parser. Here is
> the issue (look at the "(" as a FIRST)
>
> LDA (data),Y ; #1 Indirect address indexed by Y
>
> index EQU 123
> LDA (index-1)*3,Y ; #2 Absolute address indexed by Y
> LDA 3*(index-1),Y ; Equivalent but non ambiguous
>
> An except of the grammar :
>
> line = mnemonic oper
> oper = "(" addr ")" "," "Y"
> | addr "," "Y"
> addr = ["+"|"-"] term { ["+"|"-"] term }
> ... etc
>
> The "(" of the _oper_ rule conflicts with the one from the _addr_
> expression. Do I get this right ?
>
> Is there a work around ?


Yeah.


I am assuming your grammar ends with


          term = factor { [ * | / ] factor }
          factor = id | numliteral | ( addr )


and that your semantics are such that any expression with outer parens
implies an indirect address.


There are a couple of solutions.


The best is to change the syntax. E.g. use square brackets for
indirect addresses. Ambiguity is a clue that human parsers will have
as much trouble reading the code as the machine.


Otherwise, the purely syntactic one is to come up with an expression
grammar that precludes outer parentheses. This is possible, but
messy. Before left-factoring you'd have something like


addr = T [+|-] T { [+|-] T } | T'
T' = F [*|/] T | F'
F' = id | numliteral
T = F { [*|/] F }
F = ( addr ) | F'


Much simpler, as another poster suggested, Is to parse all operands as
addr and then check after the fact to see if there were surrounding
parens.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.