Related articles |
---|
Work around ambiguities in LL(1) assembler parser m@il.it (=?iso-8859-1?b?RnLpZOlyaWM=?=) (2008-01-06) |
Re: Work around ambiguities in LL(1) assembler parser wyrmwif@tsoft.org (SM Ryan) (2008-01-07) |
Re: Work around ambiguities in LL(1) assembler parser DrDiettrich1@aol.com (Hans-Peter Diettrich) (2008-01-07) |
Re: Work around ambiguities in LL(1) assembler parser gene.ressler@gmail.com (Gene) (2008-01-07) |
From: | Gene <gene.ressler@gmail.com> |
Newsgroups: | comp.compilers |
Date: | Mon, 7 Jan 2008 17:33:44 -0800 (PST) |
Organization: | Compilers Central |
References: | 08-01-015 |
Keywords: | assembler, parse |
Posted-Date: | 08 Jan 2008 12:25:19 EST |
On Jan 6, 6:04 pm, Fridiric <m...@il.it> wrote:
> Writing an assembler, I'm stuck with ambiguities with my syntax. The
> syntactic analysis is a simple LL(1) recursive descent parser. Here is
> the issue (look at the "(" as a FIRST)
>
> LDA (data),Y ; #1 Indirect address indexed by Y
>
> index EQU 123
> LDA (index-1)*3,Y ; #2 Absolute address indexed by Y
> LDA 3*(index-1),Y ; Equivalent but non ambiguous
>
> An except of the grammar :
>
> line = mnemonic oper
> oper = "(" addr ")" "," "Y"
> | addr "," "Y"
> addr = ["+"|"-"] term { ["+"|"-"] term }
> ... etc
>
> The "(" of the _oper_ rule conflicts with the one from the _addr_
> expression. Do I get this right ?
>
> Is there a work around ?
Yeah.
I am assuming your grammar ends with
term = factor { [ * | / ] factor }
factor = id | numliteral | ( addr )
and that your semantics are such that any expression with outer parens
implies an indirect address.
There are a couple of solutions.
The best is to change the syntax. E.g. use square brackets for
indirect addresses. Ambiguity is a clue that human parsers will have
as much trouble reading the code as the machine.
Otherwise, the purely syntactic one is to come up with an expression
grammar that precludes outer parentheses. This is possible, but
messy. Before left-factoring you'd have something like
addr = T [+|-] T { [+|-] T } | T'
T' = F [*|/] T | F'
F' = id | numliteral
T = F { [*|/] F }
F = ( addr ) | F'
Much simpler, as another poster suggested, Is to parse all operands as
addr and then check after the fact to see if there were surrounding
parens.
Return to the
comp.compilers page.
Search the
comp.compilers archives again.