Compiling expressions

James Harris <>
Sat, 29 Dec 2012 05:11:16 -0800 (PST)

          From comp.compilers

Related articles
Compiling expressions (James Harris) (2012-12-29)
Re: Compiling expressions (glen herrmannsfeldt) (2012-12-29)
Re: Compiling expressions (Dmitry A. Kazakov) (2012-12-30)
Re: Compiling expressions (James Harris) (2013-01-02)
Re: Compiling expressions (James Harris) (2013-01-02)
Re: Compiling expressions ( (2013-01-03)
Re: Compiling expressions (2013-01-03)
[6 later articles]
| List of all articles for this month |

From: James Harris <>
Newsgroups: comp.compilers
Date: Sat, 29 Dec 2012 05:11:16 -0800 (PST)
Organization: Compilers Central
Keywords: parse, question
Posted-Date: 29 Dec 2012 16:55:05 EST

Compiling expressions is turning out to be more 'interesting' than I
anticipated! My requirements I believe to be fairly generic but they
seem not to be supported by standard algorithms so it's not as simple
as it might be. I thought I would post here as much out of interest as
anything. I'm not after a prebuilt solution but would be interested to
hear from other folks who have had similar issues to address. The
requirements are:

1. Hand-written, not the output of a parser generator.
2. Efficient and without backtracking.
3. Precedences (and possibly associativities) defined in tables.
4. Output to be a tree structure.
5. Parenthesised subexpressions allowed.
6. Some operator families are *not* to associate with each other. See
7. Monadic prefix, dyadic infix and monadic postfix operators are all
8. Prefix and infix operators can use some same symbols (e.g. minus

Infix and postfix operators use distinct symbols. For example, if a
certain symbol were used as a postfix operator it could not also be
used as an infix operator.

Point 6 about some operator families not associating is because, at
least in the proposed version of the language, bitwise operators

    and, xor, or, not

are not to have any defined precedence relative to the arithmetic

    plus, minus, times, power, unary minus etc

Operators in these two families are to have precedences relative to
other symbols in the same family and to families below and above them
but there is no defined precedence between, say, plus and bitwise and.
Some examples of operators of lower precedence would be comparisons
(less than, equal to etc) and boolean conjunctions (logical and,
logical or etc). Operators of higher precedence include binding,
dereference, function application etc.

So if the compiler sees

    a + b & c

then it is to complain about + and & being adjacent. One of the sides
must be parenthesised such as either of these two:

    (a + b) & c
    a + (b & c)

By contrast the following are both OK without needing parens because <
can interact with either family.

    a + b < x
    b & c < x

I have considered: recursive calls, shunting yard stacking of
operators, stacking of pairs (symbol, operand), of triples
(precedence, symbol, operand), some bitwise modification of stacked
precedences and some table-driven recognition of exceptions.

I have tried to avoid a two-dimensional grid-based solution but having
concluded it may be the way to go my copy of the dragon book advises
that such a solution does not work with unary minus! It says it should
be separated out in the lexer. That is something I don't want to do.
The parsing should be controlled wholly in the expression parser.

Perhaps strangely prefix operators seem to me easy enough to
distinguish from infix as they are seen in two different contexts.
What is acceptable before an operand is not the same as what is
acceptable after an operand.

Am currently thinking this might need a grid per context - i.e. two
grids - but I've never seen an algorithm with such. Er, fun this,
isn't it!


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.