Re: Implementing an "in" operator

"Steve Siegfried" <sos@zjod.net>24 Nov 2002 01:20:38 -0500

From comp.compilers

Related articles
Implementing an "in" operator ed_davis2@yahoo.com (Ed Davis) (2002-11-13)
Re: Implementing an "in" operator n502640813.ch@chch.demon.co.uk (Charles Bryant) (2002-11-20)
Re: Implementing an "in" operator sos@zjod.net (Steve Siegfried) (2002-11-24)
Re: Implementing an "in" operator ed_davis2@yahoo.com (Ed Davis) (2002-11-26)
| List of all articles for this month |

 From: "Steve Siegfried" Newsgroups: comp.compilers Date: 24 Nov 2002 01:20:38 -0500 Organization: Ziggy's Auto Body and Tanning References: 02-11-079 Keywords: code Posted-Date: 24 Nov 2002 01:20:38 EST

> Any ideas on how one would implement an "in" operator?
>
> As in (no pun intended):
>
> if (a in b,c,d..e)
> statements;
>
> translates to:
>
> if (a == b or a == c or (a >= d and a <=e))
> statements;
>
> I don't have a problem with unary and binary operators, but
> this one has got me stumped. Part of my simple
> (integer-only) expression parser is shown below:
>
> static Tree *expr(int p) {
> Tree *t = primary(); /* also handles unary operators */
> while (isBinaryOperator(tok.sym) &&
> precedence(tok.sym) >= p) {
> const Symtype op = tok.sym;
> getsym();
> t = mkBinaryNode(op, t,
> expr(associativity(op) == Right ?
> precedence(op) : 1 + precedence(op)));
> }
> return t;
> }
>
> Any ideas on how "in" might be implemented in the above?
> [Treat comma and .. as operators long enough to build up a tree of
> stuff you can translate when you deal with the "in", perhaps. -John]

You're making it too complicated.

All the elements in a set form an enumerated type that "names" all
possible set members with either an implicit enumeration (think: "set
of integers") or an explicit one (think: "set of crayons in a box").

The traditional way of implementing this is with a bit-vector, wherein
each element of the set gets a dedicated bit in a (usually
fixed-length and usually 0 based, but sometimes variable length and
non-0 based and sometimes even sparse) bit-vector.

When a bit is 1, it means the corresponding element is a member of the
set. When that same bit is 0, it means that the element isn't a
member of the set. Thus, the i'th member of the enumeration forms the
i'th member of the set (and usually occupies the i'th bit).

Expressions operating on bit-sets then become relatively straightforward
for code generators (elements are lowercase, SETS aren't):

expression C translation (assumes sets fit in an unsigned long)
================= ====================================================
[a] (1 << a)
[b,c] ((1 << b) | (1 << c))
a in X ((1 << a) & X)?1:0
union(A,B) (A | B)
intersection(A,B) (A & B)
empty(A) (A?:1:0)
pred(a) (((a-1) >= 0)?(a-1):runtime_error("pred", a))
et cetera

Parsing this then becomes following the BNF (which I'm sure your
professor gave you when he assigned the problem) and generating the IR
as if "in" was any other binary operator.

Thus the whole trick behind implementing sets in a compiler is not in
the operators, but in the internal representation of sets and
set-constants.

Wondering if anyone remembers Pascal'idly,

-S

Post a followup to this message