|Pascal compiler and sets firstname.lastname@example.org (1994-02-08)|
|Re: Pascal compiler and sets email@example.com (1994-02-08)|
|Re: Pascal compiler and sets firstname.lastname@example.org (1994-02-08)|
|Re: Pascal compiler and sets email@example.com (1994-02-09)|
|Pascal compiler and sets firstname.lastname@example.org (1994-02-09)|
|Re: Pascal compiler and sets email@example.com (1994-02-10)|
|Re: Pascal compiler and sets firstname.lastname@example.org (1994-02-10)|
|Re: Pascal compiler and sets synaptx!thymus!daveg@uunet.UU.NET (Dave Gillespie) (1994-02-10)|
|From:||Dave Gillespie <synaptx!thymus!daveg@uunet.UU.NET>|
|Date:||Thu, 10 Feb 1994 22:15:45 GMT|
Dave Allison asks about implementing sets in Pascal.
When I wrote a Pascal compiler a few years ago the problem of sets seemed
daunting and mysterious, but I found a good way to think about it: Pretend
you are implementing a "string" type, but with bits instead of characters
as the components. Many Pascal dialects have a dynamically-sized string
var s1 : string;
s1 := 'howdy, world';
if s1 = 'foo' then ...
A string is typically an array of 101 bytes where the first is a
All the problems of set allocation come up with strings, but somehow
strings seem to be easier to visualize and think about.
I know at least one commercial compiler which implements sets as vectors
of words where the first word is a length; this is also how I implement
sets larger than 32 elements in p2c. All the various set operations, like
"setcpy", "setequal", "setunion", and so on, are written in terms of
these. The length word effectively gives the position of the last
non-zero word in the bit vector; it helps with the case where the set is
large but mostly empty. (For example, a "set of char" needs space for 256
bits, but the latter 128 bits will hardly ever be used in practice.)
If you want range checking, have your "setcpy" function take the maximum
allowable size of the destination as an extra argument.
Set operators like "s1 + s2" need to allocate temporaries, but for each
operator it's easy for the compiler to determine statically a bound on the
size of the temporary. (The larger of the sizes for "+", the smaller of
the sizes for "*", etc.) In p2c, I had my code generator create extra
temporaries and then optimize them away later; this works well for both
sets and strings:
str1 := str2 + str3
=> strcpy(str1, sprintf(TEMP, "%s%s", str2, str3))
=> sprintf(str1, "%s%s", str2, str3)
set1 := set2 + set3
=> setcpy(set1, setunion(TEMP, set2, set3))
=> setunion(set1, set2, set3)
where again you might want to pass the maximum size of the destination as
an extra argument to "setunion".
Set literals are not too much trickier to deal with than string literals.
The fact that "" is typeless is a bit of a nuisance, but then, no more
so than the "nil" pointer literal.
Return to the
Search the comp.compilers archives again.