Related articles |
---|
[9 earlier articles] |
Re: Hash specifics bart@cs.uoregon.edu (1990-12-13) |
Re: Hash specifics roth@resi.rice.edu (1990-12-14) |
Re: Hash specifics oz@nexus.yorku.ca (1990-12-14) |
Re: Hash specifics plx!plxsun!evan@Sun.COM (1990-12-15) |
Re: Hash specifics vern@horse.ee.lbl.gov (1990-12-16) |
Re: Hash specifics schmidt@oberkampf.ICS.UCI.EDU (Doug Schmidt) (1990-12-17) |
Re: Hash specifics preston@ariel.rice.edu (1990-12-17) |
Re: Hash specifics brain@eos.ncsu.edu (1990-12-18) |
Re: Hash specifics rfpfeifl@watcgl.uwaterloo.ca (Ron Pfeifle) (1990-12-21) |
Newsgroups: | comp.compilers |
From: | preston@ariel.rice.edu (Preston Briggs) |
Keywords: | design |
Organization: | Rice University, Houston |
References: | <1990Dec12.221425.569@zoo.toronto.edu> <2980@lupine.NCD.COM> <EVAN.90Dec15155336@plxsun.uucp> |
Date: | Mon, 17 Dec 90 15:58:58 GMT |
plx!plxsun!evan@Sun.COM (Evan Bigall) writes:
>The time when I always find myself looking for a perfect hash function, is
>when I am parsing long keyword options (ie: -ansi etc...) or some other case
>when I have a small set of possibilities but want to avoid using:
>
> if (!strcmp("hello", arg))
> {}
> else if (!strcmp("there", arg))
> {}
> else if (!strcmp("world", arg))
Another poster (Ira Baxter?) mentioned how to handle these efficiently
in a typical compiler-like case. I'll expand a little.
Since we're building a symbol table anyway, we're going to need to hash every
identifier anyway. I use one hash table to hash keywords, identifiers, and
strings as part of my scanner. Each time a new id or string is encountered,
I allocate a new integer for it. If the integer is < some small number, an
id is actually a keyword.
The rest of the front-end simply looks at integers which serve as indices
into a string table. Makes it easy to switch on "strings" when all are
reduced to integers.
This works especially well with Knuth's WEB system. WEB has a notion of
preprocessed-strings. I.e.
"a string"
When Tangling, a string table is maintained and all the strings are replaced
in the output text with integers. Further, the string table is written to a
pool file, suitable for initializing your compiler's hash table (or TeX or
whatever).
Pre-processed strings are also a nice way to handle error messages and so
forth, potentially saving much storage.
LISPers will wonder what the fuss is about, as they use symbols as a matter
of course.
Preston Briggs
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.