From: | David Chase <chase@naturalbridge.com> |
Newsgroups: | comp.compilers |
Date: | 5 Jul 1998 01:17:17 -0400 |
Organization: | NaturalBridge LLC |
References: | 98-07-030 |
Keywords: | symbols, theory |
Warwick Irwin wrote:
>
> Can anyone point me to a reference for an industrial-strength hashing
> algorithm suitable for implementing a symbol table in a C++ parser?
> I would like something that has been tuned for large C++ programs.
Do you want a hash function, or a hashing algorithm?
There was an article, I think in CACM, that spoke well of a
table-based algorithm. Simply fill an array of 256 integers with
random numbers, and use an algorithm along the lines of:
UINT hash(char * s) {
UINT h = 0;
UINT c = *s++;
while (c != 0) {
c = h ^ c;
h = h ^ table[c&255];
c = *s++;
}
}
You can replace the xors with adds or subtracts if you want, you can
replace the c&255 with c>>24. The table full of random numbers gives
you a real leg up, as does the order dependence (that is, the step
where c is combined with the previous hash value to form the index
into the table).
Generating that many random bits could be a bit of a pain; you might
just snag 8 1024-bit PGP keys from random people. That should be good
enough. A dedicated generator of random numbers would flip a number
of coins (don't laugh, I've done it; in the amount of time it takes to
find/code/whatever a good RNG, you can flip an awful lot of coins).
I'm also plenty fond of using residues in GF(2**32), but the algorithm
is very similar to the previous one.
David Chase
NaturalBridge LLC
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.