Re: Looking for symbol table hashing algorithm

David Chase <chase@naturalbridge.com>
5 Jul 1998 01:17:17 -0400

          From comp.compilers

Related articles
Looking for symbol table hashing algorithm wal@cosc.canterbury.ac.nz (Warwick Irwin) (1998-07-03)
Re: Looking for symbol table hashing algorithm dwight@pentasoft.com (1998-07-03)
Re: Looking for symbol table hashing algorithm drunelson@my-dejanews.com (1998-07-03)
Re: Looking for symbol table hashing algorithm jmccarty@sun1307.spd.dsccc.com (1998-07-03)
Re: Looking for symbol table hashing algorithm zs@cs.mu.OZ.AU (1998-07-05)
Re: Looking for symbol table hashing algorithm chase@naturalbridge.com (David Chase) (1998-07-05)
Re: Looking for symbol table hashing algorithm dietz@interaccess.com (Paul Dietz) (1998-07-05)
Re: Looking for symbol table hashing algorithm khays@sequent.com (1998-07-05)
Re: Looking for symbol table hashing algorithm eodell@pobox.com (1998-07-08)
Re: Looking for symbol table hashing algorithm preston@tera.com (1998-07-08)
Re: Looking for symbol table hashing algorithm drh@microsoft.com (Dave Hanson) (1998-07-08)
Re: Looking for symbol table hashing algorithm smith@aragorn.uni-trier.de (1998-07-10)
[7 later articles]
| List of all articles for this month |

From: David Chase <chase@naturalbridge.com>
Newsgroups: comp.compilers
Date: 5 Jul 1998 01:17:17 -0400
Organization: NaturalBridge LLC
References: 98-07-030
Keywords: symbols, theory

Warwick Irwin wrote:
>
> Can anyone point me to a reference for an industrial-strength hashing
> algorithm suitable for implementing a symbol table in a C++ parser?
> I would like something that has been tuned for large C++ programs.


Do you want a hash function, or a hashing algorithm?


There was an article, I think in CACM, that spoke well of a
table-based algorithm. Simply fill an array of 256 integers with
random numbers, and use an algorithm along the lines of:


UINT hash(char * s) {
      UINT h = 0;
      UINT c = *s++;
      while (c != 0) {
            c = h ^ c;
            h = h ^ table[c&255];
            c = *s++;
      }
}


You can replace the xors with adds or subtracts if you want, you can
replace the c&255 with c>>24. The table full of random numbers gives
you a real leg up, as does the order dependence (that is, the step
where c is combined with the previous hash value to form the index
into the table).


Generating that many random bits could be a bit of a pain; you might
just snag 8 1024-bit PGP keys from random people. That should be good
enough. A dedicated generator of random numbers would flip a number
of coins (don't laugh, I've done it; in the amount of time it takes to
find/code/whatever a good RNG, you can flip an awful lot of coins).


I'm also plenty fond of using residues in GF(2**32), but the algorithm
is very similar to the previous one.


David Chase
NaturalBridge LLC
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.