Related articles |
---|
Masters course with compiler specialization jeremy.wright@microfocus.com (Jeremy Wright) (2002-11-12) |
Re: Masters course with compiler specialization Trevor.Jenkins@suneidesis.com (2002-12-11) |
Re: Masters course with compiler specialization etechweb@yahoo.com (2002-12-19) |
Size of hash tables was Re: Masters course with compiler specializat Trevor.Jenkins@suneidesis.com (2002-12-22) |
Re: Size of hash tables was Re: Masters course ... joachim_d@gmx.de (Joachim Durchholz) (2002-12-30) |
Re: Size of hash tables was Re: Masters course ... neeri@iis.ee.ethz.ch (Matthias Neeracher) (2003-01-04) |
Re: Size of hash tables was Re: Masters course ... bonzini@gnu.org (2003-01-04) |
Re: Size of hash tables was Re: Masters course ... stephan@stack.nl (Stephan Eggermont) (2003-01-07) |
From: bonzini@gnu.org (Paolo Bonzini)
Newsgroups: comp.compilers
Date: 4 Jan 2003 22:47:23 -0500
Organization: http://groups.google.com/
References: 02-11-060 02-12-056 02-12-092 02-12-107 02-12-127
Keywords: symbols, performance
Posted-Date: 04 Jan 2003 22:47:23 EST
> > Since the publication of Maurer's paper "An improved hash code for
> > scatter storage" in the Comm of the ACM (vol 11, Jan 1968, pp 35--38)
> > it is taken as gospel that hash tables only work if the size is a
> > prime number.
>
> Not "only work". Just "distribute their keys in a more random fashion,
> assuming you don't have a priori knowledge about key distribution".
You can always design a hash function good enough that its output
does not need the additional bit-shuffling that a modulo-prime
operation provides. Taking strings as keys, a prime size is only
required when each character affects just a few specific, clustered
bits of the hash, instead of having some chance of affecting every
bit.
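As a rough illustration of a string hash in which every character has some chance of affecting every bit, here is a minimal sketch using Bob Jenkins' one-at-a-time hash. That choice is an assumption made for the example (the post does not name a particular function), and `MASK32` and `bucket_index` are hypothetical names.

```python
MASK32 = 0xFFFFFFFF  # keep arithmetic in 32 bits, as a C implementation would


def one_at_a_time(key: bytes) -> int:
    """Bob Jenkins' one-at-a-time hash, 32-bit variant."""
    h = 0
    for byte in key:
        h = (h + byte) & MASK32
        h = (h + (h << 10)) & MASK32  # spread the new byte into higher bits
        h ^= h >> 6                   # fold higher bits back into lower ones
    # Final mixing so the last characters also reach every bit.
    h = (h + (h << 3)) & MASK32
    h ^= h >> 11
    h = (h + (h << 15)) & MASK32
    return h


def bucket_index(key: bytes, table_size: int) -> int:
    """Pick a bucket in a power-of-two table by masking, with no prime modulus."""
    if table_size <= 0 or table_size & (table_size - 1):
        raise ValueError("table_size must be a power of two")
    return one_at_a_time(key) & (table_size - 1)
```

Because the right shifts feed high bits back into the low ones, the low bits kept by the mask depend on the whole key, which is the property the paragraph above asks for.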
> Particularly on modern hardware, where division and bit masking have
> roughly the same execution cost. Could anybody clarify?
Only roughly (and only if you can parallelize things well -- a
division has several cycles of latency, while you can do several
bit-masking operations in the same clock cycle). Why do things roughly
if it's simple to do them better? :-)
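For reference, the masking alternative to the division looks like this. It is only a sketch of the equivalence, not a timing measurement, and `reduce_by_mask` is a hypothetical name.

```python
def reduce_by_mask(h: int, table_size: int) -> int:
    """Reduce a non-negative hash to a bucket index without dividing.

    Only valid when table_size is a power of two.
    """
    if table_size <= 0 or table_size & (table_size - 1):
        raise ValueError("table_size must be a power of two")
    return h & (table_size - 1)


# For power-of-two sizes the mask gives the same index as the modulo.
for h in (0, 3, 8, 12345, 0xDEADBEEF):
    assert reduce_by_mask(h, 8) == h % 8
```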
You can also do double hashing with power-of-two hash tables; just
make sure that you set the low bit of the secondary hash, so that the
step is odd and therefore coprime to the table size. For example, with
size 8 and primary hash 3, a secondary hash of 4 does not work: you
only examine two buckets (3 and 7). With a secondary hash of 5 you
examine all eight buckets (in this order: 3, 0, 5, 2, 7, 4, 1, 6).
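The size-8 example can be replayed with a small sketch. The names `probe_order` and `safe_step` are made up for illustration, and the table size is assumed to be a power of two.

```python
def probe_order(table_size: int, primary: int, secondary: int) -> list[int]:
    """Buckets visited by double hashing until the sequence starts repeating."""
    mask = table_size - 1  # valid only when table_size is a power of two
    order = []
    bucket = primary & mask
    while bucket not in order:
        order.append(bucket)
        bucket = (bucket + secondary) & mask
    return order


def safe_step(secondary: int) -> int:
    """Set the low bit so the step is odd, hence coprime to a power-of-two size."""
    return secondary | 1


print(probe_order(8, 3, 4))             # [3, 7]
print(probe_order(8, 3, safe_step(4)))  # [3, 0, 5, 2, 7, 4, 1, 6]
```

Forcing the low bit turns the bad step 4 into 5, which reproduces the full eight-bucket order given above.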
Paolo