|Masters course with compiler specialization firstname.lastname@example.org (Jeremy Wright) (2002-11-12)|
|Re: Masters course with compiler specialization Trevor.Jenkins@suneidesis.com (2002-12-11)|
|Re: Masters course with compiler specialization email@example.com (2002-12-19)|
|Size of hash tables was Re: Masters course with compiler specializat Trevor.Jenkins@suneidesis.com (2002-12-22)|
|Re: Size of hash tables was Re: Masters course ... firstname.lastname@example.org (Joachim Durchholz) (2002-12-30)|
|Re: Size of hash tables was Re: Masters course ... email@example.com (Matthias Neeracher) (2003-01-04)|
|Re: Size of hash tables was Re: Masters course ... firstname.lastname@example.org (2003-01-04)|
|Re: Size of hash tables was Re: Masters course ... email@example.com (Stephan Eggermont) (2003-01-07)|
|From:||Trevor.Jenkins@suneidesis.com (Trevor Jenkins)|
|Date:||22 Dec 2002 10:40:18 -0500|
|References:||02-11-060 02-12-056 02-12-092|
On 19 Dec 2002 12:41:25 -0500, Sebastiano Pilla <firstname.lastname@example.org> wrote:
> Trevor Jenkins <Trevor.Jenkins@suneidesis.com> wrote:
> > If Hopgood continues to teach this module then you'll at least come
> > out with the correct view that hash-tables based on the powers of 2
> > are better than Maurer's gospel that such tables can only ever be
> > absed on prime numbers.
> Could you please expand on that? I don't think I'll have the opportunity
> to attend Hopgood's course.
Since the publication of Maurer's paper "An improved hash code for
scatter storage" in the Comm of the ACM (vol 11, Jan 1968, pp 35--38)
it is taken as gospel that hash tables only work if the size is a
prime number. Ignoring for the moment any attempt to change the size
of a hash table the MODULO will require a divide operation. Divides
are (at least were) expensive operations to perform in hardware. For
the not so subtle influenceof this aper check out the Dragon books
where Aho/Ullman and Aho/Sethi/Ullman only mention the use of prime
numbers for the size of a hash table.
An alternative to prime numbers is to use powers of 2, which if I
remember Hopgood's lectures correctly were used in the Fortran II
compiler. Already this has become effecient. Why? The divide operation
can be replaced by the faster boolean AND operation. Plus there is no
serious penalty for increaing the size of the hash table. To increase
the size of a hash table using powers of 2 requires setting the next
bit along whereas with the Maurer doctrine you'll have to compute the
next prime number in sequence.
There are other benefits of using powers of two. Carefully planning
can make sure that the hash table starts on a page bondary and there
after an integral number of pages can be processed. Rehashing the
tabel will be faster for the same reason that inserting an individual
entry is faster.
For more information on Hopgood's approach see:
"The quadratic hash method when the table size is a power of 2"
Hopgood and Davenport, Computer Journal, vol 15, nov 1972, pp314--315
I do remember Hopgood saying that he wrtote the editor of Comm of the
ACM about Maurer's paper challenging the validity of the content. A
letter that so far remains unprinted --- even when Maurer's was
reprinted in a annivarsary issue of Comm of the ACM.
Return to the
Search the comp.compilers archives again.