Related articles |
---|
Looking for symbol table hashing algorithm wal@cosc.canterbury.ac.nz (Warwick Irwin) (1998-07-03) |
Re: Looking for symbol table hashing algorithm dwight@pentasoft.com (1998-07-03) |
Re: Looking for symbol table hashing algorithm drunelson@my-dejanews.com (1998-07-03) |
Re: Looking for symbol table hashing algorithm jmccarty@sun1307.spd.dsccc.com (1998-07-03) |
Re: Looking for symbol table hashing algorithm zs@cs.mu.OZ.AU (1998-07-05) |
Re: Looking for symbol table hashing algorithm chase@naturalbridge.com (David Chase) (1998-07-05) |
Re: Looking for symbol table hashing algorithm dietz@interaccess.com (Paul Dietz) (1998-07-05) |
Re: Looking for symbol table hashing algorithm khays@sequent.com (1998-07-05) |
Re: Looking for symbol table hashing algorithm eodell@pobox.com (1998-07-08) |
Re: Looking for symbol table hashing algorithm preston@tera.com (1998-07-08) |
Re: Looking for symbol table hashing algorithm drh@microsoft.com (Dave Hanson) (1998-07-08) |
[8 later articles] |
From: | zs@cs.mu.OZ.AU (Zoltan Somogyi) |
Newsgroups: | comp.compilers |
Date: | 5 Jul 1998 01:13:24 -0400 |
Organization: | Computer Science, The University of Melbourne |
References: | 98-07-030 |
Keywords: | symbols |
>[My impression is that so long as you avoid obviously horrible hashes
>that do things like making all single letter symbols collide, it
>doesn't make much difference. -John]
That would be true, except that the set of "obviously horrible" hashes
is quite large. Real data sets exhibits several kinds of similarity
between strings to be hashed, and designing an algorithm that
successfully deals with *all* these similarities is a non-trivial
exercise.
A paper that discusses these issues is:
Performance in practice of string hashing functions'',
M.V. Ramakrishna and J. Zobel, Proceedings of the Fifth International
Conference on Database Systems for Advanced Applications, Melbourne,
Australia, April 1997, pp. 215--223.
Its abstract reads:
String hashing is a fundamental operation, used in countless
applications where fast access to distinct strings is required. In
this paper we describe a class of string hashing functions and explore
its performance. In particular, using experiments with both small sets
of keys and a large key set from a text database, we show that it is
possible to achieve performance close to that theoretically predicted
for hashing functions. We also consider criteria for choosing a
hashing function and use them to compare our class of functions to
other methods for string hashing. These results show that our class
of hashing functions is reliable and efficient, and is therefore an
appropriate choice for general-purpose hashing.
Zoltan Somogyi <zs@cs.mu.OZ.AU> http://www.cs.mu.oz.au/~zs/
Department of Computer Science, University of Melbourne, AUSTRALIA
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.