Re: Looking for symbol table hashing algorithm

khays@sequent.com (Kirk Hays)
5 Jul 1998 21:35:21 -0400

          From comp.compilers

Related articles
Looking for symbol table hashing algorithm wal@cosc.canterbury.ac.nz (Warwick Irwin) (1998-07-03)
Re: Looking for symbol table hashing algorithm dwight@pentasoft.com (1998-07-03)
Re: Looking for symbol table hashing algorithm drunelson@my-dejanews.com (1998-07-03)
Re: Looking for symbol table hashing algorithm jmccarty@sun1307.spd.dsccc.com (1998-07-03)
Re: Looking for symbol table hashing algorithm zs@cs.mu.OZ.AU (1998-07-05)
Re: Looking for symbol table hashing algorithm chase@naturalbridge.com (David Chase) (1998-07-05)
Re: Looking for symbol table hashing algorithm dietz@interaccess.com (Paul Dietz) (1998-07-05)
Re: Looking for symbol table hashing algorithm khays@sequent.com (1998-07-05)
Re: Looking for symbol table hashing algorithm eodell@pobox.com (1998-07-08)
Re: Looking for symbol table hashing algorithm preston@tera.com (1998-07-08)
Re: Looking for symbol table hashing algorithm drh@microsoft.com (Dave Hanson) (1998-07-08)
Re: Looking for symbol table hashing algorithm smith@aragorn.uni-trier.de (1998-07-10)
Re: Looking for symbol table hashing algorithm scott@basis.com (1998-07-10)
Re: Looking for symbol table hashing algorithm henry@spsystems.net (1998-07-10)
[5 later articles]
| List of all articles for this month |

From: khays@sequent.com (Kirk Hays)
Newsgroups: comp.compilers
Date: 5 Jul 1998 21:35:21 -0400
Organization: Sequent Computer Systems, Inc.
References: 98-07-030 98-07-042
Keywords: symbols

Zoltan Somogyi <zs@cs.mu.OZ.AU> wrote:
>>[My impression is that so long as you avoid obviously horrible hashes
>>that do things like making all single letter symbols collide, it
>>doesn't make much difference. -John]
>
>That would be true, except that the set of "obviously horrible" hashes
>is quite large. Real data sets exhibits several kinds of similarity
>between strings to be hashed, and designing an algorithm that
>successfully deals with *all* these similarities is a non-trivial
>exercise.


I'll put in a plug for using AVL trees (or other balanced trees) at
this point - they don't show degenerate performance on any case, and
performance is good on real-world (and machine generated) codes, in my
experience.


In particular, having had to write compilers that had to deal with a
lot of machine-generated code, I will forevermore avoid hashing as a
technique where the input data is unknown.


For any given hash, it is possible to design a worst-case input.


Murphy states you will encounter that input.
--
Kirk Hays
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.