From: Paul Dietz <email@example.com>
Date: 11 Jul 1998 23:40:49 -0400
Organization: The World's Usenet -- http://www.Supernews.com
References: 98-07-030 98-07-045 98-07-084
David Chase wrote:
> What this means is that the bucket chains must get really long
> before chaining is costly, and with a grade A hash function, the
> odds of that happening are very low indeed. I lack the skill-time
> product to generally compute odds, but if you've got 16 bucket
> heads, the odds against adding 16 entries and landing them all in
> the same bucket are (I think) 2**60 to 1.
To make chaining have good expected time on any input set, it suffices
to choose a random hash function such that the probability that two
different keys hash to the same location is <= 1/(table size). That's
because the time to insert n keys is proportional to the expectation of
the sum of the n^2 random variables C(i,j), where C(i,j) = 1 if keys i
and j collide and 0 if they do not. The expectation of a sum of RVs is
the sum of the expectations, even if the RVs are not independent, so the
expected insertion time is O(n + n^2/(table size)).

Universal hash functions are covered in Cormen, Leiserson and Rivest.
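As a concrete illustration, here is a sketch in Python of the classic
universal family h_{a,b}(x) = ((a*x + b) mod p) mod m covered in that
book. The function names and the choice of prime below are my own,
not anything from this thread:

```python
import random

# A sketch of the classic universal family: pick a prime p larger than
# any key, then h_{a,b}(x) = ((a*x + b) mod p) mod m with random a, b.
# P and make_universal_hash are illustrative names, not from the post.

P = 2_147_483_647  # a prime (2^31 - 1) larger than every key we hash

def make_universal_hash(m, rng=random):
    """Pick one hash function at random from the universal family."""
    a = rng.randrange(1, P)   # a in {1, ..., p-1}
    b = rng.randrange(0, P)   # b in {0, ..., p-1}
    return lambda x: ((a * x + b) % P) % m

def average_colliding_pairs(keys, m, trials=100):
    """Average number of colliding pairs C(i,j) over random choices of h,
    estimating the expectation discussed above."""
    total = 0
    for _ in range(trials):
        h = make_universal_hash(m)
        buckets = {}
        for k in keys:
            buckets.setdefault(h(k), []).append(k)
        # each chain of length c contributes c*(c-1)/2 colliding pairs
        total += sum(len(c) * (len(c) - 1) // 2 for c in buckets.values())
    return total / trials
```

For 64 distinct keys in 64 buckets, the expected number of colliding
pairs is about C(64,2)/64 ≈ 31.5, matching the n^2/(table size) bound.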
> For scoped symbol tables, there's a stunt, I think due to Paul Dietz,
> that is useful in some applications.
> Rather than modify the symbol table to reflect the changing scopes
> as compilation moves through the program, you simply note which
> scope you are in; a reference to a symbol X takes the form "get
> scopes from symbol table for X, find current scope in set of
> scopes". Numbering the scopes with in and out numbers in a standard
> tree walk generates bracketing ranges that allow you to find the
> nearest enclosing scope in log N time (binary search).
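The numbering scheme quoted above can be sketched as follows; the code
and all names in it are my own illustration, not the original
implementation. A preorder walk gives each scope an (in, out) pair so
that scope A encloses scope B exactly when in(A) <= in(B) and
out(B) <= out(A), and each symbol keeps its declarations as bracketing
ranges sorted by in-number:

```python
import bisect

def number_scopes(tree, counter=None, numbering=None):
    """tree: (scope_name, [children]); assign (in, out) to every scope
    via one preorder walk."""
    if counter is None:
        counter, numbering = [0], {}
    name, children = tree
    counter[0] += 1
    lo = counter[0]
    for child in children:
        number_scopes(child, counter, numbering)
    counter[0] += 1
    numbering[name] = (lo, counter[0])
    return numbering

def build_entries(decls, numbering):
    """decls: {symbol: [declaring scopes]} ->
    {symbol: [(in, out, scope)] sorted by in-number}."""
    return {
        sym: sorted((numbering[s][0], numbering[s][1], s) for s in scopes)
        for sym, scopes in decls.items()
    }

def lookup(entries, sym, scope, numbering):
    """Find the declaration of sym in the nearest scope enclosing `scope`.

    Binary-search for the declaration with the largest in-number not
    exceeding in(scope), then step left past declarations from disjoint
    scopes until one brackets it.  (The post's log N bound exploits the
    bracketing structure more carefully; this walk keeps the sketch
    simple.)"""
    cur_in = numbering[scope][0]
    lst = entries.get(sym, [])
    i = bisect.bisect_right(lst, (cur_in, float("inf"), "")) - 1
    while i >= 0:
        lo, hi, decl_scope = lst[i]
        if lo <= cur_in <= hi:
            return decl_scope
        i -= 1
    return None
```

The point of the trick survives in the sketch: no entry is ever removed
from the symbol table as compilation leaves a scope; lookup alone picks
the visible declaration from the current scope's in-number.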
What I did was solve this problem in the case of a tree that is
growing by the addition of individual nodes, so one can't use a static
numbering. The original paper had an extra log* n factor, which Tarjan
immediately (I mean, seconds after the presentation) pointed out how
to avoid (see also the paper by Tsakalidis).
You can do the lookup in O(log log n) time on a RAM with O(log n) word
size using the van Emde Boas integer-set data structure. Probably not
terribly useful in practice.