Generating hash tables from known input (I think) (Robert Virding)
Mon, 10 May 1993 17:07:18 GMT

          From comp.compilers

Related articles
Generating hash tables from known input (I think) (1993-05-10)
Re: Generating hash tables from known input (I think) (1993-05-11)
Re: Generating hash tables from known input (1993-05-17)
Re: Generating hash tables from known input (bibliography) (1993-05-21)
Re: Generating hash tables from known input (I think) (1993-05-24)
| List of all articles for this month |

Newsgroups: comp.compilers
From: (Robert Virding)
Keywords: question, comment
Organization: Ellemtel Telecom Systems Labs, Stockholm, Sweden
Date: Mon, 10 May 1993 17:07:18 GMT

The functional language Erlang which we have designed, implemented and
used (see blurb at the end of the message) has pattern matching. I would
now like optimise the implementation of this pattern matching by adding
indexing. The algorithm I am now using (freely translated from Simon P.J.s
book) handles the breaking down of patterns into different data types and
splitting up compound types, so I am left with the problem of indexing
constants. We have both numeric constants (integers and floats) and
symbolic constants (atoms).

The problem, as I see it, is how to build an optimal hash table from known
set of hash values, and how to generate these hash values. Whether
collisions are handled by rehashing or buckets is irrelevent from my point
of view. Can anyone help me with suggestions on the how to do it or
suitable references? Is this the best method?

A second question: what are acceptable values for the maximum and average
lengths of collisions chains? I have done some experiments and it would be
intersting know I if I have got good results or not.

Thanks for any help,


-------------------- The Blurb --------------------

Erlang - "Concurrent Programming in Erlang", J. Armstrong, M. Williams
& R. Virding, Prentice Hall, 1993. ISBN 13-285792-8.

        Classification: Concurrent functional programming language for large
industrial real-time systems. Untyped. Pattern matching syntax.
Recursion equations. Explicit concurrency, asynchronous message passing.
Relatively free from side effects. Transparent cross-platform
distribution. Primitives for detecting run-time errors. Real-time GC.
Modules. Dynamic code replacement (change code in running real-time
system, without stopping system). Foreign language interface.

        Availability: Free version (subject to non-commercial licence) with no
support. Commercial versions with support are available (Erlang Systems


FTP Info:
[Given a fixed set of tokens, it is often possible to construct a ``perfect''
hashing function that hashes each of the tokens into a unique bucket in a
dense table, although constructing the function is invariably quite slow.
Usually one can get adequate performance with a sufficiently large hash
table (several times more slots than entries) and any old hash function that
avoids degenerate cases. What's "adequate" depends on the rest of the
application; adequate usually means that it doesn't contribute enough to
the total runtime to be worth reworking. -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.