Re: C Hashmap implementation

"cr88192" <>
26 Apr 2007 09:35:06 -0400

          From comp.compilers


From: "cr88192" <>
Newsgroups: comp.compilers
Date: 26 Apr 2007 09:35:06 -0400
Organization: Saipan Datacom
References: 07-04-089
Keywords: symbols
Posted-Date: 26 Apr 2007 09:35:06 EDT

"bison" <> wrote in message
> Hello. I'm looking into hashmap implementations for a VM based
> dynamically typed language. Since the implementation must allow me to
> resize on demand, I am a bit confused about the various approaches.
> Here is what I have found out so far:
> Almost everyone I've talked to has said that Chained Hashmaps are much
> easier to implement than Open Addressed maps.

I would agree here.
there are some cases where they make a lot more sense as well.

> Wikipedia suggests that an approach to resizing hashmaps is to
> allocate space for a newer hashmap and copy elements from the old one to
> the new table, and in some cases do it incrementally.

but then you have 2 hashes, and the old one stays around until all of its
entries have been copied over.

sometimes one can resize the table though, and use a graduated scheme:
first try the largest hash;
if failed, try the smaller hash;
when found there, copy the entry to the largest hash.
I have not tried this personally, and it could have contention issues (esp
if one uses a permutative lookup scheme, where removing an entry could make
existing entries unreachable, implying that a lookup can't simply stop at
the first empty slot, but must first check some n permutations).
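
as a rough illustration of the graduated scheme, here is a minimal sketch in
C. it assumes chained tables rather than open addressing (which sidesteps the
empty-slot problem mentioned above), and every name in it (Table, Entry,
hash_str, graduated_lookup, ...) is hypothetical, not taken from any real
library or from the original discussion.

```c
#include <stdlib.h>
#include <string.h>

/* All names here are hypothetical, chosen only for illustration. */
typedef struct Entry {
    struct Entry *next;
    const char *key;
    int value;
} Entry;

typedef struct {
    Entry **buckets;
    size_t nbuckets;
} Table;

/* djb2-style string hash (an assumption; any reasonable hash works). */
static size_t hash_str(const char *s)
{
    size_t h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

static Table table_new(size_t nbuckets)
{
    Table t;
    t.buckets = calloc(nbuckets, sizeof *t.buckets);
    t.nbuckets = nbuckets;
    return t;
}

/* Push an entry onto the front of its bucket's chain. */
static void table_link(Table *t, Entry *e)
{
    size_t i = hash_str(e->key) % t->nbuckets;
    e->next = t->buckets[i];
    t->buckets[i] = e;
}

/* Find key in t; if detach is nonzero, also unhook the entry from its chain. */
static Entry *table_find(Table *t, const char *key, int detach)
{
    Entry **pp = &t->buckets[hash_str(key) % t->nbuckets];
    for (; *pp; pp = &(*pp)->next) {
        if (strcmp((*pp)->key, key) == 0) {
            Entry *e = *pp;
            if (detach)
                *pp = e->next;
            return e;
        }
    }
    return NULL;
}

/* Graduated lookup: try the largest table first, then the smaller one.
   A hit in the smaller table is moved into the largest table. */
static Entry *graduated_lookup(Table *large, Table *small, const char *key)
{
    Entry *e = table_find(large, key, 0);
    if (e)
        return e;
    e = table_find(small, key, 1);
    if (e)
        table_link(large, e);
    return e;
}
```

with this layout, each successful lookup that falls through to the smaller
table also migrates one entry, so the smaller table drains as a side effect
of normal use and can be freed once it is empty.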

> Quick question about the last point: I'm curious about a starting
> point. How much space should a hashmap allocate initially, and when
> it's full, how much would I typically increase it? I realize
> there are lots of different answers here, so a good starting point
> would really help out.

for resizable things, usually I start out with a low-max guesstimate, say 256
to 4096.
usually I expand in increments of 50%, so the new size is 3/2 of the old size.

in practice, this seems a fairly good factor (avoids extra expansions while
still usually keeping a fairly tight fit on the data, vs say, doubling the
size each time).

if being more conservative, a 4/3 factor might be better than a 3/2 factor,
but I don't know...
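
to make the growth factors concrete, here is a small sketch in C. the helper
names (grow_capacity, expansions_needed) are hypothetical, and the
"always grow by at least one slot" rule is an added assumption so that very
small sizes cannot stall.

```c
#include <stddef.h>

/* Hypothetical helper: scale a capacity by num/den, rounding down
   (3/2 for 50% increments, 4/3 for the more conservative variant).
   Written as q*num + r*num/den to avoid overflowing cap*num. */
static size_t grow_capacity(size_t cap, size_t num, size_t den)
{
    size_t next = cap / den * num + cap % den * num / den;
    return next > cap ? next : cap + 1;   /* assumption: never stall */
}

/* Count the expansions needed to get from start to at least target. */
static unsigned expansions_needed(size_t start, size_t target,
                                  size_t num, size_t den)
{
    unsigned n = 0;
    while (start < target) {
        start = grow_capacity(start, num, den);
        n++;
    }
    return n;
}
```

starting at 256 and growing until the capacity reaches at least 4096, this
takes 7 expansions with a 3/2 factor, 10 with 4/3, and 4 with doubling; the
smaller factors trade more expansions for a tighter fit at each step.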
