Re: symbol tables and search tries

anton@mips.complang.tuwien.ac.at (Anton Ertl)
6 Feb 2006 00:05:25 -0500

          From comp.compilers

Related articles
[19 earlier articles]
Re: symbol tables and search tries DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2006-02-02)
Re: symbol tables and search tries blume@tti-c.org (Matthias Blume) (2006-02-03)
Re: symbol tables and search tries d148f3wg02@sneakemail.com (Karsten Nyblad) (2006-02-03)
Re: symbol tables and search tries henry@spsystems.net (2006-02-03)
Re: symbol tables and search tries DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2006-02-03)
Re: symbol tables and search tries haberg@math.su.se (2006-02-03)
Re: symbol tables and search tries anton@mips.complang.tuwien.ac.at (2006-02-06)
Re: symbol tables and search tries Suif.liyuan@gmail.com (jjyynmb) (2006-02-24)
| List of all articles for this month |
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.compilers
Date: 6 Feb 2006 00:05:25 -0500
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
References: 06-01-085 06-01-111 06-01-112 06-01-127 06-02-008 06-02-017
Keywords: symbols, performance
Posted-Date: 06 Feb 2006 00:05:25 EST

Matthias Blume <blume@tti-c.org> writes:
>If you double your hashtable when the load factor exceeds 1, then you
>waste an expected number of buckets that is bounded by something very
>close to the number of actual elements in the table. If you do not
>double the size, then you have to use more chain links in each bucket.
>In a typical implementation, the required number of extra pointers is
>the same as the number of those wasted buckets.


If you are using chaining, the number of wasted pointers is the number
of buckets: Even if a bucket is used, there is an unused pointer at
the end of the chain. So you can ensure that no more than one pointer
per element is wasted by always keeping the load factor >=1 (e.g., by
doubling when the load factor reaches 2).


Now if we compare this to a binary tree, the tree always wastes one
pointer per element: there are two pointers in each element, and one
pointer to the element, so one pointer is wasted on average.


So, if you can afford the space for a tree, you can also afford the
space for a hash table with chaining that doubles when the load factor
reaches 2.


- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/home.html


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.