Re: symbol tables and search tries

haberg@math.su.se (Hans Aberg)
31 Jan 2006 00:37:22 -0500

          From comp.compilers

Related articles
[2 earlier articles]
Re: symbol tables and search tries gah@ugcs.caltech.edu (glen herrmannsfeldt) (2006-01-30)
Re: symbol tables and search tries haberg@math.su.se (2006-01-30)
Re: symbol tables and search tries anton@mips.complang.tuwien.ac.at (2006-01-31)
Re: symbol tables and search tries RLake@oxfam.org.uk (2006-01-31)
Re: symbol tables and search tries vmakarov@redhat.com (Vladimir Makarov) (2006-01-31)
Re: symbol tables and search tries clint@0lsen.net (Clint Olsen) (2006-01-31)
Re: symbol tables and search tries haberg@math.su.se (2006-01-31)
Re: symbol tables and search tries henry@spsystems.net (2006-01-31)
Re: symbol tables and search tries find@my.address.elsewhere (Matthias Blume) (2006-01-31)
Re: symbol tables and search tries danwang74@gmail.com (Daniel C. Wang) (2006-01-31)
Re: symbol tables and search tries mailbox@dmitry-kazakov.de (Dmitry A. Kazakov) (2006-01-31)
Re: symbol tables and search tries DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2006-01-31)
Re: symbol tables and search tries henry@spsystems.net (2006-01-31)
[12 later articles]
| List of all articles for this month |

From: haberg@math.su.se (Hans Aberg)
Newsgroups: comp.compilers
Date: 31 Jan 2006 00:37:22 -0500
Organization: Mathematics
References: 06-01-085 06-01-111 06-01-117
Keywords: symbols, comment
Posted-Date: 31 Jan 2006 00:37:22 EST

John Levine wrote:


> I make my hash tables big enough that the chains are short, i.e.,
> design so that k>=N so it's O(1). Hash headers need only be one word
> pointers, so pick a number larger than the number of symbols you
> expect in a large program, say, 10,000, and use that. So it adds 40K
> to the size of the program, that's down in the noise these days.


Even though one in a practical application can choose k large enough,
hoping that somebody will not choose a larger N, from the theoretical
point, when computing the complexity order, N is not limited. So, the
adjustment of k must be taken into account when computing the
complexity order. Then, by doubling k, one gets down to logarithmic
time. I also think one cannot do any better, because the best sorting
algorithm, over all choices is O(N log N), so if one has better
inserting complexity in a container than O(log N), one can beat that
by merely insert elements one by one.


But I am not interested in the best complexity, but to figure out why
a hash map apparently has better value to compiler writers than a
balanced tree. This could be, an email I got suggests, that a compiler
perhaps uses a lot of lookups, and relatively few inserts. Then, the
lookup time on hash table proportional to the average depth of the
linked lists, so if that is close to 1, one probably can get very fast
lookups.


--
    Hans Aberg
[In the compilers I use, there's an insert the first time you see a
symbol in the source and a lookup every time you see it subsequently.
Unless you have a lot of symbols you only use once, I'd expect lookups
to dominate. -John]


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.