# Re: Looking for symbol table hashing algorithm

## Paul Dietz <dietz@interaccess.com>

5 Jul 1998 21:33:01 -0400

*From comp.compilers*

| List of all articles for this month |

**From: ** | Paul Dietz <dietz@interaccess.com> |

**Newsgroups: ** | comp.compilers |

**Date: ** | 5 Jul 1998 21:33:01 -0400 |

**Organization: ** | The World's Usenet -- http://www.Supernews.com |

**References: ** | 98-07-030 98-07-045 |

**Keywords: ** | symbols |

*> Warwick Irwin wrote:*

*> >*

*> > Can anyone point me to a reference for an industrial-strength hashing*

*> > algorithm suitable for implementing a symbol table in a C++ parser?*

*> > I would like something that has been tuned for large C++ programs.*

David Chase wrote:

*> Do you want a hash function, or a hashing algorithm?*

*>*

*> There was an article, I think in CACM, that spoke well of a*

*> table-based algorithm. Simply fill an array of 256 integers with*

*> random numbers, and use an algorithm along the lines of:*

[deleted]

He might want to focus more on algorithms that handle several bytes at

a time. Arrange the identifiers to be long word aligned (and,

preferably, aligned with the start of a cache line) and padded with

null bytes to an integral number of long words.

A hash function like

h(x_1 ... x_n) = (a_1 x_1 + ... + a_n x _n) mod m

(where the a_i are randomly chosen constants and x_i are the long

words of the identifier) might be efficiently computed if

multiplication were fast or were pipelined.

Paul

[Is the hash really likely to be a hot spot in the compiler? Also, it is

my distinct impression that most identifiers are pretty short in languages

like C, C++, and Fortran. -John]

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.