# Hashing for Keywords? (was: Minimal Perfect Hashing and Cichelli's algorithm)

## p.froehlich@amc.cube.net (Peter Froehlich)29 Apr 1996 23:31:21 -0400

From comp.compilers

Related articles
Minimal Perfect Hashing and Cichelli's algorithm CSPT@giraffe.ru.ac.za (Pat Terry) (1996-04-10)
Re: Minimal Perfect Hashing and Cichelli's algorithm taj@vanbc.wimsey.com (1996-04-11)
Hashing for Keywords? (was: Minimal Perfect Hashing and Cichelli's alg p.froehlich@amc.cube.net (1996-04-29)
| List of all articles for this month |

 From: p.froehlich@amc.cube.net (Peter Froehlich) Newsgroups: comp.compilers Date: 29 Apr 1996 23:31:21 -0400 Organization: AMIGA CITY - Public Amiga-BBS, Munich, Germany References: 96-04-055 96-04-069 Keywords: symbols

Hi!

> For "large" keyword sets (certainly larger than the keyword sets of most
> programming languages) it may be worthwhile to explore the algorithm
> described in [CHM92] which is a probabilistic method using random graphs
> that works very fast in practice. It also has the benefit of being order
> preserving.

Reading through the mails leading up to the one quoted above I
found myself wondering whether this really is an issue at all when
dealing with keywords.

In fact a lot of Oberon-2 compilers use the following approach
which is *very* simple and apparently not causing a "bottleneck" for
the compiler (Oberon-2 compilers are usually very fast compared to
other compilers, see [1] for an objective comparison):

VAR buffer: ARRAY 256 OF CHAR; token: INTEGER;
(* ... *)
CASE buffer[0] OF
(* ... *)
"A": IF buffer = "ARRAY" THEN
token := array;
ELSE
token := identifier;
END;
|"B": IF buffer = "BEGIN" THEN
token := begin;
ELSIF buffer = "BY" THEN
token := by;
ELSE
token := identifier;
END;
(* ... *)
ELSE
token := identifier;
END;

I can't guarantee for correct syntax here. :-/

I understand the problem for syntax tables where a large number of
varying objects (identifiers) must be expected, but for keywords? Is
it worth the pain to deal with them in a more complex way than the one
outlined above?

----------
[1] Henderson, Zorn: A Comparison of Four Object-Oriented Programming
Languages. Technical Report CU-CS-641-93, University of Colorado at
Boulder, 1993. (availiable through ftp)

By(T)e...
Peter...
--

Post a followup to this message