Re: Keyword implementation.

"Matthew J. Lockner" <lockner@chaos.cns.uni.edu>
1 Jul 2001 23:44:12 -0400

From comp.compilers

Related articles
Keyword implemetation. tfjellstrom@home.com (Thomas Fjellstrom) (2001-06-28)
*Re: Keyword implementation. lockner@chaos.cns.uni.edu (Matthew J. Lockner)* (2001-07-01)**
Re: Keyword implementation. acolvin@bbn.com (Alex Colvin) (2001-07-02)

| List of all articles for this month |

From:	"Matthew J. Lockner" <lockner@chaos.cns.uni.edu>
Newsgroups:	comp.compilers
Date:	1 Jul 2001 23:44:12 -0400
Organization:	University of Northern Iowa
References:	01-06-079
Keywords:	lex, comment
Posted-Date:	01 Jul 2001 23:44:12 EDT

If you're building keyword recognition into the lexer, then the
approach I prefer is to use GNU gperf
(http://www.gnu.org/software/gperf/gperf.html). Its advantages are
manyfold, but I enjoy the laziness of giving the tool a list of
keywords and having it spit out a testing function (identifier
is/isn't in the set).

gperf builds a "minimal perfect" hash function for you. The
documentation has considerable digression on exactly what this means,
but suffice it to say it is a very fast hash function. Of course, the
usual phrase "mileage may vary" most likely applies, and I think code
profiling would be the only good way to reveal whether bsearch can do
better for a given keyword set, etc. I would guess that gperf will
still be less work, though.

My two cents, anyway.

M. Lockner

Thomas Fjellstrom wrote:

> In a compiler/translator I've been writing I implemented
> the Reserved/Key words with a sorted table and searched
> using bsearch. I was wondering what every ones preference
> was for implementing keywords? using the symbol table,
> a method just like mine or something else all togeter?
[We discussed minimal perfect hashing at great length some years ago, and
I don't recall that perfect hash functions were all that fast. It seems
to me that on average a separate hash table for keywords will lose compared
to a single combined hash-indexed symbol table, since every non-keyword is
looked up twice with two hashes. -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Keyword implementation.

"Matthew J. Lockner" <lockner@chaos.cns.uni.edu>1 Jul 2001 23:44:12 -0400

"Matthew J. Lockner" <lockner@chaos.cns.uni.edu>
1 Jul 2001 23:44:12 -0400