Re: Hash specifics

roth@resi.rice.edu (Jerry Roth)
Fri, 14 Dec 90 16:44:15 GMT

          From comp.compilers

Related articles
[4 earlier articles]
Re: Hash specifics henry@zoo.toronto.edu (1990-12-12)
Re: Hash specifics baxter@zola.ICS.UCI.EDU (Ira Baxter) (1990-12-13)
Re: Hash specifics lupine!rfg@uunet.UU.NET (1990-12-13)
Re: Hash specifics rfg@ncd.com (1990-12-13)
Re: Hash specifics rfg@ncd.com (1990-12-13)
Re: Hash specifics bart@cs.uoregon.edu (1990-12-13)
Re: Hash specifics roth@resi.rice.edu (1990-12-14)
Re: Hash specifics oz@nexus.yorku.ca (1990-12-14)
Re: Hash specifics plx!plxsun!evan@Sun.COM (1990-12-15)
Re: Hash specifics vern@horse.ee.lbl.gov (1990-12-16)
Re: Hash specifics schmidt@oberkampf.ICS.UCI.EDU (Doug Schmidt) (1990-12-17)
Re: Hash specifics preston@ariel.rice.edu (1990-12-17)
Re: Hash specifics brain@eos.ncsu.edu (1990-12-18)
[1 later articles]
Newsgroups: comp.compilers
From: roth@resi.rice.edu (Jerry Roth)
Keywords: design, lex, scanner
Organization: Rice University, Houston
References: <1990Dec12.221425.569@zoo.toronto.edu> <9012122107.aa06454@ICS.UCI.EDU> <1990Dec13.201211.6483@cs.uoregon.edu>
Date: Fri, 14 Dec 90 16:44:15 GMT

In article <1990Dec13.201211.6483@cs.uoregon.edu> bart@cs.uoregon.edu (Barton Christopher Massey) writes:
>You can very efficiently decide whether the next input belongs to a given
>set of keywords or is an identifier as you read it in using a DFA or some
>similar recognizer. Given the wide availability of FLEX, this is the
>approach I almost always use -- dodges both the problems of efficient
>keyword hashing and separating identifiers from keywords.


I have always wondered which method is truly more efficient. The two
methods are:


(a) Preload all keywords into the hash table and have the lexer distinguish
identifiers from keywords via lookup. Keywords and identifiers are then
treated the same until after the lookup.


(b) Add a regular expression for each keyword to the lexer, followed by a
regular expression to catch identifiers. Keywords are then handled directly,
while only the identifiers get hashed.
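
As a concrete illustration of option (b), a flex specification might look
roughly like the sketch below. Everything in it is made up for the sake of
the example: only three keywords are shown, the token codes are arbitrary,
and intern() stands in for whatever symbol-table routine the compiler
actually uses (in a real compiler the token codes would normally come from
the parser, e.g. y.tab.h).

%{
/* Hypothetical token codes and symbol-table routine, for illustration only. */
#define IDENT   257
#define IF      258
#define WHILE   259
#define RETURN  260
extern void intern(const char *name);   /* hypothetical: record identifier */
%}

%%
"if"                     return IF;
"while"                  return WHILE;
"return"                 return RETURN;
[A-Za-z_][A-Za-z0-9_]*   { intern(yytext); return IDENT; }
%%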


It seems method (a) has the advantage of keeping the lexer simple, in that
it has fewer expressions to attempt to match, while method (b) avoids
hashing keywords.
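
Under option (a), by contrast, the rules section might collapse to a single
identifier rule plus a table lookup, along these lines. Again this is only
a sketch: lookup_or_install() is a made-up name for a routine that consults
a hash table preloaded with every keyword and its token code before yylex()
is first called.

%{
/* Hypothetical: returns the keyword's token code, or IDENT after entering
   a new identifier in the (keyword-preloaded) hash table. */
extern int lookup_or_install(const char *name);
%}

%%
[A-Za-z_][A-Za-z0-9_]*   return lookup_or_install(yytext);
%%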


Can anyone tell me whether adding more expressions to my lexer (one per
keyword) causes it to execute less efficiently, assuming that I am using
LEX/FLEX to create it? If so, is it less efficient than hashing each
keyword when encountered?


Thanks,
Jerry Roth
[It is my impression that the speed at which a flex scanner runs is
unrelated to the complexity of the state machine, although its size goes
up, but I'd be glad to hear from someone who knows definitively. -John]
--

