Re: LEX behavior when given "large" automata. haddock!uunet!uiucdcs!pur-ee!hankd (1988-03-22)
Re: LEX behavior when given "large" automata. harvard!rutgers!mandrill.cwru.edu!chet@BBN.COM (Chet Ramey) (1988-03-23)
Re: LEX behavior when given "large" automata. beres@cadnetix.COM (1988-03-24)
From: haddock!uunet!uiucdcs!pur-ee!hankd (Hank Dietz)
Summary: Standard Practice... what is it called?
Date: 22 Mar 88 20:20:44 GMT
References: <911@ima.ISC.COM> <914@ima.ISC.COM> <917@ima.ISC.COM>
Organization: Purdue University Engineering Computer Network
In article <917@ima.ISC.COM>, email@example.com (Tony Li) writes:
> In fact, another cute trick is to toss in a simple hashing function.
> Unless you've got lots of keywords, you usually can get away with
> doing only one strcmp.
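The hashing trick in the quoted article can be sketched roughly as follows: hash each lexeme into a small table so that, with few keywords, a lookup costs at most one strcmp. The table size, hash function, and names here are hypothetical, and collisions would need chaining in a real scanner.

```c
/* Sketch of the keyword-hashing trick: with few keywords and a
 * sparse table, a lookup needs at most one strcmp.  All names and
 * the table size are illustrative, not from the original article. */
#include <assert.h>
#include <string.h>

#define NBUCKETS 32

static const char *buckets[NBUCKETS];

static unsigned hash(const char *s)
{
    unsigned h = 0;
    while (*s)
        h = h * 31 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Insert keywords at scanner startup.  A real implementation would
 * chain on collision; a sparse table sidesteps that here. */
static void add_keyword(const char *kw) { buckets[hash(kw)] = kw; }

static int is_keyword(const char *lexeme)
{
    const char *k = buckets[hash(lexeme)];
    return k != NULL && strcmp(lexeme, k) == 0;  /* the one strcmp */
}
```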
I'm very pleased to see many people confirming that what I've
done and told my students to do is reasonably widely accepted
(despite not appearing in any compiler textbook I know of)...
recognizing keywords and identifiers by a single DFA rule and
then using symbol table lookup techniques to determine the
type of the lexeme.
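The scheme described above might look like this in outline: one DFA rule accepts every identifier-shaped lexeme, and a table lookup afterwards decides whether it is a keyword or a plain identifier. The token names and keyword list are hypothetical.

```c
/* Sketch of "atomic" lexical analysis: a single DFA rule matches
 * all identifier-shaped lexemes, then a table probe classifies them.
 * Token names and keywords are illustrative only. */
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum token { TOK_IDENT, TOK_IF, TOK_WHILE, TOK_RETURN };

static const struct { const char *name; enum token tok; } keywords[] = {
    { "if", TOK_IF }, { "while", TOK_WHILE }, { "return", TOK_RETURN },
};

/* The lone DFA rule, [A-Za-z_][A-Za-z0-9_]* in lex notation: copy one
 * identifier-shaped lexeme from *src into buf. */
static void scan_ident(const char **src, char *buf)
{
    while (isalnum((unsigned char)**src) || **src == '_')
        *buf++ = *(*src)++;
    *buf = '\0';
}

/* The "look it up afterwards" step: a keyword-table probe (standing in
 * for a full symbol-table lookup) determines the lexeme's type. */
static enum token classify(const char *lexeme)
{
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i].name) == 0)
            return keywords[i].tok;
    return TOK_IDENT;
}
```

The design point is that the DFA stays tiny regardless of keyword count; adding a keyword changes only the table, not the automaton.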
My question is simply: what is this technique officially
called and does anyone know of a formal reference for it?
Since the Compilers Course Notes I wrote back at Polytechnic
University in 1983, I've been referring to it as "atomic
lexical analysis" because it closely resembles the way in
which Lisp recognizes atoms and then looks 'em up to determine
their type... but that's just my name for it.
[I've never seen it called anything, most likely because it's only recently
that automatic DFA generators have made it possible to do tokenizing any
other way. -John]