From: "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de>
Newsgroups: comp.compilers
Date: Sun, 18 May 2008 10:29:59 +0200
Organization: cbb software GmbH
References: 08-05-050 08-05-066 08-05-069
Keywords: lex
Posted-Date: 19 May 2008 21:26:18 EDT
On Sat, 17 May 2008 10:22:34 +0200, Hans-Peter Diettrich wrote:
> Dmitry A. Kazakov wrote:
>
>> When I do similar stuff, I do it in a way that the parser returned
>> typed objects rather than copies of the source. The whole idea to
>> copy the source is bogus, IMO.
>
> Indeed, textual copies are of little use. Can you suggest a
> descriptive formalism for the objects returned by a lexer?
Not with a bottom-up approach. But when the parser works top-down, or
else somewhere in the middle, it knows well what to expect at the
cursor. At the top it knows the exact type, so parsing either fails
or yields a token of that type. Below that it knows only some set of
types, i.e. in OO terms, a class of types. In this case the returned
token would be a polymorphic object from that class (or else a
failure). The class could be something like "infix operation",
"literal", etc. In fact, this is merely the abstract factory pattern.
The parser acts as a factory: the parsed source at the cursor
determines the concrete token type and then its value.
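
A rough sketch of the factory idea, in Python for brevity. Every name
here (Token, Literal, InfixOperation, RECOGNIZERS, get_token) is
hypothetical and chosen only for illustration, and the regular
expressions stand in for whatever recognizers a real parser would use.
The caller passes the class of tokens it expects at the cursor; the
source text then selects the concrete type and its value, or the call
fails.

```python
import re
from dataclasses import dataclass


class Token:
    """Root of the (hypothetical) token hierarchy."""


class Literal(Token):
    """Class of all literal tokens."""


class InfixOperation(Token):
    """Class of all infix operation tokens."""


@dataclass
class IntegerLiteral(Literal):
    value: int


@dataclass
class StringLiteral(Literal):
    value: str


@dataclass
class Plus(InfixOperation):
    pass


@dataclass
class Times(InfixOperation):
    pass


# One recognizer per concrete token type: (type, pattern, constructor).
# The patterns are deliberately simplistic assumptions.
RECOGNIZERS = [
    (IntegerLiteral, re.compile(r"\d+"), lambda s: IntegerLiteral(int(s))),
    (StringLiteral, re.compile(r'"[^"]*"'), lambda s: StringLiteral(s[1:-1])),
    (Plus, re.compile(r"\+"), lambda s: Plus()),
    (Times, re.compile(r"\*"), lambda s: Times()),
]


def get_token(source, cursor, expected):
    """Try to read a token of class `expected` at `cursor`.

    Returns (token, new_cursor) on success and None on failure.
    Only concrete types belonging to `expected` are tried, so the
    source never has to be copied or classified blindly.
    """
    for concrete, pattern, build in RECOGNIZERS:
        if not issubclass(concrete, expected):
            continue
        match = pattern.match(source, cursor)
        if match:
            return build(match.group()), match.end()
    return None
```

With the source "12+3", asking for a Literal at position 0 yields an
IntegerLiteral with value 12, asking for an InfixOperation at position
2 yields a Plus, and asking for an InfixOperation at position 0 fails.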
I think this could be formalized. One premise is that the set of
tokens forms a tree/forest-like hierarchy, which is, I believe, almost
always the case.
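
To make the tree premise concrete, here is a small Python sketch with
a hypothetical hierarchy (the class names are illustrative
assumptions, not part of any real parser). It checks that the
hierarchy below a given class is a tree, i.e. that every type has
exactly one direct parent, and lists the concrete types the parser
could still yield when it only knows an inner node.

```python
class Token: pass
class Literal(Token): pass
class IntegerLiteral(Literal): pass
class StringLiteral(Literal): pass
class InfixOperation(Token): pass
class Plus(InfixOperation): pass
class Times(InfixOperation): pass


def is_tree(cls):
    """True if every type below cls has exactly one direct base."""
    return all(
        len(sub.__bases__) == 1 and is_tree(sub)
        for sub in cls.__subclasses__()
    )


def concrete_types(cls):
    """Leaf types below cls: the candidates at the cursor."""
    subs = cls.__subclasses__()
    if not subs:
        return [cls]
    leaves = []
    for sub in subs:
        leaves.extend(concrete_types(sub))
    return leaves
```

Knowing the exact type corresponds to a leaf, where concrete_types
returns just that type; knowing only "literal" corresponds to an inner
node with several candidates.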
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de