Newsgroups: comp.compilers
From: rekers@cwi.nl (Jan Rekers)
Keywords: lex, question
Organization: CWI, Amsterdam
References: <1990Jun21.033349.2983@esegue.segue.boston.ma.us>
Date: Wed, 27 Jun 90 16:07:39 GMT
In article <1990Jun21.033349.2983@esegue.segue.boston.ma.us>,
andrea@eric.mpr.ca (Jennitta Andrea) writes:
|>I have two regular expressions:
|>
|>{D}{D}":"{D}{D}":"{D}{D} { /* recognize "TIMESTAMP" token */ }
|>{STRING} { /* recognize STRING token */ }
|>
|>Because my definition of a "STRING" is so general, the following input
|>stream:
|>
|> 12:30:49AC
|>
|>is tokenized into a single STRING token ("12:30:49AC"), rather than into a
|>TIMESTAMP token ("12:30:49") and a STRING token ("AC").
The most general solution to this problem would be to allow multiple lexical
channels, feeding them to a parser that can split itself into several parsers
(as the Tomita algorithm can, for example).
On input 12:30:49AC the lexer returns two token streams:
(timestamp: 12:30:49)
(string: 12:30:49AC)
The parser splits into one parser per possibility, and each of these gets
its own lexical channel. The next tokens in the channels decide which
of the parsers wins.
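
To make the idea concrete, here is a rough Python sketch of it. It is not the
Tomita algorithm: it enumerates the channels by simple backtracking rather than
running parallel parsers over a shared stack. The {STRING} definition is not
given in the original question, so a run of non-whitespace characters is
assumed, and the "grammar" in accepts() (a timestamp followed by a string) is
purely hypothetical, as are all the function names.

```python
import re

# Assumed token definitions. {STRING} is not defined in the original
# question, so "any run of non-whitespace characters" is a guess.
TOKEN_PATTERNS = [
    ("timestamp", re.compile(r"\d\d:\d\d:\d\d")),
    ("string", re.compile(r"\S+")),
]


def candidates(text, pos):
    """Return every (kind, lexeme) pair that some pattern matches at pos."""
    found = []
    for kind, pattern in TOKEN_PATTERNS:
        m = pattern.match(text, pos)
        if m:
            found.append((kind, m.group()))
    return found


def channels(text, pos=0):
    """Yield each complete tokenization of text as a list of tokens.

    Every branch of the recursion stands in for one lexical channel.
    A branch that reaches a position where no pattern matches yields
    nothing, so that channel simply dies.
    """
    # Skip whitespace between tokens.
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if pos == len(text):
        yield []
        return
    for kind, lexeme in candidates(text, pos):
        for rest in channels(text, pos + len(lexeme)):
            yield [(kind, lexeme)] + rest


def accepts(tokens):
    """Hypothetical grammar: the input is a timestamp followed by a string."""
    return [kind for kind, _ in tokens] == ["timestamp", "string"]


def parse(text):
    """Keep only the channels that the (toy) parser accepts."""
    return [tokens for tokens in channels(text) if accepts(tokens)]


if __name__ == "__main__":
    for tokens in channels("12:30:49AC"):
        print(tokens)
    print(parse("12:30:49AC"))
```

On the input 12:30:49AC, channels() produces both streams shown above (the
timestamp channel continues with the string "AC"), and parse() keeps only the
timestamp-then-string channel, because that is the only one the toy grammar
accepts. A real implementation would share work between the channels instead
of re-lexing each one from scratch.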
This is a very general (and inefficient) solution, which can also be used to
handle lexing and parsing for FORTRAN quite neatly.
We are considering implementing the above solution; if anybody knows more
about it, please let us know...
Jan Rekers (rekers@cwi.nl) Centre for Mathematics and Computer Science
P.O. Box 4079, 1009 AB Amsterdam, The Netherlands