Related articles:
Question on lex's disambiguating rules andrea@eric.mpr.ca (1990-06-21)
Re: Question on lex's disambiguating rules utoddl@uncecs.edu (1990-06-21)
Re: Question on lex's disambiguating rules vern@cs.cornell.edu (1990-06-21)
Re: Question on lex's disambiguating rules rekers@cwi.nl (1990-06-27)
Newsgroups: comp.compilers
From: andrea@eric.mpr.ca (Jennitta Andrea)
Date: Thu, 21 Jun 90 03:33:49 GMT
Organization: Microtel Pacific Research Ltd., Burnaby, B.C., Canada
Keywords: lex, question
I would like to know if there is any way to override lex's disambiguating
rules, i.e.:
"When more than one expression can match the current input, Lex
chooses as follows:
1) The longest match is preferred.
2) Among rules which matched the same number of
characters, the rule given first is preferred"
(from M.E. Lesk and E. Schmidt, "Lex - A Lexical Analyzer Generator",
Computing Science Technical Report 39, Bell Telephone Laboratories,
Murray Hill, NJ, 1975).
I would like to tell lex to 'commit' to recognizing a
specific regular expression, even though a more general regular
expression would otherwise be selected under rule (1) above.
Specifically, given the following macro definitions:
STRING ([^ \t\n]+)
DIGIT ([0-9])
I have two regular expressions:
{DIGIT}{DIGIT}":"{DIGIT}{DIGIT}":"{DIGIT}{DIGIT} { /* recognize "TIMESTAMP" token */ }
{STRING} { /* recognize STRING token */ }
Because my definition of a "STRING" is so general, the following input
stream:
12:30:49AC
is tokenized into a single STRING token ("12:30:49AC"), rather than into a
TIMESTAMP token ("12:30:49") and a STRING token ("AC").
Any suggestions on this question would be greatly appreciated.
Jennitta Andrea | Voice : (604) 293-5362
[How about trailing context after a slash? -John]
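One possible workaround, offered as a sketch rather than as anything taken from the thread: let the timestamp rule also swallow the rest of the non-blank run, so that it ties with the STRING rule on length and wins under rule (2) by being listed first, then hand the extra characters back with yyless(). The macro name TIME, the printf actions, and the whitespace rule are assumptions added for illustration; STRING and DIGIT are the definitions from the post.

```lex
%{
#include <stdio.h>
%}

STRING  ([^ \t\n]+)
DIGIT   ([0-9])
TIME    ({DIGIT}{DIGIT}":"{DIGIT}{DIGIT}":"{DIGIT}{DIGIT})

%%

{TIME}[^ \t\n]*  {
                   /* Same length as the STRING match, so rule (2)
                      picks this rule because it comes first. */
                   yyless(8);  /* keep "hh:mm:ss", push the rest back */
                   printf("TIMESTAMP(%s)\n", yytext);
                 }
{STRING}         { printf("STRING(%s)\n", yytext); }
[ \t\n]+         { /* skip whitespace between tokens */ }

%%

int yywrap(void) { return 1; }          /* stop at end of input */
int main(void)   { return yylex(); }
```

With this arrangement the input 12:30:49AC should come out as a TIMESTAMP token ("12:30:49") followed by a STRING token ("AC"), because the characters returned by yyless() are rescanned. Note that the timestamp is only recognized at the start of a non-blank run; something like AC12:30:49 is still a single STRING.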