Related articles:
Question on lex's disambiguating rules andrea@eric.mpr.ca (1990-06-21)
Re: Question on lex's disambiguating rules utoddl@uncecs.edu (1990-06-21)
Re: Question on lex's disambiguating rules vern@cs.cornell.edu (1990-06-21)
Re: Question on lex's disambiguating rules rekers@cwi.nl (1990-06-27)
Newsgroups: comp.compilers
From: vern@cs.cornell.edu (Vern Paxson)
References: <1990Jun21.033349.2983@esegue.segue.boston.ma.us>
Date: Thu, 21 Jun 90 18:44:46 GMT
Organization: Cornell Univ. CS Dept, Ithaca NY
Keywords: lex, question
In article <1990Jun21.033349.2983@esegue.segue.boston.ma.us> andrea@eric.mpr.ca (Jennitta Andrea) writes:
> I would like to know if there is any way to override lex's disambiguating
> rules ...
> STRING ([^ \t\n]+)
> DIGIT ([0-9])
>
> I have two regular expressions:
>
> {D}{D}":"{D}{D}":"{D}{D} { /* recognize "TIMESTAMP" token */ }
>
> {STRING} { /* recognize STRING token */ }
>
> Because my definition of a "STRING" is so general, the following input
> stream:
>
> 12:30:49AC
>
> is tokenized into a single STRING token ("12:30:49AC"), rather than into a
> TIMESTAMP token ("12:30:49") and a STRING token ("AC").
>
> Any suggestions on this question would be greatly appreciated.
and John adds:
> [How about trailing context after a slash? -John]
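Yes. With both lex and flex, the text matched by trailing context is counted
when determining the longest match, even though it is not returned as part of
the token. On the input above, a timestamp rule with trailing context therefore
ties with {STRING} and wins because it is listed first, so a rule like

    {D}{D}":"{D}{D}":"{D}{D}/{STRING} { /* recognize "TIMESTAMP" token */ }
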
will do the trick. To be completely correct, one also needs to handle a
timestamp that is *not* followed by a string, since the timestamp may be
followed by whitespace (or an end-of-file). To do this, make the trailing
context optional:

    {D}{D}":"{D}{D}":"{D}{D}/({STRING}?) { /* recognize "TIMESTAMP" token */ }
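
For anyone who wants to try this, here is a minimal sketch of a complete
flex input built around the rule above. The printf actions, the whitespace
rule, main(), and the flex-specific "%option noyywrap" are assumptions added
for illustration and are not from the original question; the digit
definition is named D so that it matches the rules as written.

```lex
%{
/* Sketch only: the actions, the whitespace rule and main() are
   illustrative additions, not part of the original question. */
#include <stdio.h>
%}

%option noyywrap

D       ([0-9])
STRING  ([^ \t\n]+)

%%

{D}{D}":"{D}{D}":"{D}{D}/({STRING}?)   { printf("TIMESTAMP(%s)\n", yytext); }
{STRING}                               { printf("STRING(%s)\n", yytext); }
[ \t\n]+                               { /* skip whitespace between tokens */ }

%%

int main(void)
{
    /* Scan standard input until end-of-file. */
    yylex();
    return 0;
}
```

Built with something like "flex ts.l && cc lex.yy.c" (the file name is
hypothetical), the input 12:30:49AC should come out as TIMESTAMP(12:30:49)
followed by STRING(AC). The timestamp rule has to appear before the {STRING}
rule, since it only wins the tie on length by being listed first.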
Vern
Vern Paxson vern@cs.cornell.edu
Computer Science Dept. decvax!cornell!vern
Cornell University