Related articles |
---|
Re: Wanted: Regular Expression -> Finite Automata C code =- tdh6t@helga3.acc.virginia.edu (Todd Hodes) (1993-04-05) |
Re: Wanted: Regular Expression -> Finite Automata C code =- megatest!plethorax!djones@uu2.psi.com (1993-04-09) |
Re: Wanted: Regular Expression -> Finite Automata C code =- henry@zoo.toronto.edu (1993-04-12) |
Re: Wanted: Regular Expression -> Finite Automata C code =- tdh6t@onyx.cs.virginia.edu (Todd Hodes) (1993-04-14) |
Newsgroups: | comp.compilers |
From: | henry@zoo.toronto.edu (Henry Spencer) |
Keywords: | DFA, lex |
Organization: | U of Toronto Zoology |
References: | 93-04-023 93-04-041 |
Date: | Mon, 12 Apr 1993 19:50:34 GMT |
megatest!plethorax!djones@uu2.psi.com (Dave Jones) writes:
>Also, every time a final state is encountered, you will want to remember
>the number of characters that have been processed at that point. The last
>number remembered will be the length of the longest match.
Unfortunately, this doesn't generalize to most real-life applications,
where the match is not anchored at the start of the string. It's easy to
generalize the matching algorithm so you still make only one pass, by
considering the possibility of a fresh start at each character, and this
is definitely the way to do it -- you don't want to retry the match at
each possible starting position! Alas, the bookkeeping for longest match
falls apart. You don't know when the particular match that corresponds to
a new final state started. Consider the RE "(abcde|cd|ef)" and the string
"xabcdefy". You reach a final state after seeing the "d", another after
the "e", and another after the "f", but the last one is not the longest.
--
Henry Spencer @ U of Toronto Zoology, henry@zoo.toronto.edu utzoo!henry
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.