Re: Leftmost longest match with DFA search

Danny.Dube@ift.remove.ulaval.remove.ca (=?iso-8859-1?q?Danny_Dub=E9?=)
15 May 2008 11:58:45 -0400

          From comp.compilers

Related articles
Leftmost longest match with DFA search monnier@iro.umontreal.ca (Stefan Monnier) (2008-05-10)
Re: Leftmost longest match with DFA search daniel2villeneuve@videotron.ca (Daniel Villeneuve) (2008-05-11)
Re: Leftmost longest match with DFA search rsc@swtch.com (Russ Cox) (2008-05-12)
Re: Leftmost longest match with DFA search monnier@iro.umontreal.ca (Stefan Monnier) (2008-05-13)
Re: Leftmost longest match with DFA search monnier@iro.umontreal.ca (Stefan Monnier) (2008-05-13)
Re: Leftmost longest match with DFA search Danny.Dube@ift.ulaval.ca (2008-05-13)
Re: Leftmost longest match with DFA search rsc@swtch.com (Russ Cox) (2008-05-14)
Re: Leftmost longest match with DFA search Danny.Dube@ift.remove.ulaval.remove.ca (2008-05-15)
| List of all articles for this month |

From: Danny.Dube@ift.remove.ulaval.remove.ca (=?iso-8859-1?q?Danny_Dub=E9?=)
Newsgroups: comp.compilers
Date: 15 May 2008 11:58:45 -0400
Organization: Compilers Central
References: 08-05-026 08-05-029 08-05-046
Keywords: lex, DFA
Posted-Date: 15 May 2008 20:13:09 EDT

Stefan Monnier <monnier@iro.umontreal.ca> writes:


[snip]
> >> The "obvious" solution of turning the problem "search for RE" into the
> >> problem "match .*RE" (where I use "match" here to mean "anchored
> >> search") only gives you the leftmost shortest match.
> > [snip]
> >> Stefan
>
> > I've used the approach to compile a DFA for the reverse RE, say ER, and
> > first match .*ER on the reverse text to find the leftmost anchor point.
> > Then match RE from that point to find the longest span.
>
> Interesting. But doesn't it basically force you to scan the complete text?
> That can be impractical.


You don't (always) need to scan the complete text. You can perform
that reversed search on increasingly longer prefixes of the text. If
you increase the lengths of the prefixes by a constant factor, you
work for a time in O(K) to find a leftmost match when the latter
completely lies in the first K characters.


Danny



Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.