Re: '^' and '$' in Regular expression

"mailings@jmksf.com" <mailings@jmksf.com>
Thu, 22 Apr 2010 08:43:57 +0200

          From comp.compilers


From: "mailings@jmksf.com" <mailings@jmksf.com>
Newsgroups: comp.compilers
Date: Thu, 22 Apr 2010 08:43:57 +0200
Organization: Compilers Central
References: 10-04-052
Keywords: lex, DFA
Posted-Date: 22 Apr 2010 09:20:25 EDT

Hello Tangel,


you have to check the anchors right after you have successfully matched
the regular expression. The anchors are not part of the NFA state machine
itself. Instead, you attach the parsed anchor configuration
(BEGIN-OF-LINE, END-OF-LINE, and, to be GNU-compliant, a few more) to the
accepting state of the NFA. Right after your lib has run the NFA (or DFA)
successfully, you check whether the anchors hold as well. For a
BEGIN-OF-LINE anchor, that means looking just in front of the match:
either the match starts at the beginning of the file, or the preceding
character is a line break. END-OF-LINE is the mirror image, checked just
behind the match.
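
To make the idea concrete, here is a minimal sketch in Python. It is not
taken from Tangel's lib: the flag names, anchors_hold(), search() and the
literal matcher standing in for the real NFA/DFA run are all hypothetical,
and only '\n' is treated as a line break.

```python
# Anchor flags stored with the accepting state (hypothetical names).
ANCHOR_NONE = 0
ANCHOR_BOL = 1   # '^' was parsed at the start of the pattern
ANCHOR_EOL = 2   # '$' was parsed at the end of the pattern


def anchors_hold(text, start, end, anchors):
    """Check the anchor flags for a match covering text[start:end]."""
    if anchors & ANCHOR_BOL:
        # Start of input, or the preceding character is a line break.
        if start != 0 and text[start - 1] != "\n":
            return False
    if anchors & ANCHOR_EOL:
        # End of input, or the following character is a line break.
        if end != len(text) and text[end] != "\n":
            return False
    return True


def match_literal(text, start, literal):
    """Stand-in for running the NFA/DFA: returns the end offset or None."""
    if text.startswith(literal, start):
        return start + len(literal)
    return None


def search(text, literal, anchors):
    """Return (start, end) of the first match whose anchors also hold."""
    for start in range(len(text) + 1):
        end = match_literal(text, start, literal)
        if end is not None and anchors_hold(text, start, end, anchors):
            return (start, end)
    return None
```

With this, search("foo\nbar", "bar", ANCHOR_BOL) accepts the match at
offset 4, while search("xbar", "bar", ANCHOR_BOL) rejects it. The sketch
only tests one end position per start; a real matcher that prefers the
longest match may have to fall back to a shorter accepting position when
the END-OF-LINE check fails.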


I hope this will help you.


-- Jan




On 04/21/2010 09:48 AM, Tangel wrote:
> Dear All:
> I am writing a regexp lib which can be accessed on
> http://code.google.com/p/snev/source/browse/#svn/trunk/tlab/regexp/new.
> It uses the theory from the Dragon Book: regexp->NFA->DFA. But when I
> try to match '^' (begin of line) and '$' (end of line), some problems
> occur.
> All symbols other than ^ and $ are 'real' characters; they can be
> the weights of the edges, but ^ and $ cannot.
> I tried to match ^ and $ as '\n' and to add '\n' to both ends of the
> text, but that does not seem to be a perfect solution.

