Re: '^' and '$' in Regular expression

"Armel" <>
Thu, 22 Apr 2010 17:11:33 +0200

          From comp.compilers

Related articles
'^' and '$' in Regular expression (Tangel) (2010-04-21)
Re: '^' and '$' in Regular expression (Chris F Clark) (2010-04-21)
Re: '^' and '$' in Regular expression (Tangel) (2010-04-21)
Re: '^' and '$' in Regular expression ( (2010-04-22)
Re: '^' and '$' in Regular expression (Armel) (2010-04-22)
Re: '^' and '$' in Regular expression (Chris F Clark) (2010-04-22)
Re: '^' and '$' in Regular expression (Quinn Tyler Jackson) (2010-04-22)
Re: '^' and '$' in Regular expression (Chris F Clark) (2010-04-22)
| List of all articles for this month |

From: "Armel" <>
Newsgroups: comp.compilers
Date: Thu, 22 Apr 2010 17:11:33 +0200
Organization: Compilers Central
References: 10-04-052
Keywords: lex, DFA
Posted-Date: 22 Apr 2010 12:52:56 EDT

"Tangel" <> a icrit dans le message de news:
> I am writing a regexp lib which can be accessed on
> It use the theory in dragon book, regexp->NFA->DFA. but when I try to
> match the '^'(begin of line) and '$'(end of line), some problems
> occurs.
> Other symbols except ^ and $ are 'really' characters, they can be
> the weights of the edges, but ^ and $ are not.
> And I try to match ^ and $ as '\n', and add '\n' to the ends of the
> text, but it seems not a prefect solution.

It cannot work that way because the DFA is expected to contain only _real_
characters, not constraints. for example, you won't be able either to make
sub expressions or positive/negative look-ahead either. Sub expressions in
bare DFA are possible only in a subset of the cases that you generally want
to represent in regular expressions.
for the sub-expressions, DFA cannot express a proper limit between what is
inside and what is outside in general case (it is not possible to put
'go-in/go-out' labels on transitions or states because they can mix RE-nodes
inside and outside, in particular because of nodes jointure based only on
what can follow)
for the constraints, you cannot because a bare DFA as explained in Dragon
book cannot match something without progressing in the text: one transition
= one advance.

you might be able to code contraints though by using attributed labels (e.g.
'^a' would a different transition then 'a'), you then would have to define
carefully if you can and how to join transitions not sharing same
attributes, in addition, you might use go-in/go-out notations on the
first/end transitions on constraints by enforcing your user to use only
non-empty, non-nesting constraints and keeping a memory were last opened
constraint started.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.