Re: Precedence Rules for '$' and '^'

Joachim Durchholz <jo@durchholz.org>
Sat, 15 Sep 2007 22:41:25 +0200

From comp.compilers


From:	Joachim Durchholz <jo@durchholz.org>
Newsgroups:	comp.compilers
Date:	Sat, 15 Sep 2007 22:41:25 +0200
Organization:	1&1 Internet AG
References:	07-09-035 07-09-037 07-09-048 07-09-052
Keywords:	lex, design
Posted-Date:	16 Sep 2007 15:24:52 EDT

jamin.hanson@googlemail.com wrote:
> However, the real question is that if you allow '^' and '$' to occur
> anywhere in a regex (boost::regex works this way),

I may be missing something, but it seems to me that such a rule
wouldn't match anything if it has a nonempty pattern before the ^ or
after the $.

I.e. asd^jkl, while a perfectly valid regex, will never match, or will it?
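
A quick way to probe that question is to try it in an engine that accepts anchors anywhere. The sketch below uses Python's `re` module as a stand-in (an assumption: boost::regex or a lexer generator may behave differently), and the sample strings are made up for illustration.

```python
import re

# Anchors in the middle of a pattern, with literal text on the "wrong" side.
# MULTILINE is the most permissive setting: ^ and $ then match at every line
# boundary, not just at the ends of the input.
caret_mid = re.compile(r"asd^jkl", re.MULTILINE)
dollar_mid = re.compile(r"asd$jkl", re.MULTILINE)

# Hypothetical inputs, including ones with embedded newlines.
samples = ["asdjkl", "asd\njkl", "asd\n\njkl", "asd^jkl", "asd$jkl"]

caret_hits = [s for s in samples if caret_mid.search(s)]
dollar_hits = [s for s in samples if dollar_mid.search(s)]
print(caret_hits, dollar_hits)

# For contrast: the anchor can succeed mid-pattern once the newline itself
# is matched explicitly, because ^ then sits right after a line break.
newline_then_caret = re.compile(r"asd\n^jkl", re.MULTILINE)
print(bool(newline_then_caret.search("asd\njkl")))
```

None of the samples match the first two patterns: the position right after the `d` is never the start of a line, and the position right before the `j` is never the end of one.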

> how you handle '^'
> and '$' clashes, because you may have declared a '$' rule before a '^'
> rule, yet my code always checks '^' before '$' regardless.

An example would help. Rule order definitely doesn't affect lexing,
at least not unless you provide mechanisms that go beyond regular
expressions.

> As you have to check both possibilities on lookup (otherwise how can
> you ever match them ;-) ), the right thing to do appears to be to
> suppress the '^' if it occurs at a position in the rules that a '$'
> has already occurred at.

In my book, the Right Thing would be to spit out a warning somewhere.

> Note that using PERL rules is not the answer, as lexers use left-most
> longest and compile to a DFA. PERL uses leftmost precedence and uses
> NFA.

I'm not sure how "leftmost longest" and "leftmost precedence" relate.
DFA and NFA are certainly not at odds, since they are equivalent: both
recognize exactly the regular languages.
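
The equivalence can be made concrete with the textbook subset construction. The following is a minimal sketch, not any particular lexer generator's implementation: the NFA for `(a|b)*ab`, its state numbers, and all function names are invented for illustration, and epsilon moves are left out to keep it short.

```python
from collections import deque
from itertools import product

# Hypothetical NFA for (a|b)*ab: state 0 is the start state, state 2 accepts.
# Missing entries mean "no transition".
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}
NFA_START, NFA_ACCEPT = 0, {2}
ALPHABET = "ab"

def nfa_accepts(text):
    """Simulate the NFA directly by tracking the set of live states."""
    current = {NFA_START}
    for ch in text:
        current = {t for s in current for t in NFA.get((s, ch), ())}
    return bool(current & NFA_ACCEPT)

def determinize():
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start = frozenset({NFA_START})
    table, seen, todo = {}, {start}, deque([start])
    while todo:
        state = todo.popleft()
        for ch in ALPHABET:
            target = frozenset(t for s in state for t in NFA.get((s, ch), ()))
            table[(state, ch)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return start, table

DFA_START, DFA = determinize()

def dfa_accepts(text):
    """Run the DFA: exactly one table lookup per input character."""
    state = DFA_START
    for ch in text:
        state = DFA[(state, ch)]
    return bool(state & NFA_ACCEPT)

# Compare both automata on every string over {a, b} up to length 5.
words = ["".join(p) for n in range(6) for p in product(ALPHABET, repeat=n)]
agree = all(nfa_accepts(w) == dfa_accepts(w) for w in words)
print(agree)
```

Both automata accept the same strings; the DFA just trades a larger state table for a single lookup per character.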

I'd still look at Perl whenever I'm unsure, simply because that's how
most people expect regexes to work. That doesn't mean you have to do
everything as Perl does, particularly if you have good reasons to do
otherwise :-)
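
For reference, the practical difference between the two disambiguation rules from the quoted text shows up with overlapping alternatives. This is a small sketch using Python's `re`, which (like Perl) tries alternatives in order. The helper names are hypothetical, the alternatives are treated as plain literal strings, and the "longest" variant only emulates what a lex-style scanner would pick.

```python
import re

def leftmost_first(literals, text):
    """Perl-style: alternatives are tried in order; the first one that
    matches at the current position wins, even if a later one is longer."""
    pattern = "|".join(re.escape(lit) for lit in literals)
    m = re.match(pattern, text)
    return m.group(0) if m else None

def leftmost_longest(literals, text):
    """lex-style emulation: try every alternative at the current position
    and keep the longest match; ties go to the earlier alternative."""
    best = None
    for lit in literals:
        m = re.match(re.escape(lit), text)
        if m and (best is None or len(m.group(0)) > len(best)):
            best = m.group(0)
    return best

rules = ["if", "iffy"]
first_pick = leftmost_first(rules, "iffy")      # picks "if"
longest_pick = leftmost_longest(rules, "iffy")  # picks "iffy"
print(first_pick, longest_pick)
```

With the rules listed the other way round both functions pick `iffy`, which is why order matters under the Perl rule and only breaks ties under the longest-match rule.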

Regards,
Jo
