Related articles |
---|
[9 earlier articles] |
Re: regular expressions wendt@CS.ColoState.EDU (1993-03-22) |
Regular Expressions rafae1@hp.fciencias.unam.mx (trejo ortiz alejandro augusto) (1995-10-16) |
Re: Regular Expressions mnp@compass-da.com (Mitchell Perilstein) (1995-10-23) |
Re: Regular Expressions cgh@cs.rice.edu (1995-10-29) |
Re: Regular Expressions odunlain@maths.tcd.ie (Colm O'Dunlaing) (1995-10-31) |
Re: Regular Expressions natasha@softlab.ece.ntua.gr (1995-11-03) |
Re: Regular Expressions sjmccaug@prairienet.org (1995-11-28) |
Re: Regular Expressions jmccarty@spdmail.spd.dsccc.com (1995-11-29) |
Newsgroups: | comp.compilers |
From: | sjmccaug@prairienet.org (Scott J. McCaughrin) |
Keywords: | DFA |
Organization: | University of Illinois at Urbana |
References: | 95-11-033 95-10-087 |
Date: | Tue, 28 Nov 1995 08:26:05 GMT |
In a previous article, odunlain@maths.tcd.ie (Colm O'Dunlaing) says:
>1) Is there a canonical form for regular expressions(over a finite alphabet)?
> Yes, sort of. Convert a given regular expression R to a
> nondeterministic finite automaton N, then convert N to a
> determininistic finite automaton D, then apply minimisation to
> D to construct a minimal d.f.a. M. The automaton M is unique
> for the regular set it accepts, so M can be taken as a canonical
> form for the regular expression (I wouldn't bother about converting
> M back to a regular expression).
This sounds more like a recipe for building the recognizer. So far as r.e.'s
per se are conscerned, there is no such stipulation as "canonical form".
> Conversion from N to D may increase the number of states
> exponentially. So the canonical form could be very
> large. It is almost certain that any other choice of
> canonical form could also be very large, since the
> problem: Are R and S equivalent? (where R and S are
> regular expressions) is PSPACE complete.
>
> See
> 'Introduction to automata theory, languages, and computation,'
> by Hopcroft and Ullman.
In any event, this is certainly not how r.e. recognizers are built in
practice.
>3) An ambiguous grammar for regular expressions over the alphabet {a, b}is the
>following:
> R::=RR | R + R | R* | (R) |a|b
>The question is: How can I state an unambiguous grammar for regular
>expressions?
>
> (You probably need two more symbols \emptyset and \lambda
> for the empty set and the empty string. However...)
Well, \emptyset can be provably excluded from the set recognized by a
d.f.s.a. (in fact, even from the n.f.s.a.)
>
> I think, but am not absolutely certain, that the following
> grammar will work. Implicitly, it prioritises operators
> in the order * > concatenation > +, and orders evaluation
> from right to left (I think).
>
No it doesn't.
> A::= (R)|a|b
> B::=A*|B*
> C::=AR|BR
> D::=A+R|B+R|C+R
> R::=A|B|C|D
>
How can it prioritize operators if B,C,D are all top-level yet each
denotes a different operation? For starters, you need the start-symbol
(R) to produce only right-hand sides involving the lowest-priority ops.
But hey -- nice try!
-- Scott McC.
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.