Re: Alternative Syntax for Regular Expressions?

spinoza1111@yahoo.com (Edward G. Nilges)
20 Oct 2001 21:30:31 -0400

          From comp.compilers

From: spinoza1111@yahoo.com (Edward G. Nilges)
Newsgroups: comp.compilers
Date: 20 Oct 2001 21:30:31 -0400
Organization: http://groups.google.com/
References: 01-10-029 01-10-072 01-10-081
Keywords: lex
Posted-Date: 20 Oct 2001 21:30:31 EDT

ralph@inputplus.demon.co.uk (Ralph Corderoy) wrote in message news:01-10-081...
> Hi Edward,
>
> > In Hopcroft and Ullman's 1973 book FORMAL LANGUAGES AND THEIR
> > RELATION TO AUTOMATA they were among the first to reveal the
> > discovery that regular expressions corresponded to a particular type
> > of language, "Chomsky Type 0" which they named in honor of MIT's
> > Noam Chomsky who is both a pioneer in linguistics and a political
> > gadfly.
>
> Isn't Type 3 the regular grammar under Chomsky's classification?


Oops. You are right. I worked from memory and got the numeric
sequence backwards. My apologies. I meant the most restrictive type,
Type 3 (the regular languages); the numbering scheme is arbitrary.


> > Backus-Naur grammars are more readable, by several orders of
> > magnitude, than regular expressions.
>
> Just change the regular expression syntax. Perl has done this. So's
> Python. And lex gave names to parts of patterns.
>
> > ^(\([0-9]{3}\)[ ]{1}){0,1}[0-9]{3}\-[0-9]{4}$ Yecchhhh
>
> Just using some of these changes gives
>
> ^(\(\d\d\d\) )?\d\d\d-\d\d\d\d$
>
> Why use a character class for the single space? Why suffix that class
> with {1} which is redundant? Why use {0,1} instead of ?? Why escape
> the dash?
>
> You've made it more cluttered than necessary.
>
> > phoneNumber := STARTOFINPUT phoneNumberBody ENDOFINPUT
> > phoneNumberBody := localPhoneNumber
> > phoneNumberBody := areaCode SPACE localPhoneNumber
> > areaCode := [0-9]{3}
> > localPhoneNumber := prefix DASH suffix
> > prefix := [0-9]{3}
> > suffix := [0-9]{4}
>
> Some of those lines could be done exactly the same way in lex as I
> mentioned earlier.
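
For what it is worth, here is a minimal sketch of the "give names to parts of the pattern" idea outside lex. It is in Python rather than lex or VB.Net, and the names (AREA_CODE, is_phone_number, and so on) are illustrative assumptions borrowed from the grammar rules quoted above. The parentheses around the area code follow the regular expression rather than the grammar.

```python
import re

# Named building blocks, mirroring the grammar rules quoted above.
# All of these names are illustrative, not part of any library.
DIGIT = r"[0-9]"
AREA_CODE = rf"\({DIGIT}{{3}}\)"          # e.g. (555)
PREFIX = rf"{DIGIT}{{3}}"                 # e.g. 123
SUFFIX = rf"{DIGIT}{{4}}"                 # e.g. 4567
LOCAL_PHONE_NUMBER = rf"{PREFIX}-{SUFFIX}"
PHONE_NUMBER_BODY = rf"(?:{AREA_CODE} )?{LOCAL_PHONE_NUMBER}"

PHONE_NUMBER = re.compile(PHONE_NUMBER_BODY)


def is_phone_number(text: str) -> bool:
    """Return True if the whole string is a phone number.

    fullmatch() plays the role of STARTOFINPUT ... ENDOFINPUT.
    """
    return PHONE_NUMBER.fullmatch(text) is not None
```

The pieces are still just strings pasted together into one regular expression, so this buys readability but no extra recognizing power.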


The critical difference is that there are context-free and
context-sensitive languages that regular expressions cannot recognize,
including strings of balanced parentheses nested to arbitrary depth.
Making the regular expression readable using Perl tricks is not an
option available to me in VB.Net. Advising me to work in Perl is like
advising me to move to France: attractive, but it has a nonzero cost.
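
To make the balanced-parentheses point concrete, here is a small sketch, in Python rather than VB.Net and with a hypothetical function name. It tracks nesting depth with an unbounded counter, which is the piece of state that a finite automaton (and so a classical regular expression) does not have.

```python
def is_balanced(text: str) -> bool:
    """Return True if every '(' in text has a matching ')'.

    Characters other than parentheses are ignored.
    """
    depth = 0  # current nesting depth; may grow without bound
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A ')' appeared before its matching '('.
                return False
    # Balanced only if every '(' was eventually closed.
    return depth == 0
```

A grammar rule such as balanced := "(" balanced ")" balanced | EMPTY says the same thing declaratively, which is the kind of rule I would like to be able to write.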

