Re: Regular expression string searching & matching

Clint O <clint.olsen@gmail.com>
Thu, 8 Mar 2018 22:53:37 -0800 (PST)

          From comp.compilers


From: Clint O <clint.olsen@gmail.com>
Newsgroups: comp.compilers
Date: Thu, 8 Mar 2018 22:53:37 -0800 (PST)
Organization: Compilers Central
References: 18-03-016 18-03-032
Injection-Date: Fri, 09 Mar 2018 06:53:37 +0000
Keywords: lex, DFA, comment

Hi Ben:


Thanks for your post. I did try your regular expression (and a few small
variations on it), but it exhibits the same behavior as the others I have
tried.


The difference with the complement version is that the accepting state I end
up with has every transition going to the error state (which guarantees
termination after a match), whereas the DFAs built from these expressions seem
to keep accepting characters even after matching the closing '*/'. It's
possible I have a bug in my implementation, so I'm still looking at it.
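
To make the expected behavior concrete, here is a minimal sketch of a hand-built DFA for C-style comments in which the accepting state sends every character to the error state. The state names and the step() and longest_match() helpers are hypothetical, written only for illustration; they are not taken from my implementation or from lexertl.

```python
# Hypothetical hand-built DFA for C-style block comments.
START, SLASH, BODY, STAR, ACCEPT, ERROR = range(6)

def step(state, ch):
    """Return the next state for one input character."""
    if state == START:
        return SLASH if ch == "/" else ERROR
    if state == SLASH:
        return BODY if ch == "*" else ERROR
    if state == BODY:
        # Inside the comment: a '*' might begin the closing '*/'.
        return STAR if ch == "*" else BODY
    if state == STAR:
        if ch == "/":
            return ACCEPT
        # More stars keep us here; anything else is ordinary comment text.
        return STAR if ch == "*" else BODY
    # ACCEPT and ERROR: every character leads to ERROR,
    # so nothing can be consumed after the closing '*/'.
    return ERROR

def longest_match(text):
    """Return the length of the longest accepted prefix, or -1 if none."""
    state = START
    last_accept = -1
    for i, ch in enumerate(text):
        state = step(state, ch)
        if state == ERROR:
            break
        if state == ACCEPT:
            last_accept = i + 1
    return last_accept
```

With this shape, longest_match("/* a */ x */") stops after the first '*/' because the accepting state has no way forward. If a generated DFA keeps consuming input past that point, its accepting state must still have live outgoing transitions.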


Thanks,


-Clint


On Wednesday, March 7, 2018 at 11:59:10 AM UTC-8, Ben Hanson wrote:
> [/][*]([^*]|[*]+[^*/])*[*]+[/]
>
> is what you are looking for. I ran into this when developing my lexer
> generator library lexertl in C++. Having a debug::dump() function
> really helped me grok what was going on.
>
> The trick of course is realising that you have to exclude the
> characters that follow (i.e. the [^*/] part). That is the bit that
> clobbers the greedy behaviour. I've had to remind myself of that on
> more than one occasion recently!
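
As a quick sanity check of the quoted expression, the sketch below runs it through Python's re module. Note the assumption: re is a backtracking matcher, not a DFA generator, so this only confirms which strings the expression accepts and says nothing about how any particular DFA construction behaves. The match_comment() helper is hypothetical.

```python
import re

# The expression quoted above, unchanged.
COMMENT = re.compile(r"[/][*]([^*]|[*]+[^*/])*[*]+[/]")

def match_comment(text):
    """Return the comment matched at the start of text, or None."""
    m = COMMENT.match(text)
    return m.group(0) if m else None

if __name__ == "__main__":
    # The match should end at the first closing '*/'.
    print(match_comment("/* a */ b */"))
    # Runs of stars inside the comment and before the close are allowed.
    print(match_comment("/* a ** b **/"))
```

Because the loop body can never consume a '*' that is directly followed by '/', the match cannot run past the first '*/', even though the repetition itself is greedy.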


[This should work; it's a standard example in compiler texts. -John]

