Re: Regular expression string searching & matching

Clint O <clint.olsen@gmail.com>
Thu, 22 Mar 2018 17:46:03 GMT

          From comp.compilers

Related articles
[9 earlier articles]
Re: Regular expression string searching & matching DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2018-03-13)
Re: Regular expression string searching & matching jamin.hanson@googlemail.com (Ben Hanson) (2018-03-13)
Re: Regular expression string searching & matching jamin.hanson@googlemail.com (Ben Hanson) (2018-03-13)
Re: Regular expression string searching & matching clint.olsen@gmail.com (Clint O) (2018-03-17)
Re: Regular expression string searching & matching clint.olsen@gmail.com (Clint O) (2018-03-18)
Re: Regular expression string searching & matching clint.olsen@gmail.com (Clint O) (2018-03-20)
Re: Regular expression string searching & matching clint.olsen@gmail.com (Clint O) (2018-03-22)
| List of all articles for this month |

From: Clint O <clint.olsen@gmail.com>
Newsgroups: comp.compilers
Date: Thu, 22 Mar 2018 17:46:03 GMT
Organization: Newshosting.com - Highest quality at a great price! www.newshosting.com
References: 18-03-016 18-03-032 18-03-034 18-03-035 18-03-041 18-03-045 18-03-054 18-03-087
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="95950"; mail-complaints-to="abuse@iecc.com"
Keywords: lex, DFA
Posted-Date: 22 Mar 2018 20:43:53 EDT

On 2018-03-20, Clint O <clint.olsen@gmail.com> wrote:
> [ reposted to try to make the special characters look right ]
>
> q0: /·[*]·([^*] | [*]+·[^/])*·[*]+·/
> [/] q2
> ['\x00'-.0-ÿ] q1
> q1: ∅
> ['\x00'-ÿ] q1
> q2: [*]·([^*] | [*]+·[^/])*·[*]+·/
> [*] q3
> ['\x00'-)+-ÿ] q1
> q3: ([^*] | [*]+·[^/])*·[*]+·/
> [*] q4
> ['\x00'-)+-ÿ] q3
> q4: ([*]*·[^/]·([^*] | [*]+·[^/])*·[*]+ | [*]*)·/
> [*] q6
> ['\x00'-)+-.0-ÿ] q3
> [/] q5
> q5: ε
> ['\x00'-ÿ] q1
> q6: (([*]*·[^/] | ε)·([^*] | [*]+·[^/])*·[*]+ | [*]*)·/
> [*] q8
> ['\x00'-)+-.0-ÿ] q3
> [/] q7
> q7: ([^*] | [*]+·[^/])*·[*]+·/ | ε
> [*] q4
> ['\x00'-)+-ÿ] q3
> q8: ((([*]*·[^/] | ε)·([^*] | [*]+·[^/])* | [*]*·[^/]·([^*] | [*]+·[^/])*)·[*]+ | [*]*)·/
> [*] q8
> ['\x00'-)+-.0-ÿ] q3
> [/] q7


Thanks John for reposting this. It looks much better now.


In summary:


q5,q7 are accepting states since they contain epsilon. q2 represents the
error state.


The key to success with this algorithm is recognizing previously calculated
derivatives/expressions. When you no longer calculate unique derivatives,
the DFA construction terminates. As you can see the expressions can get
hairy pretty quickly. I don't know if you can glean much from the
successive expressions generated. It's akin to the method of walking the
parse tree but it bypasses the NFA construction entirely.


Thanks,


-Clint


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.