Related articles |
---|
[4 earlier articles] |
Re: re2c-1.0 released! gneuner2@comcast.net (George Neuner) (2017-09-02) |
Re: re2c-1.0 released! skvadrik@gmail.com (Ulya Trofimovich) (2017-09-03) |
Re: re2c-1.0 released! jamin.hanson@googlemail.com (Ben Hanson) (2017-09-03) |
Re: re2c-1.0 released! jamin.hanson@googlemail.com (Ben Hanson) (2017-09-03) |
Re: re2c-1.0 released! 398-816-0742@kylheku.com (Kaz Kylheku) (2017-09-03) |
Re: re2c-1.0 released! skvadrik@gmail.com (Ulya Trofimovich) (2017-09-03) |
Re: re2c-1.0 released! skvadrik@gmail.com (Ulya Trofimovich) (2017-09-03) |
Re: re2c-1.0 released! jamin.hanson@googlemail.com (Ben Hanson) (2017-09-04) |
Re: re2c-1.0 released! skvadrik@gmail.com (Ulya Trofimovich) (2017-09-08) |
From: | Ulya Trofimovich <skvadrik@gmail.com> |
Newsgroups: | comp.compilers |
Date: | Sun, 3 Sep 2017 18:31:39 +0100 |
Organization: | Compilers Central |
References: | 17-08-007 17-09-001 17-09-003 |
Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="55066"; mail-complaints-to="abuse@iecc.com" |
Keywords: | lex |
Posted-Date: | 03 Sep 2017 13:44:54 EDT |
Content-Language: | en-US |
> Thanks for this. I will take a look at the implementation when I get
> the time.
>
> I'm most interested in your refinements/bug fixes to the original
> TDFA algorithms and I would like to see this used as a runtime regex
> library more than for a lexer generator.
Using TDFA in a runtime library has certain downsides:
1. It implies the runtime overhead on (lazy) determinization: it's only
useful if the regexp is compiled once and used many times. Libraries
such as RE2 (by Russ Cox) or TRE (by Ville Laurikari) mostly use NFA for
submatching.
2. Also, due to ahead-of-time compilation lexer generators are able to
perform extensive optimization of the generated code, including
minimization of the number of submatch variables and operations on them.
A runtime library cannot afford that.
However, if TDFA algorithm is to be used in a library, then my paper
contains two improvements worth consideration:
1. TDFA(1) with one-symbol lookahead instead of default TDFA(0)
2. Formal algorithm for POSIX disambiguation semantics.
I'd be glad to answer any questions anyway.
Ulya
Return to the
comp.compilers page.
Search the
comp.compilers archives again.