Re: State-of-the-art algorithms for lexical analysis?

gah4 <gah4@u.washington.edu>
Sun, 5 Jun 2022 16:05:38 -0700 (PDT)

          From comp.compilers

Related articles
State-of-the-art algorithms for lexical analysis? costello@mitre.org (Roger L Costello) (2022-06-05)
Re: State-of-the-art algorithms for lexical analysis? gah4@u.washington.edu (gah4) (2022-06-05)
Re: State-of-the-art algorithms for lexical analysis? DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2022-06-06)
Re: State-of-the-art algorithms for lexical analysis? costello@mitre.org (Roger L Costello) (2022-06-06)
Re: State-of-the-art algorithms for lexical analysis? 480-992-1380@kylheku.com (Kaz Kylheku) (2022-06-06)
Re: State-of-the-art algorithms for lexical analysis? gah4@u.washington.edu (gah4) (2022-06-06)
References for PSL ? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-06-06)
State-of-the-art algorithms for lexical analysis? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-06-06)
[7 later articles]

Newsgroups: comp.compilers
Organization: Compilers Central
References: 22-06-006
Keywords: lex
Posted-Date: 05 Jun 2022 21:11:12 EDT
In-Reply-To: 22-06-006

On Sunday, June 5, 2022 at 2:08:12 PM UTC-7, Roger L Costello wrote:


(snip)


> Are regular expressions still the best way to specify tokens?


Some years ago I worked with a company that sold hardware search
processors to a certain three-letter agency that we are not
supposed to mention, but everyone knows.


That hardware used a completely different language, PSL (Pattern
Specification Language), much more powerful than the usual regular
expressions.


Both standard and extended regular expressions are nice, in that we
have gotten used to them, especially through grep, without thinking
too much about them.


I suspect, though, that if they hadn't previously been defined, we
might come up with something different today.


Among other features, PSL can define approximate matches, such as a
word with one or more misspellings, that is, insertions, deletions,
or substitutions. Usual REs don't have that ability.
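PSL itself is proprietary, but the idea can be sketched in a few lines: an approximate match within k edits is just a Levenshtein-distance check. The function names below are my own illustration, not PSL syntax.

```python
# A minimal sketch (not PSL): match a word allowing up to max_edits
# insertions, deletions, or substitutions, via Levenshtein distance.

def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def approx_match(word, text, max_edits=1):
    """True if any whitespace-separated token is within max_edits of word."""
    return any(edit_distance(word, tok) <= max_edits
               for tok in text.split())

print(approx_match("pattern", "a patern matcher"))  # one deletion -> True
print(approx_match("pattern", "no such word here")) # False
```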


There are also PSL expressions for ranges of numbers.
You can often do that with very complicated REs, enumerating
all of the digit-by-digit possibilities; PSL generates those
possibilities automatically. (Some ranges expand to complicated
expressions.)
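The classic illustration is matching the range 0-255 (an IPv4 octet): with ordinary REs you must enumerate the digit patterns case by case, exactly the expansion a range operator would produce for you.

```python
import re

# Matching the numeric range 0-255 with a classic RE requires
# spelling out the digit patterns case by case; a range operator
# (as in PSL) would generate this alternation automatically.
OCTET = re.compile(r'25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9]')

print(bool(OCTET.fullmatch("255")))  # True
print(bool(OCTET.fullmatch("256")))  # False
print(bool(OCTET.fullmatch("0")))    # True
```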


I suspect that in many cases the usual RE is not optimal for
lexical analysis, other than being well known.


But as noted, DFAs are likely the best way to execute them.
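A DFA-based scanner, as generated by tools like lex/flex, does one transition-table lookup per input character with no backtracking. Here is a toy table-driven sketch (states and token names are mine, for illustration) that distinguishes identifiers from numbers:

```python
# Toy table-driven DFA scan: one table lookup per character.
# States: 0 = start, 1 = in identifier, 2 = in number.

def char_class(c):
    if c.isalpha() or c == '_':
        return 'letter'
    if c.isdigit():
        return 'digit'
    return 'other'

TRANSITIONS = {
    (0, 'letter'): 1, (1, 'letter'): 1, (1, 'digit'): 1,
    (0, 'digit'): 2, (2, 'digit'): 2,
}
ACCEPT = {1: 'IDENT', 2: 'NUMBER'}

def scan(token):
    """Return the token kind, or None if the DFA rejects."""
    state = 0
    for c in token:
        state = TRANSITIONS.get((state, char_class(c)))
        if state is None:
            return None
    return ACCEPT.get(state)

print(scan("x42"))  # IDENT
print(scan("42"))   # NUMBER
print(scan("42x"))  # None (digits may not be followed by letters here)
```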


Though that could change with changes in computer hardware.

