Re: Parallel lexers by chunking the input

gah4 <gah4@u.washington.edu>
Fri, 25 Mar 2022 07:58:04 -0700 (PDT)

          From comp.compilers

Related articles
Parallel lexers by chunking the input christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-03-24)
Re: Parallel lexers by chunking the input anton@mips.complang.tuwien.ac.at (2022-03-25)
Re: Parallel lexers by chunking the input gah4@u.washington.edu (gah4) (2022-03-25)
Re: Parallel lexers by chunking the input anton@mips.complang.tuwien.ac.at (2022-03-25)
Parallel lexers by chunking the input christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-03-26)
Re: Parallel lexers by chunking the input anton@mips.complang.tuwien.ac.at (2022-03-26)
| List of all articles for this month |

From: gah4 <gah4@u.washington.edu>
Newsgroups: comp.compilers
Date: Fri, 25 Mar 2022 07:58:04 -0700 (PDT)
Organization: Compilers Central
References: 22-03-058 22-03-064 22-03-065
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="48623"; mail-complaints-to="abuse@iecc.com"
Keywords: lex, parallel
Posted-Date: 25 Mar 2022 10:59:33 EDT
In-Reply-To: 22-03-065

On Friday, March 25, 2022 at 4:45:41 AM UTC-7, Anton Ertl wrote:


(snip)


> Yes. But you can represent this uncertainty in the state of the DFA.
> Essentially you can take the original DFA, and make all states of that
> (nondeterministic) start states of our continuation DFA. Process the
> pieces of the input in parallel with the continuation automaton.


At some point, it is Aho-Corasick algorithm.


That works well if you are looking for a large number of things,
and you don't know where they start. The DFA needs one bit,
in addition to the next state indicator, to indicate that you
found something. A favorite implementation is the low bit
of a pointer on a byte addressed system, which will normally
be zero.


I suspect this might be used for malware search programs,
which look for any of a large list of matching strings though a
large file. Not quite as useful for programming language
scanners, though.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.