Related articles |
---|
Parallel lexers by chunking the input christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-03-24) |
Re: Parallel lexers by chunking the input anton@mips.complang.tuwien.ac.at (2022-03-25) |
Re: Parallel lexers by chunking the input gah4@u.washington.edu (gah4) (2022-03-25) |
Re: Parallel lexers by chunking the input anton@mips.complang.tuwien.ac.at (2022-03-25) |
Parallel lexers by chunking the input. christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-03-26) |
Parallel lexers by chunking the input christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-03-26) |
Re: Parallel lexers by chunking the input anton@mips.complang.tuwien.ac.at (2022-03-26) |
From: | anton@mips.complang.tuwien.ac.at (Anton Ertl) |
Newsgroups: | comp.compilers |
Date: | Fri, 25 Mar 2022 08:04:59 GMT |
Organization: | Institut fuer Computersprachen, Technische Universitaet Wien |
References: | 22-03-058 22-03-064 |
Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="97958"; mail-complaints-to="abuse@iecc.com" |
Keywords: | lex, parallel, comment |
Posted-Date: | 25 Mar 2022 07:45:38 EDT |
Christopher F Clark <christopher.f.clark@compiler-resources.com> writes:
>But, the real issue is strings, comments, and markdown like features.
>They all suffer from the same issue, which is that they are generally
>unbounded strings that start and stop with very short sequences. And,
>by picking some arbitrary point in the future to start scanning, you
>cannot be sure whether you are inside or outside one of those
>features.
Yes. But you can represent this uncertainty in the state of the DFA.
Essentially you can take the original DFA, and make all states of that
(nondeterministic) start states of our continuation DFA. Process the
pieces of the input in parallel with the continuation automaton.
Now you know the end state of the original automaton the
first piece, and the end state of the continuation automaton of the
second piece; you can look up the end state of the original automaton
for the second piece from these informations. And so on for the
further pieces. This is a sequential component, but it is fast.
Now you can reprocess the second-to-last pieces in parallel using the
original automaton and performing the actions (e.g., converting
numbers into binary representation).
Not sure how to interface that to the parser, but if we have a similar
idea for parallelizing the parser, even the lex/flex interface might
be ok.
- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/
[Experiments would be useful here. Yes, you can do this, but do you
end up throwing away so much work that it's not worth it? -John]
Return to the
comp.compilers page.
Search the
comp.compilers archives again.