Related articles |
---|
Integer sizes and DFAs christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-03-26) |
Re: Integer sizes and DFAs 480-992-1380@kylheku.com (Kaz Kylheku) (2022-03-26) |
RE: Integer sizes and DFAs christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-03-27) |
Re: Integer sizes and DFAs gah4@u.washington.edu (gah4) (2022-03-26) |
Re: Integer sizes and DFAs gah4@u.washington.edu (gah4) (2022-03-26) |
RE: Integer sizes and DFAs christopher.f.clark@compiler-resources.com (Christopher F Clark) (2022-03-27) |
From: | gah4 <gah4@u.washington.edu> |
Newsgroups: | comp.compilers |
Date: | Sat, 26 Mar 2022 19:32:17 -0700 (PDT) |
Organization: | Compilers Central |
References: | 22-03-073 |
Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="97762"; mail-complaints-to="abuse@iecc.com" |
Keywords: | lex, performance |
Posted-Date: | 26 Mar 2022 22:39:33 EDT |
In-Reply-To: | 22-03-073 |
On Saturday, March 26, 2022 at 4:42:55 PM UTC-7, Christopher F Clark wrote:
(snip)
> And, my point was 2**32 is large enough to be considered arbitrarily large with
> respect to most DFAs. Not quite the human genome, see extended analysis
> below. Here was my first analysis.
About 24 years ago I was working with a DNA sequencing group, and was
interested in speeding up this kind of search. The approach I was most
interested in was special-purpose hardware with many of the largest DRAMs
I could find, arranged just to do this operation.
(Note that you need one more bit, to indicate when a match is found.)
There would be logic to read data off disk, and pass it directly to the DFA
array. There would also be logic to store the offset into the disk file, and
the state at which the hit occurred, to be read out later.
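As a software illustration only (not the hardware design, which was never built), the scheme can be sketched as a flat transition table in which each entry holds the next state plus one extra high bit marking a match. The 31-bit state field, the single-pattern KMP-style table construction, and the names `build_table` and `scan` are all assumptions made for this sketch.

```python
ALPHABET = "ACGT"
SYM = {c: i for i, c in enumerate(ALPHABET)}

STATE_BITS = 31                 # assumed width of the state field
MATCH_BIT = 1 << STATE_BITS     # the "one more bit" that flags a hit
STATE_MASK = MATCH_BIT - 1


def build_table(pattern):
    """Build a KMP-style DFA for one pattern as a flat list.

    Entry table[state * 4 + symbol] holds the next state; the transition
    that completes the pattern also has MATCH_BIT set.
    """
    m, k = len(pattern), len(ALPHABET)
    table = [0] * (m * k)
    table[SYM[pattern[0]]] = 1
    x = 0                       # restart state (state after pattern[1..j])
    for j in range(1, m):
        for c in range(k):      # mismatches behave like the restart state
            table[j * k + c] = table[x * k + c]
        table[j * k + SYM[pattern[j]]] = j + 1
        x = table[x * k + SYM[pattern[j]]]
    # The completing transition goes to the restart state, flagged as a hit,
    # so overlapping occurrences are still found.
    table[(m - 1) * k + SYM[pattern[-1]]] = x | MATCH_BIT
    return table


def scan(table, text):
    """Stream text through the table, recording (offset, state) per hit.

    offset is the index of the last symbol of the match; state is the
    state from which the matching transition was taken.
    """
    k = len(ALPHABET)
    hits = []
    state = 0
    for offset, ch in enumerate(text):
        word = table[state * k + SYM[ch]]   # one table lookup per symbol
        if word & MATCH_BIT:
            hits.append((offset, state))
        state = word & STATE_MASK
    return hits
```

For example, `scan(build_table("ACA"), "ACACAGACA")` returns `[(2, 2), (4, 2), (8, 2)]`: three hits, two of them overlapping. The inner loop is a single lookup per input symbol, which is the part the DRAM array would do.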
But we went on to other projects, and I never got to build one.
Since then, DRAM has gotten much larger, but so has the DNA database.
Yes, the human genome is about 3 gigabases, but the whole of GenBank is
now about 16 terabases, including WGS (whole genome shotgun) sequences.
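To relate those sizes to integer widths, here is a rough calculation of how many bits a hit offset needs, assuming one offset per base and taking the round figures above (3 gigabases, 16 terabases) at face value. The helper name `bits_needed` is made up for this sketch.

```python
def bits_needed(n):
    """Smallest number of bits that can represent every value in 0..n-1."""
    return max(1, (n - 1).bit_length())


HUMAN_GENOME_BASES = 3 * 10**9    # "3 gigabases", rounded
GENBANK_BASES = 16 * 10**12       # "about 16 terabases", rounded

genome_offset_bits = bits_needed(HUMAN_GENOME_BASES)   # 32
genbank_offset_bits = bits_needed(GENBANK_BASES)       # 44
```

So an offset into the human genome just fits in 32 bits, while an offset into all of GenBank needs about 44.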