Parsing using a Graphics Processing Unit (GPU)?

Christopher F Clark <christopher.f.clark@compiler-resources.com>
Wed, 2 Sep 2020 01:14:06 +0300

          From comp.compilers

Related articles
Parsing using a Graphics Processing Unit (GPU)? costello@mitre.org (Roger L Costello) (2020-08-31)
Re: Parsing using a Graphics Processing Unit (GPU)? auriocus@gmx.de (Christian Gollwitzer) (2020-09-01)
Re: Parsing using a Graphics Processing Unit (GPU)? minforth@arcor.de (A. K.) (2020-09-01)
Re: Parsing using a Graphics Processing Unit (GPU)? DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2020-09-01)
Parsing using a Graphics Processing Unit (GPU)? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2020-09-02)
Re: Parsing using a Graphics Processing Unit (GPU)? elronnd@elronnd.net (Elijah Stone) (2020-09-01)
Re: Parsing using a Graphics Processing Unit (GPU)? arnold@skeeve.com (2020-09-02)
Re: Parsing using a Graphics Processing Unit (GPU)? 0xe2.0x9a.0x9b@gmail.com (Jan Ziak) (2020-09-02)
Re: Parsing using a Graphics Processing Unit (GPU)? costello@mitre.org (Roger L Costello) (2020-09-02)
Re: Parsing using a Graphics Processing Unit (GPU)? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2020-09-02)
Re: Parsing using a Graphics Processing Unit (GPU)? DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2020-09-02)
[2 later articles]
| List of all articles for this month |

From: Christopher F Clark <christopher.f.clark@compiler-resources.com>
Newsgroups: comp.compilers
Date: Wed, 2 Sep 2020 01:14:06 +0300
Organization: Compilers Central
References: 20-09-001 20-09-002
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="3620"; mail-complaints-to="abuse@iecc.com"
Keywords: parse, performance
Posted-Date: 01 Sep 2020 18:27:16 EDT

Our esteemed moderator wrote:
[Parsing is not usually an important factor in compiler performance.
The slow parts are the lexer, because it has to look at every
character of the input, and some optimizations that have to analyze
the entire intermediate form of the program. The first step in lexing
is to identify what class each character is, e.g., identifier, white
space, or operator. Perhaps a GPU could do vector lookups to speed
that up. For optimizations, I can sort of imagine how some analyses
like reachability might be expressible as matrices. -John]


As to parallelizing lexers, we did some experiments on speeding up
regular expressions (mostly for Snort) while I was at Intel. Charlie
Lasswell did experiments on starting parallel scans on different parts
of a stream or file that showed some promise. The hyperscan team (aka
Sensory Networks, Geoff Langdale &co) had some interesting results on
when you could skip ahead and guess the correct state when using
NFAs--I believe they called it "and then a miracle happens" (following
the famous cartoon about a mathematical proof with some missing steps
on a whiteboard). Michela Becchi worked on when you could predict
states involved in xFA loops would terminate. Geoff had me do a mode
they called Maclellan (the HS folks had some interesting naming
conventions, one of them was the use of US Civil War generals for
major variations/modes) which was turning NFAs into parallelized DFAs
for Hyperscan. The regular expression engine we put into the Cave
Creek chip, did a different form of parallel DFAs, spawning them when
certain conditions were met, and for things like the Aho-Corasick
algorithm you can show that there tends to be a "bushy" part near the
head of the automata, with "skinny" parts being forked off after a
small number of bytes--that was the key insight that led to our chip.


So, there are definite possibilities for doing SIMD regular expression
matching that would be suitable for GPU implementation. Geoff and I
discussed how to approach it on numerous occasions. It just never
quite got past the to-consider list.


Some of this "could" also be applied to parsing, particularly in a GLR
style parser, where you have forests. But our moderator is correct in
saying the big boost would be in speeding up lexing.


--
******************************************************************************
Chris Clark email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
------------------------------------------------------------------------------


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.