|table-driven scanners strike back ihnp4!utzoo!henry (1987-01-29)|
|Date:||Thu, 29 Jan 87 00:29:27 EST|
There has been a considerable body of folklore building up lately to the
effect that table-driven scanners are inherently slow. The Waite and
Heuring SP&E papers have helped to establish this belief. At Usenix,
Van Jacobson of LBL shot it down in flames.
"The Real Time Systems Group at LBL has used table-driven finite automata
to construct some of the world's fastest data acquisition and control
systems. We had doubts about the 'inherently slow' claim and were curious
why Lex went so slow. On investigation, we found a number of implementation
problems but nothing that looked fundamental..."
He spent about a week investigating Lex and tuning it for higher performance.
The result is an order of magnitude faster than the current Lex and is close
to a factor of two faster than the best hand-tuned scanners using all of
Waite's techniques. The speed of yylex() is within a factor of 2-3 of the
speed of getc() in simple cases. As a useful side effect, the new Lex's
tables are half the size of the old ones.
Part of the improvement was due to reorganizing Lex's tables to make the
inner loop of the scanner run faster. The other major item was making Lex
smarter, so that it generated the minimal inner loop needed for the specific
grammar, rather than always generating the worst-case inner loop. The final
version of the inner loop for straightforward cases (e.g. most programming
languages) is two instructions on a 68020.
The above is reported from Van's abstract and his talk. His paper didn't
get into the proceedings because of clearance problems; he said that he will
post it to mod.sources. He's also looking at posting diffs for the code,
although it will take some sorting out to avoid licensing violations.
Henry Spencer @ U of Toronto Zoology
[My impression of lex is that it was an experiment that pretty much worked,
but that nobody heretofore had cleaned it up to make it work well. -John]
Return to the
Search the comp.compilers archives again.