table-driven scanners strike back

Thu, 29 Jan 87 00:29:27 EST

          From comp.compilers

Related articles
table-driven scanners strike back ihnp4!utzoo!henry (1987-01-29)
| List of all articles for this month |

Date: Thu, 29 Jan 87 00:29:27 EST
From: ihnp4!utzoo!henry

There has been a considerable body of folklore building up lately to the
effect that table-driven scanners are inherently slow. The Waite and
Heuring SP&E papers have helped to establish this belief. At Usenix,
Van Jacobson of LBL shot it down in flames.

"The Real Time Systems Group at LBL has used table-driven finite automata
to construct some of the world's fastest data acquisition and control
systems. We had doubts about the 'inherently slow' claim and were curious
why Lex went so slow. On investigation, we found a number of implementation
problems but nothing that looked fundamental..."

He spent about a week investigating Lex and tuning it for higher performance.
The result is an order of magnitude faster than the current Lex and is close
to a factor of two faster than the best hand-tuned scanners using all of
Waite's techniques. The speed of yylex() is within a factor of 2-3 of the
speed of getc() in simple cases. As a useful side effect, the new Lex's
tables are half the size of the old ones.

Part of the improvement was due to reorganizing Lex's tables to make the
inner loop of the scanner run faster. The other major item was making Lex
smarter, so that it generated the minimal inner loop needed for the specific
grammar, rather than always generating the worst-case inner loop. The final
version of the inner loop for straightforward cases (e.g. most programming
languages) is two instructions on a 68020.

The above is reported from Van's abstract and his talk. His paper didn't
get into the proceedings because of clearance problems; he said that he will
post it to mod.sources. He's also looking at posting diffs for the code,
although it will take some sorting out to avoid licensing violations.

Henry Spencer @ U of Toronto Zoology
[My impression of lex is that it was an experiment that pretty much worked,
but that nobody heretofore had cleaned it up to make it work well. -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.