Making Generated Scanners Faster
From: "Stephen P. Butler" <email@example.com>
Date: 30 Jun 1996 16:54:08 -0400
I've been following the recent discussion on hand coded versus machine
generated scanners closely, with a certain amount of professional
interest.
Part of the research I'm doing for my PhD is aimed at improving the
performance of the scanner for the RDP compiler-compiler. In
particular we'd like to produce a version which has the following
properties:
a) Programmable (unlike the scanner for RDP 1.X which can
only return a small but useful number of token types).
b) Integrated into the existing RDP syntax so that
describing the lexical structure of your language is
just an extension of describing the syntactic
structure. This would avoid the need to learn the
syntax of two different tools as is currently the
case if you're using a combination such as flex and
c) Can have semantic actions embedded into the lexical
analysis phase in exactly the same way as they can be
embedded into the syntax analysis phase.
d) Faster than, or at the very least, close enough to the
speed of hand coded scanners so that the temptation to
write these by hand and subsequently have to debug and
maintain them at this level is removed.
e) Maintain the current portability of RDP and its
generated parsers and scanners such that they should
compile and work on any system that has sufficient
(hopefully still modest) resources and an ANSI/ISO C
conforming compiler and library.
As part of this work, I'm currently in the process of tracking down
as many scanner generators as I can find with the intention of doing
a comprehensive performance analysis on them. There are two
intentions in this work - i) To publish the information found to
allow further discussion and research/experimentation and ii) To
determine what makes certain of the generators faster or slower etc.
The ultimate aim of course is to use the results found to improve the
scanner for RDP.
Obviously, part of this exercise must try and determine the relative
performance of the machine generated scanners against carefully
written and optimised hand coded scanners - particularly since the
performance gains claimed between them and machine generated scanners
vary so much. Thus my motivation for posting to comp.compilers...
What I'd like to ask people is:
a) What "stunts" are people pulling in their hand coded
scanners that they feel make them faster than an
equivalent machine generated scanner?
b) Do you have a hand coded scanner for a language that
you feel particularly proud of and have carefully
optimised? Much of my work would be sped up
considerably if people were willing to let me borrow
existing scanners for comparative testing since it'll
save me a lot of the time needed to write and optimise
my own and it's likely that these will be a better
representation of hand coded scanners.
Currently, I'm particularly interested in scanners for C,
Pascal and Oberon/Oberon 2. I'd also be interested in
scanners for other languages as well though, particularly
if you believe your optimisations are really good or the
lexical structure of the language may be a good test of
a machine generated scanner's performance (Except perhaps
Thanks in advance for your help,
Stephen P. Butler. | Department of Computer Science.
(firstname.lastname@example.org) | Royal Holloway, University of London,
| Egham, Surrey TW20 0EX England.