From: qjackson@direct.ca (Quinn Tyler Jackson)
Newsgroups: comp.compilers
Date: 24 Jun 1996 11:03:40 -0400
Organization: Compilers Central
Keywords: lex, performance
On 23 Jun 1996 23:24:53 -0400, Ronald Kanagy wrote:
>Lex is good in situations where a language is still being designed and a
>scanner is to be quickly built. But, in production compilers, after the
>language has be designed and stable, lex scanners tend to be too slow
>compared to hand-coded scanners and is unacceptable. Therefore, one would
>normally find hand-coded scanners in these situations.
>[Has anyone actually timed a flex scanner vs. a hand-coded one? -John]
Not flex, but I did some timings of an LPM scanner generated at
run-time vs. a hand-coded version, and found the LPM scanner to be 25
times faster. I suspect (but have not verified) that a lex-type scanner
would beat even that.
Eventually, when it comes time to optimize CLpm, I will be running
benchmarks against lex, flex, and Spencer's regexp.c, using a suite of
about 20 REs. Something must first be proven "correct" before it is
made faster, however. ;-)
One point that hasn't been mentioned in the hand-vs-generated
lexical scanner debate is that hand-coded scanners tend to read like
nightmares. In one CLpm demo that scans a file for legal URLs, it
takes 55+ lines of C++ code to do what is accomplished in two lines of
CLpm semantics. Granted, there are 10,000 lines of interpreter
underneath those 2 lines, but now that CLpm is a class and everything
is effectively under the hood, it is much simpler to express
tokens in terms of patterns than in terms of raw C++. It also tends
to be easier to debug patterns than to go through a hand-crafted
scanner looking for a glitch. Moreover, it becomes much simpler to
optimize one central interpreter/generator than to surf through three
hundred switch/if/while statements looking for places to trim the fat.
I've implemented both types, and prefer generated/interpreted scanners
over hand-written ones any day.
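
To make the pattern-vs-raw-C++ contrast concrete, here is a minimal
sketch. It is not CLpm, whose syntax is not shown in this post;
std::regex stands in as the pattern engine. The URL pattern is a
deliberately simplified assumption (http/https only), and the names
find_urls_pattern, find_urls_hand, and is_url_char are hypothetical,
chosen for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Pattern-driven version: the token is described once, declaratively.
// Assumption: a "URL" is http:// or https:// followed by one or more
// characters from a small, simplified set.
std::vector<std::string> find_urls_pattern(const std::string& text) {
    static const std::regex url_re(R"(https?://[A-Za-z0-9._~/?#=&%+-]+)");
    std::vector<std::string> out;
    for (std::sregex_iterator it(text.begin(), text.end(), url_re), end;
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}

// Hand-coded version of the same (simplified) token definition.
static bool is_url_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) ||
           std::string("._~/?#=&%+-").find(c) != std::string::npos;
}

std::vector<std::string> find_urls_hand(const std::string& text) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < text.size()) {
        // Literal "http" must start here, otherwise advance one char.
        if (text.compare(i, 4, "http") != 0) { ++i; continue; }
        std::size_t j = i + 4;
        // Optional 's'.
        if (j < text.size() && text[j] == 's') ++j;
        // Literal "://".
        if (text.compare(j, 3, "://") != 0) { ++i; continue; }
        j += 3;
        // One or more URL body characters.
        const std::size_t body = j;
        while (j < text.size() && is_url_char(text[j])) ++j;
        if (j == body) { ++i; continue; }
        out.push_back(text.substr(i, j - i));
        i = j;  // resume scanning after the match
    }
    return out;
}
```

Both functions are meant to return the same matches for the same
input. The difference is where a change to the token definition
lands: in the first it is an edit to one pattern string, in the second
it is spread across the control flow.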
Cheers,
Quinn
--
Parsepolis Software || Quinn Tyler Jackson
"ParseCity" || qjackson@direct.ca
>------ http://mypage.direct.ca/q/qjackson/ ------>