Re: Lex surrogates

wpl@PRC.Unisys.COM (William P Loftus)
6 Feb 89 16:55:16 GMT

From comp.compilers

Related articles
Lex surrogates lfcs.edinburgh.ac.uk!db@NSS.CS.UCL.AC.UK (Dave Berry) (1989-02-05)
Re: Lex surrogates schmidt@ORION.CF.UCI.EDU (Douglas C. Schmidt) (1989-02-05)
Re: Lex surrogates vern@pistachio.ee.lbl.gov (Vern Paxson) (1989-02-06)
Re: Lex surrogates rsalz@BBN.COM (Rich Salz) (1989-02-07)
*Re: Lex surrogates wpl@PRC.Unisys.COM* (1989-02-06)**
Re: Lex surrogates ken@cs.rochester.edu (Ken Yap) (1989-02-09)
Re: Lex surrogates mike@arizona.edu (1989-02-09)
Re: Lex surrogates tower@bu-cs.BU.EDU (1989-02-10)
Re: Lex surrogates pardo@june.cs.washington.edu (1989-02-11)
Re: Lex surrogates henry@zoo.toronto.edu (Henry Spencer) (1989-02-11)
Re: Lex surrogates holt@turing.toronto.edu (Ric Holt) (1989-02-13)
[2 later articles]

| List of all articles for this month |

From:	wpl@PRC.Unisys.COM (William P Loftus)
Newsgroups:	comp.compilers
Date:	6 Feb 89 16:55:16 GMT
References:	<3286@ima.ima.isc.com>

Many of the lexical analyzer generators gain their speed from limiting
the specification power of their tools. They are taking advantage of
the fact that most languages don't need a general RE processor. In my
opinion, this limits the use of such tools to traditional language
lexical analyzers (they are not for what Aho et al. call swiss army
knives---see the dragon book). However, there is a full RE lexical
analyzer generator called Flex.

Flex is a lex clone, but much faster.

Here is a file contained in the distribution of flex (which is
freely available from several ftp and uucp sites) that describes
timing:

--
flex vs. lex timings for a C tokenizer which includes keywords:

Generation times:

lex 83.0 secs
flex 3.9
flex -cfe 7.1 # uncompressed table, equivalence classes
flex -cf 15.0 # uncompressed table, no equivalence classes

Scanner object file sizes:

lex 41.0K bytes
flex 9.4K
flex -cfe 49.6K
flex -cf 126.5K

Running times on a 28,088 line input (685K characters):

lex 29.8 secs
flex 19.3
flex -cfe 9.0
flex -cf 7.8

The timings were made on a Sun 3/60. All times are user + system CPU time,
and don't include hashing of identifiers.

Summary:

        For about the same sized scanner, you get a factor of 3 in performance.
        For a 30% faster scanner, you get a scanner 1/4th the size, and it's
generated in 1/20th the time.
        For a scanner that's 3 times larger, you get a factor of 3.8 in
performance.
--

There is a paper by Vern Paxson that describes how he obtains these speeds (I
don't have the paper with me at the moment, so I can't tell you the title).

William Loftus
--
William P Loftus UUCP: wpl@burdvax.UUCP
Unisys/Paoli Research Center ARPA: wpl@anarchy.prc.unisys.com
PO Box 517 215-354-0614 (home)
Paoli, PA 19301 215-648-7248 (work)
--

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Lex surrogates

wpl@PRC.Unisys.COM (William P Loftus)6 Feb 89 16:55:16 GMT

wpl@PRC.Unisys.COM (William P Loftus)
6 Feb 89 16:55:16 GMT