Re: Compiler Tools v. C

Scott Nicol <scott@INFOADV.MN.ORG>
Fri, 12 May 1995 03:44:35 GMT

          From comp.compilers

Related articles
Compiler Tools v. C johnf@cpsc.ucalgary.ca (1995-04-28)
Re: Compiler Tools v. C scott@INFOADV.MN.ORG (Scott Nicol) (1995-04-29)
Re: Compiler Tools v. C shepherd@schubert.sbi.com (1995-05-04)
Re: Compiler Tools v. C scott@INFOADV.MN.ORG (Scott Nicol) (1995-05-12)

Newsgroups: comp.compilers
From: Scott Nicol <scott@INFOADV.MN.ORG>
Keywords: tools, design, comment
Organization: Compilers Central
References: 95-05-045
Date: Fri, 12 May 1995 03:44:35 GMT

>>Lex, on the other hand, is not very useful. The man page for Lex at
>>Bell Labs has the following in the "BUGS" section:
>>
>> The asteroid to kill this dinosaur is still in orbit.
>
>That's the first time I've encountered this 'spin' on lex. Do you
>feel this way just about AT&T lex in particular, or about all "lex-like"
>programs (including Flex)?
>
>>If you want to do anything the least bit tricky, it is worth it to spend a
>>few hours and write a hand-crafted scanner. The code isn't difficult,
>>and you gain speed, flexibility, and portability.
>
>The authors of the O'Reilly book argue the opposite--that a hand-crafted
>scanner will take you longer to write, may not be much faster, and will
>almost certainly be buggier.


I think the O'Reilly book is very good, but I don't agree with the
authors about lex (or flex). I can write a fast, portable, and
unbuggy (at least as "unbuggy" as lex-generated code) scanner of
reasonable complexity in C just as fast as I can in lex.


Writing a scanner using lex is fast and simple, until you need to work
around its limitations. It's been a few years since I seriously looked
at lex (I was paid to do that then...), but some common lex limitations
include:


- incompatibilities/missing features between different implementations.


- fixed input buffer (yytext) and pushback buffer (yysbuf) size, with no
    overflow checking (flex does?).


- generated code is not portable to machines with different character
    sets.


- no support for wide character sets. Even if lex did support wide
    characters, if the current table scheme were used, the generated
    scanner would be enormous (it's too big already!).


These are just the first things that popped into my mind. I know it
is possible to work around these limitations, but in the time wasted
getting around these limitations you can easily produce a robust
hand-crafted scanner.
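
To make the comparison concrete, here is a rough sketch of what a small
hand-crafted scanner can look like. It is an illustration only, not code
from any real project: the token kinds, the struct layout, and the names
scanner_init, put_char, next_token, and scanner_free are all made up for
this example. It speaks to two of the points above: the token buffer grows
on demand instead of being a fixed-size array, and characters are
classified with <ctype.h> rather than with tables baked in for one
character set. It does not attempt wide characters.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token kinds for a toy language. */
enum tok_kind { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_PUNCT, TOK_ERROR };

struct scanner {
    const char *src;  /* next unread character of NUL-terminated input */
    char *text;       /* text of the current token (heap-allocated) */
    size_t len;       /* characters in text, not counting the NUL */
    size_t cap;       /* bytes allocated for text */
};

void scanner_init(struct scanner *s, const char *input)
{
    assert(s != NULL && input != NULL);
    s->src = input;
    s->text = NULL;
    s->len = 0;
    s->cap = 0;
}

void scanner_free(struct scanner *s)
{
    free(s->text);
    s->text = NULL;
    s->len = 0;
    s->cap = 0;
}

/* Append one character to the token text, doubling the buffer when it
 * is full. Returns 0 on success, -1 if memory runs out. */
int put_char(struct scanner *s, char c)
{
    if (s->len + 2 > s->cap) {            /* room for c plus the NUL */
        size_t ncap = s->cap ? s->cap * 2 : 16;
        char *n = realloc(s->text, ncap);
        if (n == NULL)
            return -1;
        s->text = n;
        s->cap = ncap;
    }
    s->text[s->len++] = c;
    s->text[s->len] = '\0';
    return 0;
}

/* Return the kind of the next token; its text is left in s->text.
 * For TOK_EOF the token text is not meaningful. */
enum tok_kind next_token(struct scanner *s)
{
    unsigned char c;

    s->len = 0;
    while (isspace((unsigned char)*s->src))   /* skip white space */
        s->src++;

    c = (unsigned char)*s->src;
    if (c == '\0')
        return TOK_EOF;

    if (isalpha(c) || c == '_') {             /* identifier */
        while (isalnum((unsigned char)*s->src) || *s->src == '_')
            if (put_char(s, *s->src++) != 0)
                return TOK_ERROR;
        return TOK_IDENT;
    }
    if (isdigit(c)) {                         /* unsigned integer */
        while (isdigit((unsigned char)*s->src))
            if (put_char(s, *s->src++) != 0)
                return TOK_ERROR;
        return TOK_NUMBER;
    }
    if (put_char(s, *s->src++) != 0)          /* any other single char */
        return TOK_ERROR;
    return TOK_PUNCT;
}
```

A caller would run scanner_init on a string, call next_token in a loop
until it returns TOK_EOF, and finish with scanner_free. Because the code
uses character literals and the <ctype.h> predicates instead of numeric
character codes, it should not depend on the host's character set, though
what isalpha accepts does vary with the current locale.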


--
Scott Nicol email: scott@infoadv.mn.org
Information Advantage, Inc. work: (612) 820-3846
Edina, MN home: (612) 488-5406
[I agree that AT&T lex isn't very useful, but flex generates quite clean C
code, up to the character set limitations. -John]
--

