Re: is lex useful? (Scott Nicol)
24 Jun 1996 15:06:39 -0400

          From comp.compilers


From: (Scott Nicol)
Newsgroups: comp.compilers
Date: 24 Jun 1996 15:06:39 -0400
Organization: Information Advantage
References: 96-06-073
Keywords: lex says...
>So, my question is: if lex is useful, why isn't it used? Is there
>some snag (speed problems, perhaps, or difficult to port code?) that
>makes it smart to avoid lex?

The most common complaint is speed, but I don't think that is relevant in
most applications. With a typical compiler (especially an optimizing
compiler), scanning is among the cheapest components, consuming perhaps 5% to
10% of the total compile time. Even if you bring the scan time down to 0,
you have sped up the compiler by at most 10%.
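
The arithmetic behind that bound can be sketched in a few lines of C. This is
only an illustration: the function names are made up for this example, and the
10% figure is the rough estimate given above, not a measurement.

```c
#include <assert.h>

/* Fraction of total compile time saved when a phase that takes
 * `scan_fraction` of the total is made `scan_speedup` times faster.
 * Both names are hypothetical, chosen for this sketch. */
double time_saved(double scan_fraction, double scan_speedup)
{
    assert(scan_fraction >= 0.0 && scan_fraction <= 1.0);
    assert(scan_speedup >= 1.0);
    return scan_fraction * (1.0 - 1.0 / scan_speedup);
}

/* Overall speedup factor of the whole compiler for the same change
 * (the usual Amdahl's-law form). */
double overall_speedup(double scan_fraction, double scan_speedup)
{
    assert(scan_fraction >= 0.0 && scan_fraction <= 1.0);
    assert(scan_speedup >= 1.0);
    return 1.0 / ((1.0 - scan_fraction) + scan_fraction / scan_speedup);
}
```

With scanning at 10% of compile time, doubling the scanner's speed saves only
5% of the total, and even an arbitrarily fast scanner cannot save more than
the 10% that scanning took in the first place.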

Here are the things that I don't like about Lex. These are based on using
many different versions of Lex, including Flex, and also supporting (bug
fixes, etc) Lex for a company that sells a commercial version of Lex. I
will admit that I haven't really worked with any version of Lex within the
last 4 years, so things could have changed.

- Lots of fixed limits. The biggest one being yytext[], which by default is
    fixed at 200? bytes. You can change the fixed limit to another size, but
    it is still fixed.

- Many nice features of Lex are undocumented (e.g. line number), and Flex
    doesn't support them.

- Generated scanner is hard-coded to one character set. Not very useful
    in a "global" environment.

- Lots of globals, making re-entrancy (or even multiple scanners in a single
    program) difficult.

- No support for wide (>8 bit) character sets. Even 8-bit support is
    fairly recent. The obvious implementation for wide characters (expand
    tables to 16 bits) isn't practical, because you would increase the table
    sizes (which are already huge) 256x.

- Parser-scanner interactions can get really hairy (a common way to fix
    difficult parsing problems is to have the parser fiddle with the scanner,
    so the scanner will handle it).
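
To put a number on the 256x figure in the wide-character item above, here is a
small sketch that sizes an uncompressed transition table with one column per
possible input character. The function name, the state count, and the entry
width are all assumptions for illustration. Real Lex and Flex tables are
compressed, so actual sizes differ, but the ratio between an 8-bit and a
16-bit alphabet is the point.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of an uncompressed table with one row per DFA state and
 * one column per possible input character. Hypothetical helper, not part
 * of any Lex implementation. */
size_t table_bytes(size_t states, unsigned char_bits, size_t entry_bytes)
{
    assert(char_bits >= 1 && char_bits <= 16);
    size_t alphabet = (size_t)1 << char_bits;   /* 256 or 65536 columns */
    return states * alphabet * entry_bytes;
}
```

For a made-up scanner with 300 states and 2-byte entries, the 8-bit table is
153600 bytes and the 16-bit table is 39321600 bytes, which is 256 times
larger.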

On top of all these things, it is really easy to hand-write a scanner that
avoids all of these problems (and does more), and it won't take you much more
time than writing a Lex scanner. I have also probably missed a bunch of other
serious deficiencies.
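
As a rough illustration of what such a hand-written scanner might look like,
here is a minimal sketch in C. It is not taken from any particular compiler;
every name in it (struct scanner, scanner_next, the TOK_ kinds) is invented
for this example. It addresses only three of the points above: tokens are
slices of the input rather than copies into a fixed buffer, all state lives in
a struct rather than in globals, and line numbers are tracked explicitly. It
still assumes 8-bit characters, so it does not solve the wide-character
problem.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum tok_kind { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_PUNCT };

/* A token is a slice of the input, so its length has no fixed cap. */
struct token {
    enum tok_kind kind;
    const char *start;
    size_t len;
    int line;               /* line on which the token starts */
};

/* All scanner state is here, not in globals, so several scanners can
 * be active in one program at the same time. */
struct scanner {
    const char *pos;
    const char *end;
    int line;
};

void scanner_init(struct scanner *s, const char *buf, size_t len)
{
    assert(s != NULL && buf != NULL);
    s->pos = buf;
    s->end = buf + len;
    s->line = 1;
}

static int is_ident_start(unsigned char c) { return isalpha(c) || c == '_'; }
static int is_ident_part(unsigned char c)  { return isalnum(c) || c == '_'; }

struct token scanner_next(struct scanner *s)
{
    struct token t;

    /* Skip whitespace, counting newlines as we go. */
    while (s->pos < s->end && isspace((unsigned char)*s->pos)) {
        if (*s->pos == '\n')
            s->line++;
        s->pos++;
    }

    t.start = s->pos;
    t.line = s->line;

    if (s->pos == s->end) {
        t.kind = TOK_EOF;
    } else if (is_ident_start((unsigned char)*s->pos)) {
        t.kind = TOK_IDENT;
        while (s->pos < s->end && is_ident_part((unsigned char)*s->pos))
            s->pos++;
    } else if (isdigit((unsigned char)*s->pos)) {
        t.kind = TOK_NUMBER;
        while (s->pos < s->end && isdigit((unsigned char)*s->pos))
            s->pos++;
    } else {
        t.kind = TOK_PUNCT;         /* any other single character */
        s->pos++;
    }

    t.len = (size_t)(s->pos - t.start);
    return t;
}
```

A caller would initialise a struct scanner over a buffer and call
scanner_next() until it returns TOK_EOF. Because nothing is global, a parser
that needs to influence scanning can do so through the struct it already
holds.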

Scott Nicol
Information Advantage, Inc.
[Flex fixes the fixed array problem, and it can finally produce lexers which
are C++ classes, somewhat helping the reentrancy problem. It's still no good
on variable or wide character sets, and is no better on parser/lexer kludges
than any other lexer. -John]

